Survivability Analysis of Networked Systems
description
Transcript of Survivability Analysis of Networked Systems
Survivability Analysis of Networked Systems
Computer Science DepartmentCarnegie Mellon University
Pittsburgh, PA
DARPA OASIS PI Meeting, Norfolk, VA14 February 2001
Jeannette M. Wingwith Somesh Jha (Wisconsin) and Oleg Sheyner
DARPA co-PI: Tom Longstaff (SEI)
2Survivability Analysis Jeannette M. Wing
Survivability
• What if– a terrorist hacker brings down the nation’s power
grid?– an act of Mother Nature causes the US banking
network to fail?
• Critical infrastructures– Utilities: gas, electricity, nuclear, water, …– Communications: telephone, networks, …– Transportation: airlines, railways, highways, …– Medical: emergency services, hospitals, …– Financial: banking, trading, …
3Survivability Analysis Jeannette M. Wing
Survivability
• A system is survivable if it can continue to provide end services despite the presence of faults.
• Faults– Accidental or malicious– Not necessarily independent Finer-grained reliability analysis is required.
• Service-oriented– Exploit semantics of application Not all network nodes and links are treated equally.
4Survivability Analysis Jeannette M. Wing
Foundational Questions
• What is the difference between models for survivability and those for– Fault-tolerant distributed systems?– Secure systems?
• Our starting point:– Independence assumption goes out the window.– Cost must be included in the equation.
5Survivability Analysis Jeannette M. Wing
Determining Survivability Strategies
System Requirements/Architecture
SurvivableNetwork Analysis
Essential ServicesIntrusion EffectsMitigation Strategies
SEI CERT/CCIntrusionKnowledge
ImprovedRequirements/Architecture
6Survivability Analysis Jeannette M. Wing
Two Parts in Cooperation
• The Survivable Network Analysis Method (SEI)– Measures existing systems for survivability– Focuses on user and intruder models
• Applying Formal Methods to SNA– Applies model checking and other techniques to
survivability– Allows systems that are formally specified to be
submitted to survivability analysis
8Survivability Analysis Jeannette M. Wing
Simple Example: A Banking System
FRB 1
FRB 3FRB 2
MC 1 MC 3MC 2
Bank A Bank CBank B
a1a2
b1 b2 c1
9Survivability Analysis Jeannette M. Wing
Overview of Our Formal Method
Checker
Network Model Survivability Property
Phase 1
Scenario Graph
Analyzer
Reliability Query,Cost Query, etc.Phase
2
Scenario Set
10Survivability Analysis Jeannette M. Wing
Phase 1
Network Model =
Survivability Property =
Scenario Graph =
Model Checker = (modified) NuSMV
A set of concurrently executing Finite State Machines.
A predicate in CTL.
A set of related examples.
11Survivability Analysis Jeannette M. Wing
Network Model
• Processes– Nodes and links are processes (i.e., FSMs)
• banks, money centers, federal reserve banks, and links
– Communication via shared variables (i.e., finite queues)
• representing channels, and hence interconnections.
• Failures– Faults represented by special state variable
• fault:{normal, failed, intruded}
– Links and banks can fail at any time• Failed link blocks all traffic.• Failed bank routes all checks to an arbitrarily chosen
money center.
– Money centers and federal reserve banks do not fail.
12Survivability Analysis Jeannette M. Wing
Survivability Properties
• Fault-related– Money never deposited into wrong account.
• AG(error)
• Service-related– A check issued eventually clears.
• AG(checkIssued AF(checkCleared))
14Survivability Analysis Jeannette M. Wing
Output: Fault Scenario Graph
Intuition:
• Each “counterexample” spit out by the model checker is a scenario.
• Survivability property gives a slice of the model.
Each path is a scenario of how a fault can occur.
15Survivability Analysis Jeannette M. Wing
Survivability Properties
• Fault-related– Money never deposited into wrong account.
• AG(error)
• Service-related– A check issued eventually clears.
• AG(checkIssued AF(checkCleared))
16Survivability Analysis Jeannette M. Wing
A Service Success Scenario Graph
issueCheck(A, C)
send(A, MC-2)
send(MC-2, FRB-1)
send(FRB-1, FRB-3)
send(FRB-3, MC-3)
send(MC-3, C)
debitAccount
send(FRB-2, FRB-3)
send(MC-1, FRB-2)
send(A, MC-1)
up(a2)
up(c1)
down(a2) & up(a1)
17Survivability Analysis Jeannette M. Wing
A Service Fail Scenario Graph
issueCheck(A, C)
FAIL
down(A)
up(a2)
up(A)
pick(MC-2)
down(c1)
down(a2)
pick(MC-1)
down(a1)
down(c1)
down(c1)
up(a1)
send(A, MC-2) send(A, MC-1)
FAIL
FAIL
FAIL
18Survivability Analysis Jeannette M. Wing
Overview of Method
Network Model
Reliability Query,Cost Query, etc.Analyzer
Scenario Set
Survivability Property
Phase 2
Phase 1
Scenario Graph
Checker
Annotations(e.g., probabilities, cost)
19Survivability Analysis Jeannette M. Wing
Phase 2: Reliability Analysis (in a Nutshell)
• Annotations = Probabilities– Use Bayesian Networks to model dependence of
events.
• Symbolic– Use symbolic probabilities
• high, medium, low
– Use NDFA theory to compute scenario set.
• Continuous– Use numeric probabilities
• [0.0, 1.0]
– Use Markov Decision Processes to model both nondeterministic and probabilistic transitions.
20Survivability Analysis Jeannette M. Wing
Phase 2a: Symbolic Analysis
Annotated Scenario Graph =
Reliability Query =
Scenario Set =
Composer = ASG + DFA
Bayesian Network + Scenario Graph
Regular Expression (DFA)
High-risk scenarios
high
{medium, low}
23Survivability Analysis Jeannette M. Wing
Annotated Scenario Graph
issueCheck(A, C)
down(A)
up(a1)
up(A)
pick(MC-2)
down(a1)
pick(MC-1)
down(a2)down(a2)
FAIL
M
M
M
M
M
M
L H
24Survivability Analysis Jeannette M. Wing
Phase 2b: Continuous Analysis
• Use real values for probabilities.• May leave probabilities of some events
unspecified. Markov Decision Processes
• Mix of nondeterministic and probabilistic transitions
• Why? System is not closed.– Hard to assign probabilities to some faults (e.g.,
intrusions).– Environment makes choice (i.e., decision) and can be
demonic!
25Survivability Analysis Jeannette M. Wing
Reliability Analysis
Goal of (malicious) environment: Devise an optimal policy to minimize reliability.
• Assign to each state, s, a value, V(s), computed using a standard policy iteration algorithm from MDP literature.
• Let V* be the value function after convergence. Then, for initial state of scenario graph, s0, V*(s0) computes worst-case probability of service eventually finishing.
26Survivability Analysis Jeannette M. Wing
A Typical Example
V(Bad) = 0.0V(Good) = 1.0
0.6
0.7
0.6
BadGood
0.40.
60.3
0.7
0.65
27Survivability Analysis Jeannette M. Wing
A Service Success Scenario Graph
issueCheck(A, C)
send(A, MC-2)
send(MC-2, FRB-1)
send(FRB-1, FRB-3)
send(FRB-3, MC-3)
send(MC-3, C)
debitAccount
send(FRB-2, FRB-3)
send(MC-1, FRB-2)
send(A, MC-1)
up(a2)
up(c1)
down(a2) & up(a1)3/8 1/4
1/2The worst case probability that a check issued by Bank A on Bank C is
(1/2 * 3/8) + (1/2 * 1/4) = 5/16
28Survivability Analysis Jeannette M. Wing
Phase 2c: Latency and Cost Analysis
• Latency Analysis– Associate with each edge in scenario graph an
immediate cost (e.g., time it takes to execute event).– Q: What is the worst case latency scenario?
• Cost Analysis– Identify new actions that correspond to decisions an
architect needs to make.– Associate a cost with each action.– Define constraints on costs.– Q: Which set of links can I afford to upgrade to
achieve higher reliability, given my cost constraints?
31Survivability Analysis Jeannette M. Wing
Constrained Markov Decision Processes
<S, A, P, c, d>
• S is a finite state space.• A is a finite set of actions.
• P are transition probabilities. Psas’ is the probability of moving from state s to s’ if action a is chosen.
• c: (S x A) is the immediate cost. c(s, a) is the cost of choosing action a at state s.
• d: (S x A) is a k-dimensional vector of immediate costs, captures additional cost constraints.
32Survivability Analysis Jeannette M. Wing
Progress To Date: Tools
• Trishul tool• Uses NuSMV model checker, done by Somesh Jha.• History variable explodes state space, leading to…
• …New tool• Uses SPIN, ongoing by Oleg Sheyner.• No need for history variable.
33Survivability Analysis Jeannette M. Wing
Progress To Date: Case Studies
• Trading floor model of major investment bank (being “sanitized”)– 10K lines of NuSMV– half-million nodes in scenario graph– 50 threat scenarios– 45 found by system– 5 new threat scenarios found– With independence assumption, too many misses.
• B2B e-commerce NYC start-up (Jha)– 50K lines of Statecharts– 2 million NuSMV beyond capability of tool
• Lincoln Labs example (Sheyner)– TBD
34Survivability Analysis Jeannette M. Wing
Next Steps
• Show applicability of the CMDP model for other critical infrastructure examples.
• Via Lincoln Labs connection
• Combine with other tools to further automate the analysis.
• Linear programming package, theorem provers, …
• Integrate with informal SEI Survivability Network Analysis Method
• Via case studies
35Survivability Analysis Jeannette M. Wing
References
• Applying Formal Methods to SNA– Jha and Wing, “Survivability Analysis of Networked
Systems,” to appear, Proceedings of the International Conference of Software Engineering, 2001.
• Survivable Networking Analysis– Ellison, Linger, Longstaff, and Mead, “Survivable Network
System Analysis: A Case Study,” IEEE Software, July/August 1999.
• The Vigilant Healthcare System– Ellison, Fisher, Linger, Lipson, Longstaff, and Mead,
“Survivability: Protection Your Critical Systems, IEEE Internet Computing, November/December 1999.
– Web site: IEEE article and other reports www.sei.cmu.edu/organization/programs/nss/surv
-net-tech.html
36Survivability Analysis Jeannette M. Wing
Other OASIS Connection
• Recovery service for PASIS (Greg Ganger, Pradeep Khosla, Carnegie Mellon)
– Anticipate intrusion• Proactive secret-sharing
– Upon intrusion detection• Reactive secret-sharing