Arun Sood
George Mason University/Computer Science
Task Technologies Ltd
[email protected], 703.347.4494
SCIT Collaborators: Dr Yih Huang, Mr. David Arsenault, Mr. Ravi Bhaskar and Mr. Jeffrey Zeiberg
http://cs.gmu.edu/~asood/scit
Research supported by TATRC (US Army), NIST funded Critical Infrastructure Protection Program, SUN Microsystems
Self Cleansing Intrusion Tolerance: An Approach for Increasing Security and Availability
2
Background and Overview
Self Cleansing Intrusion Tolerance (SCIT)
• Limits the exposure of servers to attacks.
• Uses redundancy to reduce exposure.
• Does *not* rely on signatures, prior knowledge or intrusion detection.
Development process
• Prototypes built for Firewall, Web Server and DNS Server. Working on LDAP.
• Three patent applications are pending. One disclosure filed.
• SCIT research funded by US Army (TATRC), NIST funded Critical Infrastructure Protection Project, and SUN Microsystems.
3
Review of Intrusion Management
Intrusion Prevention.
• Block intrusions.
• Common security practices, such as blocking unused ports, restricting server privileges, choosing strong passwords, …
Intrusion Detection and Recovery.
• Timely detection of intrusion to stop losses and repair damage.
• Processing overhead grows with traffic volume.
• A large site will have to use a significant portion of its processing power for intrusion detection.
• False alarm management.
4
Review - 2
Intrusion Tolerance.
• Minimizes losses and facilitates automatic recovery.
• Addresses undetected intrusion (the worst-case scenarios).
SCIT: Self Cleansing Intrusion Tolerance.
• Security through server cleansing and rotations.
• Effectiveness increases with server redundancy.
• Reduces exposure window.
• Fends off attacks or at least limits losses.
• Makes it difficult to exploit vulnerabilities.
• Overhead independent of traffic volume.
5
Project Summary
Our objective is to develop a secure cluster architecture that
• Applies to several critical and/or infrastructure applications.
• Manages undetected attacks.
• Improves security through increasing server redundancy.
• Improvements in security can be “bought”.
• Hardware redundancy enhances security and availability.
• Uses virtualization technologies.
6
Application Domains: Examples
Our work is expected to suit server clusters that
• Process transactions with no inter-transaction dependencies.
• Can handle session info. State info is more difficult.
• Within reasonable time limits.
Examples: DNS servers, Web servers, Directory servers, File servers, DHCP servers, Authentication servers, Back office servers, Transaction-oriented database servers, …
7
Application Domains: Restrictions
Our current approach does not address the following:
• Media streaming servers.
• Telnet/ssh servers.
• FTP servers, or any server that supports long data downloads/uploads.
• Essentially, long sessions without time constraints.
The solutions to the above are part of our longer term research goals.
8
Overview
• Research objectives
• Self-Cleansing Intrusion Tolerance (SCIT) concept
• Hardware Enhanced Security (SCIT – HES)
• DNS – SCIT server
• Cluster security vs availability trade-off
• Virtualization
9
Redundancy + SCIT = Availability + Security
SCIT = Self Cleansing Intrusion Tolerance
To Achieve
12
SCIT Server Rotations
Example: 5 online and 3 offline servers
[Figure: server rotation between online servers (potentially compromised) and offline servers (in self-cleansing)]
• No interruption of server service.
• For DNS, 2 second exposure time using SUN server.
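The rotation in this example can be sketched in a few lines. This is only an illustration, assuming a simple policy in which the longest-exposed online server is always swapped with the longest-cleansed offline server; the slides do not specify the selection policy, and the server names `s1` to `s8` and the function `rotate` are hypothetical.

```python
from collections import deque

def rotate(online, cleansing):
    """Perform one rotation step (assumed policy, not taken from the slides):
    swap the longest-exposed online server with the longest-cleansed offline
    server. Both deques are ordered oldest-first and are modified in place."""
    exposed = online.popleft()     # server that has been online longest
    fresh = cleansing.popleft()    # server that has been cleansing longest
    online.append(fresh)           # the clean server takes over the service
    cleansing.append(exposed)      # the exposed server starts self-cleansing
    return exposed, fresh

# Hypothetical cluster matching the slide's example: 5 online, 3 offline.
online = deque(["s1", "s2", "s3", "s4", "s5"])
cleansing = deque(["s6", "s7", "s8"])

for step in range(8):
    went_offline, came_online = rotate(online, cleansing)
    print(f"step {step}: {went_offline} -> cleansing, {came_online} -> online")
```

The number of online servers never changes during a step, which is the sense in which rotation causes no service interruption in this sketch.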
13
Exposure Window
SCIT increases security by reducing the exposure window.
The exposure window is the time a server is online.
SCIT target: keep the exposure window below T, a client-defined requirement.
[Figure: Loss Curve. Axes: Intruder Residence Time, Loss, Cost; threshold T marked]
15
SCIT Primitives: Incorruptibility Requirements
SCIT server rotations should not be disrupted.
Online servers connected to the clients (public Internet), but not the central controller.
Offline/cleansing servers connected to controller, but not to the clients.
The controller and cleansing servers always isolated from external influences.
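The isolation rules above amount to a small invariant: a server is linked to the clients or to the controller, never to both. The following sketch encodes that reading; the names `Mode`, `links_for` and `violates_isolation` are hypothetical and not part of SCIT itself.

```python
from enum import Enum

class Mode(Enum):
    ONLINE = "online"        # serving clients on the public Internet
    CLEANSING = "cleansing"  # offline, being restored under controller supervision

def links_for(mode):
    """Return (client_link, controller_link) for a server in the given mode,
    following the isolation rules stated on the slide."""
    if mode is Mode.ONLINE:
        return (True, False)   # reachable by clients, cut off from the controller
    return (False, True)       # reachable by the controller, cut off from clients

def violates_isolation(client_link, controller_link):
    """A server linked to both sides would bridge the public Internet and
    the controller, which the requirements rule out."""
    return client_link and controller_link

for mode in Mode:
    print(mode.value, links_for(mode), violates_isolation(*links_for(mode)))
```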
16
Reconfiguration Network Paths using HES
[Figure: SCIT Central Controller with reset and toggle signal lines; servers labeled Online and Offline; Clients]
17
SCIT with Hardware Enhanced Security (HES)
[Figure: SCIT Control and Clients]
18
Implementation Considerations
IPMI (Intelligent Platform Management Interface) supports power management (power up/down) and remote reset/reboot.
Many managed Ethernet switches can enable/disable ports individually and dynamically.
Or customized hardware.
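As an illustration of the IPMI option, a controller could issue a remote reset through `ipmitool`, one widely used IPMI command-line client. The slides do not say which tool the prototypes used, so this is an assumption; the host address, user name and helper `ipmi_power_command` are hypothetical. The sketch only builds the command line and does not run it.

```python
def ipmi_power_command(host, user, action="reset"):
    """Build an ipmitool command line for a chassis power action.
    The password is read by ipmitool from the IPMI_PASSWORD environment
    variable (-E), so it does not appear on the command line."""
    allowed = {"on", "off", "cycle", "reset", "status"}
    if action not in allowed:
        raise ValueError(f"unsupported power action: {action}")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "chassis", "power", action]

# Hypothetical management address and user name, for illustration only.
cmd = ipmi_power_command("10.0.0.21", "scitctl", "reset")
print(" ".join(cmd))
```

To actually send the reset, the list could be passed to `subprocess.run(cmd, check=True)` on a host where `ipmitool` is installed.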
20
Secure DNS
Objective: Enhancements to enable query clients to authenticate DNS replies.
The zone private key is used to sign the DNS record.
• Exposure of the zone private key will compromise the integrity of the DNS responses.
Query clients verify the signature using the zone public key.
If the DNS server is successfully attacked, there are dire consequences.
[Figure: Resource Record Set. Labels: Domain Name, IP Address (type A), Nominal Name entries, Signature Resource Record, Null record]
21
Dynamic Updates: Key exposed
For secure operations, private key should not be reachable from the public internet.
The Challenge:
How can we do dynamic updates without exposing the private key?
The Solution:
Do Dynamic Update computations off-line.
Trade-off:
Turnaround time for Dynamic Updates increases.
22
DNS-HES Cluster
Advertises two public IP addresses
• A primary DNS IP.
• A secondary DNS IP.
Uses four or more servers
• A primary DNS server.
• A secondary DNS server.
• A backend processing server.
• One or more servers in cleansing.
23
Example: DNS-HES Cluster
[Figure: DNS-HES cluster. Servers P, S and B plus one or more servers in cleansing (mode C); Online 0 and Online 1; Master Storage (master file, keys); Central Controller; Clients. Legend: network link, electrical/optical signal line]
24
The Primary Server Connects to one Online Storage and Clients
[Figure: primary server P with Clients, SCIT Controller, Online 0, Online 1, Master Storage and the local master file]
25
Overview
• Research objectives
• Self-Cleansing Intrusion Tolerance (SCIT) concept
• Hardware Enhanced Security (SCIT – HES)
• DNS – SCIT server
• Cluster security vs availability trade-off
  • Example: 2DNS-3WEB Cluster
• Virtualization
26
Session Persistence: Sticky Sessions
[Figure: Users 1 to N connect through a Load Balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N]
27
Server Clusters
[Figure: Users 1 to N connect through a Load Balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N]
Persistent db
• Disk, memory resident.
Multicasting
Shared memory
28
Virtual Server Implementation: Session Replication
[Figure: Users 1 to N connect through a Load Balancer (traffic distribution, session persistence, SSL termination) to Servers 1 to N; Virtual Servers 1 to 3]
Virtual servers are short lived.
Persistent db is easy.
Multicasting
• Additional network traffic.
• Reduce traffic through smaller clusters.
Shared memory is not recommended.
29
Unified SCIT Control for Security and Availability
Ensure SCIT operations.
Guarantee minimum service availability.
Servers can be added to or removed from the cluster at any time.
Adding servers improves both.
• Availability.
• Security, by reducing server exposure times.
30
Generalization: N servers and M roles
The DNS-HES design can be generalized to a SCIT cluster using N servers to support M roles.
Without loss of generality, each role is assumed by only one server.
Example:
• A 2DNS-3WEB cluster supports M=5 roles: P, W, W, W, and S.
• The primary DNS server (P) and one web server (W) provide minimum (but still useful) services.
31
Role Swap
A Role-R Swap (or simply R-Swap) rotates the present online server in role R offline for cleansing and finds a newly cleansed server to resume role R.
The 2DNS-3WEB cluster supports P, W, W, W, and S swaps.
Example of P-swap.
[Figure: P-swap between the server in role P and a server in cleansing (C)]
33
Rotation Pattern
A rotation pattern P is a sequence of role swaps that covers all M roles.
Example 1: Round Robin pattern: PWWWS
Example 2: Skewed pattern: PWWPWS
Example 3: with randomness: PXWXPXWXWXS, where X ∈ {P, W, W, W, S}
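The effect of a pattern on each role can be made concrete by counting how many swap slots pass between two consecutive swaps of the same role. The sketch below is an illustration only: the labels `W1`, `W2`, `W3` are hypothetical names for the three web roles that the slides all write as W, and both helper functions are assumptions rather than part of the SCIT controller.

```python
def covers_all_roles(pattern, roles):
    """A rotation pattern is valid if every role is swapped at least once."""
    return set(roles) <= set(pattern)

def max_swap_gap(pattern, role):
    """Largest number of swap slots between two consecutive swaps of `role`
    when the pattern repeats forever."""
    positions = [i for i, r in enumerate(pattern) if r == role]
    if not positions:
        raise ValueError(f"role {role!r} never appears in the pattern")
    n = len(pattern)
    # Distance to the next occurrence, wrapping around the end of the pattern.
    # A role that appears once has a gap equal to the full pattern length.
    gaps = [(positions[(j + 1) % len(positions)] - p) % n or n
            for j, p in enumerate(positions)]
    return max(gaps)

roles = ["P", "W1", "W2", "W3", "S"]
round_robin = ["P", "W1", "W2", "W3", "S"]   # Example 1: PWWWS
skewed = ["P", "W1", "W2", "P", "W3", "S"]   # Example 2: PWWPWS

for name, pattern in [("round robin", round_robin), ("skewed", skewed)]:
    gaps = {r: max_swap_gap(pattern, r) for r in roles}
    print(name, covers_all_roles(pattern, roles), gaps)
```

In the skewed pattern the primary DNS role is swapped every 3 slots while the other roles wait 6, which is the trade the skew makes: a shorter interval for P at the cost of a longer one for everything else.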
34
Relative Importance of Roles
Each service role R has an index, index(R), that reflects its relative importance. (We use integers.)
Without loss of generality, the indices are assumed to be in descending order.
2DNS-3WEB:
index(P) = index(W) = 10, and index(W) = index(W) = index(S) = 0.
35
Cluster Index
The SCIT cluster index at time t, denoted cluster_index(t), is the sum of the indices of those roles that are active at time t.
The value of a SCIT cluster must at all times be greater than or equal to a predetermined minimum value, index_min.
2DNS-3WEB: index_min = 20
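A short sketch of the cluster index check for the 2DNS-3WEB example follows. It assumes the reading that P and one W role carry index 10 and the remaining roles carry index 0, which is the assignment consistent with a minimum of 20. The labels `W1`, `W2`, `W3` and the function names are hypothetical.

```python
# Role indices for 2DNS-3WEB under the assumed reading described above.
# W1..W3 are hypothetical labels for the three web roles the slides call W.
ROLE_INDEX = {"P": 10, "W1": 10, "W2": 0, "W3": 0, "S": 0}
MIN_INDEX = 20  # minimum cluster index from the slide

def cluster_index(active_roles):
    """Sum of the indices of the roles that are currently active."""
    return sum(ROLE_INDEX[role] for role in active_roles)

def meets_minimum(active_roles):
    """True if the cluster index is at or above the required minimum."""
    return cluster_index(active_roles) >= MIN_INDEX

for active in (["P"], ["P", "W1"], ["P", "W1", "W2", "W3", "S"]):
    print(active, cluster_index(active), meets_minimum(active))
```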
36
Minimum Number of Servers
Let Nmin be the minimum number of servers required to achieve index_min.
• Nmin = 2 in 2DNS-3WEB.
When N < Nmin, behavior is undefined.
When N=Nmin
• N servers servicing the most important N roles.
• No server rotations/swaps.
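Nmin follows directly from the role indices: assign servers to roles in order of importance and stop as soon as the minimum cluster index is reached. The sketch below assumes the 2DNS-3WEB indices are 10, 10, 0, 0, 0 with a minimum of 20; the function name is hypothetical.

```python
def minimum_servers(indices, min_index):
    """Smallest number of servers whose most important roles reach min_index.
    `indices` holds one index per role. Returns None if even all roles
    together cannot reach min_index."""
    total = 0
    for count, value in enumerate(sorted(indices, reverse=True), start=1):
        total += value
        if total >= min_index:
            return count
    return None

# 2DNS-3WEB under the assumed index assignment: P, W, W, W, S.
print(minimum_servers([10, 10, 0, 0, 0], 20))
```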
37
Servers in Cleansing
When M ≥ N > Nmin (more roles than servers).
• N−1 most important roles active.
• At least one server in cleansing to trigger rotations.
• Virtualization?
When N > M (more servers than roles).
• M servers servicing M roles.
• N−M servers in cleansing and for rotations.
• The more servers in cleansing, the faster the rotations, and the shorter the server exposure times.
38
Number of Servers
# of servers   Active roles     Cluster index   Rotation
1              Undefined        10              No
2              P, W             20              No
3              P, W             20              Yes
4              P, W, W          20              Yes
5              P, W, W, W       20              Yes
6              P, W, W, W, S    20
More spare servers lead to faster server rotations and shorter server exposure times.
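The rules from the preceding slides are enough to regenerate the active-roles and rotation columns of this table. The sketch below is an illustration under those rules, with Nmin = 2 and the role order P, W, W, W, S; the function `plan` is hypothetical. For 6 servers it reports that rotation is enabled, which is inferred from the "more servers than roles" case rather than read from the table.

```python
ROLES = ["P", "W", "W", "W", "S"]   # most important role first
N_MIN = 2                           # Nmin for 2DNS-3WEB

def plan(n_servers):
    """Return (active_roles, rotation_enabled) for a cluster of n_servers,
    or None when n_servers < N_MIN (behavior is undefined in the slides)."""
    m = len(ROLES)
    if n_servers < N_MIN:
        return None
    if n_servers == N_MIN:
        return ROLES[:n_servers], False      # every server serves, no swaps
    if n_servers <= m:
        return ROLES[:n_servers - 1], True   # keep one server in cleansing
    return ROLES[:], True                    # n_servers - m servers in cleansing

for n in range(1, 7):
    print(n, plan(n))
```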
39
Provable Properties
The proposed SCIT control algorithm satisfies the following properties:
• Server Rotation Guarantee. With arbitrary server failures, server rotations continue if there are N ≥ Nmin+1 functioning servers in the cluster.
• Minimum Service Guarantee. With arbitrary server failures, index_min is maintained at all times (after an adjustment period) if there are N ≥ Nmin functioning servers in the cluster.
40
Effect of Server Redundancy: Server Exposure Times
[Figure: exposure times (in SCT) versus number of spare servers, 1 to 10; series: All roles]
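One simple way to see why spare servers shorten exposure is a back-of-the-envelope model. The following is an assumption, not the authors' analysis or data: it supposes round-robin swaps over M roles, k spare servers cleansing in parallel, and a fixed cleansing time per server, taken here as one time unit. Under those assumptions a clean server becomes available every 1/k units, so each role is swapped every M/k units.

```python
def exposure_time(num_roles, num_spares, cleansing_time=1.0):
    """Assumed steady-state model: a clean server is ready every
    cleansing_time / num_spares, and round-robin swapping reaches each of
    the num_roles roles once per num_roles swaps."""
    if num_spares < 1:
        raise ValueError("rotation needs at least one spare server")
    return num_roles * cleansing_time / num_spares

# Illustrative values for M = 5 roles; not measurements from the slides.
for spares in (1, 2, 5, 10):
    print(spares, exposure_time(5, spares))
```

The model ignores swap latency and server failures; its only purpose is to show the inverse relationship between the number of spare servers and the exposure time.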
41
Effect of “Skewed” Rotation: PWWPWS
[Figure: exposure times (in SCT) versus number of spare servers, 1 to 5; series: All roles, P-DNS, Others]
43
Virtualization in Support of Security
Virtualization was motivated by server consolidation.
Intensive ongoing research to use virtualization to enhance security.
Features that would help (a researcher’s wish list).
• Fast creation of new Virtual Server (VS) on the fly.
  • VS forking, like process forking.
• Checkpoints and fast reverts.
  • Snapshot only the memory in use, not the entire VS memory (as in the case of VMware snapshots).
• Efficient sharing of resources.
  • Memory and disk: copy on write among identically configured VS.
• Ultimate goal is single use VS.
44
SCIT: Virtualization
Guest OS kernel
• Importance for SCIT: Impacts diversity.
• SUN and Virtuozzo: VS and host share OS kernel.
• VMware and Xen: Each VS has its own kernel, potentially different from the host OS.
Diversity of configuration
• Importance for SCIT: Increases security.
• SUN and Virtuozzo: Choice of applications. Process memory image.
• VMware and Xen: Guest OS. Choice of applications. Process memory image.
Virtualization overhead
• Importance for SCIT: Throughput of VS.
• SUN and Virtuozzo: Lower.
• VMware and Xen: Higher.
Potential cleansing technology
• Importance for SCIT: Faster reuse.
• SUN and Virtuozzo: Reboot and system integrity checks.
• VMware and Xen: Snapshots and revert.
Efficient resource sharing
• Importance for SCIT: More VS. Less exposure time.
• SUN and Virtuozzo: Good for processor and memory sharing.
• VMware and Xen: Good for processor sharing, poor for memory sharing.
45
Current Research
SCIT/VS: SCIT with Virtual Servers.
• Takes advantage of virtualization features for virtual server rotations and cleansing.
• Operates at much faster speed than hardware rotations, resulting in much smaller exposure windows.
• A SCIT/VS+Apache testbed is being built.
Initial results.
• SCIT operational overhead of 10%.
• VS cleansing every 30 seconds.
A promising combination: Solaris+FireEngine+Xen.
• Should reduce the overall overhead significantly.
46
Conclusions
Incorruptible intrusion tolerance through physical isolation.
Connecting security and service availability.
Many critical infrastructure services use redundancy for availability / dependability.
Many hardware components are included in modern high availability systems.
47
SCIT Publications + Contact Info
Current research focuses on combining SCIT with virtualization technology to drastically reduce server exposure times.
SCIT papers are available at http://cs.gmu.edu/~asood/scit
Arun Sood
703.993.1524