Redundant Computing for Security David Evans University of Virginia

Redundant Computing for Security

David EvansUniversity of VirginiaTRUST SeminarUC Berkeley25 September 2008Work with Ben Cox, Anh Nguyen-Tuong, Jonathan Rowanhill, John Knight, and Jack Davidson

The Basic IdeaInput(Possibly Malicious)Server Variant 0ServerVariant1MonitorOutputAttacker must find one input that compromises both variants

IEEE Transactions on Computers, Jan 1968

Nevil Maskelyne5th English Astronomer Royal, 1765-1811Image: National Maritime Museum, London

Image: Michael Daly, Wikimedia Commons

Maskelynes Redundant ComputingData DiversityInputAnti-ComputerComputerComparerData for computing positions at noonData for computing positions at midnight

Babbages ReviewI wish to God these calculations had been executed by steam.Charles Babbage, 1821

...back to the 21st century (and beyond)Moores Law: number of transistors/$ increases exponentiallyEinsteins Law: speed of light isnt getting any fasterEastwood/Turing Law: If you want a guarantee, buy a toaster.Suttons Law: Because thats where the money is.Conclusion: CPU cycles are becoming free, but vulnerabilities and attackers arent going away

Security Through DiversityAddress-Space Randomization [Forest+ 1997, PaX ALSR 2001, Bhatkar+ 2003, Windows Vista 2008]Instruction Set Randomization[Kc+ 2003, Barrantes+ 2003]DNS Port RandomizationData Diversity

Limitations of Diversity TechniquesWeak security assurancesProbabilistic guaranteesUncertain what happens when it worksNeed high-entropy variationsAddress-space may be too small [Shacham+, CCS 04]Need to keep secretsAttacker may be able to incrementally probe system [Sovarel+, USENIX Sec 2005]Side channels, weak key generation, etc.

N-Variant System FrameworkPolygrapherReplicates input to all variantsVariantsN processes that implement the same serviceVary property you hope attack depends on: memory locations, instruction set, system call numbers, calling convention, data representation, Variant 0Variant1MonitorPoly-grapherMonitorObserves variantsDelays external effects until all variants agreeInitiates recovery if variants divergeNo secrets, high assurances, no need for entropy

N-VersionProgramming[Avizienis & Chen, 1977] Multiple teams of programmers implement same specificationVoter compares results and selects most common

No guarantees: teams may make same mistakeTransformer automatically produces diverse variants

Monitor compares results and detects attack

Guarantees: variants behave differently on particular input classes N-VariantSystems

Variants RequirementsDetection PropertyAny attack that compromises one variant causes the other to crash (behave in a way that is noticeably different to the monitor)Normal Equivalence PropertyUnder normal inputs, the variants stay in equivalent states:A0(S0) A1(S1)Actual states are different, but abstract states are equivalent

Opportunity for VariationAll Possible InputsMalicious InputsInputs with Well-Defined BehaviorCant change well-defined behavior, but can change undefined behavior

Disjoint VariantsMalicious InputsVariant 0Variant 1Inputs with Well-Defined BehaviorBehaviorMalicious InputsInputs with Well-Defined Behavior

Example: Address-Space PartitioningVariationVariant 0: addresses all start with 0Variant 1: addresses all start with 1Normal EquivalenceMap addresses to same address spaceAssumes normal behavior does not depend on absolute addressesDetection PropertyAny injected absolute load/store is invalid on one of the variants

Example: Instruction Set TaggingVariation: add an extra bit to all opcodesVariation 0: tag bit is a 0Variation 1: tag bit is a 1Run-time: check and remove bit (software dynamic translation) Normal Equivalence: Remove the tag bitsAssume well-behaved program does not rely on its own instructionsDetection PropertyAny (tagged) opcode is invalid on one variantInjected code (identical on both) cannot run on both

Data Diversity[Amman & Knight, 1987]and [Maskelyne 1767]Re-expression functions transform data representationInverse transformationsR0R1R0-1R1-1PPInputOutput

Data Diversity in N-Variant SystemsR0R1R0-1R1-1PPInputOutputTrusted Data

Variant 0

Variant 1

MonitorUntrusted Input

UID Corruption Attacksuid_t user;...user = authenticate();...setuid(user);Attacker corrupts userExamples in [Chen+, USENIX Sec 2005]Goal: thwart attacks by changing data representation

UID Data Diversityroot: 0bin: 1nobody:99root: 0x7FFFFFFFbin: 0x7FFFFFFEnobody:0x7FFFFF9CR0(u) = uR0-1(u) = uR1(u) = u 0x7FFFFFFFR1-1(u) = u 0x7FFFFFFFIdentity Re-expressionFlip Bits Re-expressionVariant 0Variant 1

Data Transformation RequirementsNormal equivalence:x: T, Ri-1(Ri(x)) = xAll trusted data of type T is transformed by RAll instructions in P that operate on data of type T are transformed to preserve original semantics on re-expressed dataDetection:x: T, R0-1(x) R1-1(x)) (disjointedness)

Ideal ImplementationPolygrapherIdentical inputs to variants at same timeMonitorContinually examine variants completelyVariantsFully isolated, behave identically on normal inputsInfeasible for real systems

Framework ImplementionModified Linux 2.6.11 kernelRun variants as processesCreate 2 new system callsn_variant_forkn_variant_execveReplication and monitoring by wrapping system calls

V0V1V2KernelHardware

Wrapping System CallsAll calls: check each variant makes the same callI/O system calls (process interacts with external state) (e.g., open, read, write)Make call once, send same result to all variantsReflective system calls (e.g, fork, execve, wait)Make call once per variant, adjusted accordinglyDangerous Some calls break isolation (mmap) or escape framework (execve)Current solution: disallow unsafe calls

*sys_write_wrapper(int fd, char __user * buf, int len) { if (!IS_VARIANT(current)) { perform system call normally } else { if (!inSystemCall(current->nv_system)) { // First variant to reach Save Parameters Sleep Return Result Value } else if (currentSystemCall(current->nv_system) !=SYS_WRITE) { DIVERGENCE different system calls } else if (!Parameters Match) { DIVERGENCE different parameters } else if (!isLastVariant(current->nv_system)) { Sleep Return Result Value } else { Perform System Call Save Result Wake Up All Variants Return Result Value } }}

Implementing VariantsAddress Space PartitioningSpecify segments start addresses and sizesOS detects injected address as SEGVInstruction Set TaggingUse Diablo [De Sutter+ 03] to insert tags into binary Use Strata [Scott+ 02] to check and remove tags at runtime

Implementing UID VariationAssumptions:We can identify UID data (uid_t, gid_t)Only certain operations are performed on it:Assignments, Comparisons, Parameter passingProgram shouldnt depend on actual UID values, only the users they represent.

Code TransformationRe-express UID constants in code

Preserve semanticsFlip comparisonsFine-grained monitoring: uid_t uid_value(uid_t), bool check_cond(bool)External Trusted Data (e.g., /etc/passwd)if (!getuid()) if (getuid() == 0)

if (getuid() == 0x7FFFFFFF) R1

Re-expressed Filesfopen(/etc/password);

root:0:...bin: 1:......nobody: 99:...fopen(/etc/password);Variant 0Variant 1fopen wrapper/etc/password-0

root:0x7FFFFFFF:...bin: 0x7FFFFFFE:......nobody: 0x7FFFFF9C:.../etc/password-1Variant-specific kernel file table to support both shared (normal) and re-expressed files

Thwarting UID CorruptionVariant 0

Variant 1

R1(x)R1-1(x)R0-1(x)R0(x)=

Poly-grapherInjected UID: x: T, R0-1(x) R1-1(x)) detected

Results5.8116.326.5637.366.6538.49010203040UnsaturatedSaturatedUID Data VariationAddress-PartitioningUnmodifiedApache 1.3 on Linux 2.6.11(1 WebBench client)(5 hosts * 6 each WebBench clients)14% increase in latency (13% decrease in throughput)136% increase in latency (58% decrease in throughput)

Open Problems and OpportunitiesDealing with non-determinismMost sources addressed by wrapperse.g., entropy sources...but not multi-threading [Bruschi, Cavallero & Lanzi 07]Finding useful higher level variationsNeed specified behaviorOpportunities with higher-level languages, web application synthesizersClient-side usesGiving variants different inputsCharacter encodings

Related WorkDesign DiversityHACQIT [Just+, 2002], [Gao, Reiter & Song 2005]Probabilistic VariationsDieHard [Berger & Zorn, 2006]Other projects exploring similar frameworks[Bruschi, Cavallaro & Lanzi 2007], [Salamat, Gal & Franz 2008]

http://www.cs.virginia.edu/nvariant/Supported by National Science Foundation Cyber Trust Program and MURIPapers: USENIX Sec 2006, DSN 2008Collaborators: Ben Cox, Anh Nguyen-Tuong, Jonathan Rowanhill, John Knight, Jack Davidson

Backup Slides

Using Extra Cores for SecurityDespite lots of effort:Automatically parallelizing programs is still only possible in rare circumstancesHuman programmers are not capable of thinking asynchronouslyMost server programs do not have fine grain parallelism and are I/O-boundHence: lots of essentially free cycles for security

*NetFlix prize of the 1700s***Here is an example of the write system call wrapper. When the first variant reaches a system call, it will mark that it entered the system call, save the parameters, then go to sleep. When the other variants enter the system call, they will compare their parameters with the first variants. Then if they arent the last variant to reach the system call theyll go to sleep. When the last variant reaches the system call it will perform the write system call and save the return value. It would then wake up the other variants and return from the system call. The other variants will then get the return value and return. If either variant tries to execute different system calls or system calls with different parameters, it will shut down the system. *So, to recap; (1) Artificial Diversity only has limited assurance of stopping attacks and relies on keeping secrets. (2) The N-Variant system builds upon artificial diversity, running multiple artificially diversified variants in parallel ensuring all variants behave identically. (3) We developed a model for reasoning about the security properties on the N-Variant System framework, and showed that with carefully constructed variants we can provide provable security whose defense does not rely on keeping secrets. In addition, previously diversity techniques without sufficient entropy were ignored due to their security limitations. This framework also opens up possibilities for these diversity techniques as well. Finally, we modified the Linux kernel to monitor the variants and evaluated its performance at the system call level.

Redundant Computing for Security David Evans University of Virginia

Documents

Transcript of Redundant Computing for Security David Evans University of Virginia