Unleashing Mayhem on Binary Code Sang Kil Cha Thanassis Avgerinos Alexandre Rebert David Brumley...
-
Upload
marion-hoover -
Category
Documents
-
view
224 -
download
5
Transcript of Unleashing Mayhem on Binary Code Sang Kil Cha Thanassis Avgerinos Alexandre Rebert David Brumley...
Unleashing Mayhem on Binary Code
Sang Kil ChaThanassis Avgerinos
Alexandre RebertDavid Brumley
Carnegie Mellon University
2
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
AEG
Program Exploits
I = input();if (I < 42) vuln();else safe();
3
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
Explore Program
4
Ghostscript v8.62 Bugint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
Reading user input from command line
Buffer overflow
CVE-2009-4270
5
Multiple Pathsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
ManyBranches!
6
Automatic Exploit Generation Challenge
Automatically Find Bugs & Generate Exploits
Transfer Control to Attacker Code
(exec “/bin/sh”)
7
Generating Exploitsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
8
Generating Exploitsint outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) { switch ( arg[0] ) { case ‘-’: { switch ( arg[1] ) { case 0: … default: outprintf( “unknown switch %s\n”, arg[1] ); } } default: … } …
Read Return Address from Stack Pointer (esp)
8
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
Control Hijack Possible
9Source
int main( int argc, char* argv[] ){ const char *arg; while( (arg = *argv++) != 0 ) {…
Executables (Binary)
01010010101010100101010010101010100101010101010101000100001000101001001001001000000010100010010101010010101001001010101001010101001010000110010101010111011001010101010101010100101010111110100101010101010101001010101010101010101010
Unleashing Mayhem
Automatically Find Bugs & Generate Exploitsfor Executables
11
if x < 100
if x*x = 0xffffffff
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x != 0xffffffff)
(x > 42) ∧ (x*x != 0xffffffff)
∧ (x >= 100)
f t
f t
f t
12
f t
f t
f t
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
if x*x = 0xffffffff
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)
if x < 100
13
f t
f t
f t
x = input()
if x > 42
if x*x = 0xffffffff
vuln()
Path Predicate = Π
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)Π =
if x < 100
14
f t
f t
f t
x = input()
How Mayhem Works:Symbolic Execution
if x > 42
if x*x = 0xffffffff
vuln()
x can be anything
x > 42
(x > 42) ∧ (x*x == 0xffffffff)
ViolatesSafety Policyif x < 100
15
int outprintf( const char *fmt, … ){ int count; char buf[1024]; va_list args; va_start( args, fmt ); count = vsprintf( buf, fmt, args ); outwrite( buf, count ); // print out}
Safety Policy in Mayhem
outp
rint
f
…
fmt
ret addr
count
args
bufuser
inpu
t
mai
n
esp
Return to user-controlled address
EIP not affected by user input
Instruction Pointer (EIP) level:
16
Exploit Generation
Π∧
input[0-31] = attack code∧
input[1038-1042] = attack code address
Exploit is an input that satisfies the predicate:
Exploit PredicateCan transfer
control to attack code?
Can position attack code?
17
Challenges
Symbolic Execution Exploit Generation
Efficient Resource Management
Symbolic IndexChallenge
Hybrid ExecutionIndex-based Memory
Model
19
Current Resource Management in Symbolic Execution
Online Symbolic Execution
Offline Symbolic Execution
(a.k.a. Concolic)
20
Offline ExecutionOne pathat a time Method 1:
Re-run from scratch Inefficient⟹
Re-executedevery time
21
Online Execution
Method 2:Stop forking
Miss paths⟹
Method 3: Snapshot process
Huge disk ⟹image
Hit Resource Cap
Fork at branches
22
Mayhem: Hybrid Execution
Our Method:Don’t snapshot state; use path predicate to recreate state
9.4M 500K
Hit Resource Cap
Fork at branches
Ghostscript 8.62
“Checkpoint”
23
Hybrid Execution
Manage #executorsin memory within resource cap✓
Minimize duplicated work✓
Lightweight checkpoints✓
25
Symbolic Indices
x = user_input();y = mem[x];assert (y == 42);
x can be anything
Which memory cell contains 42?
232 cells to check
Memory0 232 -1
26
One Cause: Overwritten Pointers
42
mem[0x11223344]
mem[input]
…
arg
ret addr
ptr
buf
user
inpu
t
… assert(*ptr==42); return;
ptr address 11223344ptr = 0x11223344
27
Another Cause: Table Lookups
Table lookups in standard APIs:• Parsing: sscanf, vfprintf, etc.• Character test: isspace, isalpha, etc.• Conversion: toupper, tolower, mbtowc, etc.• …
28
Method 1: Concretization
Over-constrained• Misses 40% of exploits in our experiments
Π∧ mem[x] = 42 ∧ ’Π
Π ∧ x = 17∧ mem[x] = 42 ∧ ’Π
✓ Solvable✗ Exploits
29
Method 2: Fully Symbolic
Π ∧ mem[x] = 42 ∧ ’Π
✗ Solvable✓ Exploits
Π ∧ mem[x] = 42 ∧ mem[0] = v0 ∧…∧ mem[232-1] = v232-1
∧ ’Π
30
Our Observation
Path predicate (Π)constrains rangeof symbolic memoryaccesses
y = mem[x]
f t
x <= 42
x can be anything
f
t
x >= 50
Use symbolic execution state to:Step 1: Bound memory addresses referencedStep 2: Make search tree for memory address values
42 < x < 50Π
31
Step 1 — Find Boundsmem[ x & 0xff ]
1. Value Set Analysis1 provides initial bounds• Over-approximation
2. Query solver to refine bounds
Lowerbound = 0, Upperbound = 0xff
[1] Balakrishnan et al., Analyzing memory accesses in x86 executables, ICCC 2004
32
Step 2 — Index Search Tree Construction
y = mem[x]if x = 1 then y = 10
Index
MemoryValue
1012
22
20
if x = 2 then y = 12
if x = 3 then y = 22
if x = 4 then y = 20
ite( x < 3, left, right )
ite( x < 2, left, right )
33
Fully Symbolic vs.Index-based Memory Modeling
Fully Symbolic Index-based Piecewise Opt.0
5000
10000Time Timeout atphttpd
v0.4b
34
Index Search Tree Optimization:Piecewise Linear Approximation
y = 2*x + 10
y = - 2*x + 28
Index
MemoryValue
35
Piecewise Linear Approximation
Fully Symbolic Index-based Piecewise Opt.0
5000
10000Time
2x faster
atphttpd v0.4b
37
a2ps
aeon
aspell
atphttpd
freeradius
ghostscript
glftpd
gnugol
htget
htpasswd
iwconfig
mbse-bbs
nCompress
orzHttpd
psUtils
rsync
sharutils
socat
squirrel mail
tipxd
xgalaga
xtokkaetama
coolplayer
destiny
dizzy
galan
gsplayer
muse
soritong
1 10 100 1000 10000 100000
Linux
(22)
Windows
(7)
38
a2ps
aeon
aspell
atphttpd
freeradius
ghostscript
glftpd
gnugol
htget
htpasswd
iwconfig
mbse-bbs
nCompress
orzHttpd
psUtils
rsync
sharutils
socat
squirrel mail
tipxd
xgalaga
xtokkaetama
coolplayer
destiny
dizzy
galan
gsplayer
muse
soritong
1 10 100 1000 10000 100000
2 Unknown Bugs:FreeRadius,GnuGol
39
Limitations• We do not claim to find all exploitable bugs
• Given an exploitable bug, we do not guarantee we will always find an exploit
• Lots of room for improving symbolic execution, generating other types of exploits (e.g., info leaks), etc.
• We do not consider defenses, which may defend against otherwise exploitable bugs– Q [Schwartz et al., USENIX 2011]
But Every Report is Actionable
40
Related Work• APEG [Brumley et al., IEEE S&P 2008]
– Uses patch to locate bug, no shellcode executed
• Automatic Generation of Control Flow Hijacking Exploits for Software Vulnerabilities
[Heelan, MS Thesis, U. of Oxford 2009]– Creates control flow hijack from crashing input
• AEG [Avgerinos et al., NDSS 2011]
– Find and generate exploits from source code
• BitBlaze, KLEE, Sage, S2E, etc.– Symbolic execution frameworks
41
Conclusion• Mayhem automatically generated 29 exploits
against Windows and Linux programs
• Hybrid Execution– Efficient resource management for symbolic
execution
• Index-based Memory Modeling– Handle symbolic memory in real-world
applications