1

Marple: A Demand-Driven Path-Sensitive Buffer Overflow Detector

Wei Le and Mary Lou Soffa

University of Virginia

2

Motivation: Buffer Overflow

• 20 years since exploited by the Morris worm

• Always a popular attack vector

– E.g., 482 new exploitable vulnerabilities, 204 buffer overflows reported by SecuriTeam in 2007

• Remains due to legacy code and the fact that many companies still depend heavily on C and C++

3

Challenge: Reduce attacks

• Detect and report where vulnerabilities occur

• Determine cause and remove it

• Be automatic and usable with manageable manual effort

• Scale to large software

4

Our Goals and Overall Approach

A framework, Marple, for detecting buffer overflow:

• As precise as possible

• Helpful for understanding and removing overflow

• Scalable

• Key idea: Identify paths that lead to buffer overflow

• Approach:

– Interprocedural and path-sensitive, for precision and to help diagnosis

– Demand-driven for scalability

5

Outline of the talk

• Value of paths and paths classification

• Demand-driven analysis

• Vulnerability model

• Framework summary

• Experiments

• Conclusions

6

Path-Insensitive: Detecting an Overflow

[Figure: control flow graph with five nodes]

1: i = strlen(a→q_user)

2: i ≥ sizeof(buf0)?

3 (yes): buf = xalloc(i+1)

4 (no): buf = buf0

5: strcpy(buf, a→q_user)

Facts at node 5, merged across both branches: buf = xalloc(i+1) ∨ buf0; i ≥ sizeof(buf0); i < sizeof(buf0)

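7

Path-Sensitive: Detecting an Overflow

[Figure: control flow graph with five nodes]

1: i = strlen(a→q_user)

2: i ≥ sizeof(buf0)?

3 (yes): buf = xalloc(i+1)

4 (no): buf = buf0

5: strcpy(buf, a→q_user)

Facts at node 5, kept separately per path:

Path through node 3: i ≥ sizeof(buf0), buf = xalloc(i+1)

Path through node 4: i < sizeof(buf0), buf = buf0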
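The difference between the two slides can be sketched in a few lines of Python. This is an illustrative model, not Marple's implementation: `BUF0_SIZE` and all function names are assumptions made for the example, and sizes are plain integers rather than symbolic values.

```python
# Illustrative model of the five-node example; not Marple's implementation.
# BUF0_SIZE and all function names are assumptions made for this sketch.
BUF0_SIZE = 16  # hypothetical sizeof(buf0)


def paths(i):
    """Yield (branch condition holds, size of buf) for each branch, given i = strlen(a->q_user)."""
    yield (i >= BUF0_SIZE, i + 1)     # node 3: buf = xalloc(i + 1)
    yield (i < BUF0_SIZE, BUF0_SIZE)  # node 4: buf = buf0


def path_sensitive_overflow(i):
    """strcpy writes i + 1 bytes; check each path against its own branch condition."""
    return any(holds and i + 1 > size for holds, size in paths(i))


def path_insensitive_overflow(i):
    """Merge facts at node 5: buf may be either buffer, whatever the branch outcome."""
    return any(i + 1 > size for _, size in paths(i))
```

Under this model the merged view raises a warning for every `i >= BUF0_SIZE`, because it pairs a long string with `buf0`, while the per-path view never does, because that pairing lies on no feasible path.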

8

Path-Insensitive: Reporting an Overflow

[Figure: control flow graph of wu-ftpd 2.6.2 realpath.c, nodes 1–8, with yes/no branch labels. Statements shown in the graph:]

rootd = 1 (one branch), rootd = 0 (other branch)

strlen(wbuf)+rootd+1+strlen(resolved) > LEN

rootd == 0

strcat(resolved, "/")

strcat(resolved, wbuf)

exit

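9

Path-Sensitive: Reporting an Overflow

[Figure: control flow graph of wu-ftpd 2.6.2 realpath.c, nodes 1–8, with yes/no branch labels and individual paths marked Safe, Overflow, and Infeasible. Statements shown in the graph:]

rootd = 1 (one branch), rootd = 0 (other branch)

strlen(wbuf)+rootd+1+strlen(resolved) > LEN

rootd == 0

strcat(resolved, "/")

strcat(resolved, wbuf)

exit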

10

Five Types of Paths

• Infeasible: no input can exercise the path

• Safe: no input can overflow the buffer

• Vulnerable: users can write any content to the buffer

• Overflow-user-independent: the buffer content is statically determinable

• Don’t-know: the buffer status cannot be judged statically
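One way to read the five categories is as the outcome of three yes/no/unknown questions about a path. The sketch below is a hypothetical encoding: the `PathType` enum, the `classify` function, and in particular the order in which the questions are asked are assumptions for illustration, not something the slides specify.

```python
from enum import Enum


class PathType(Enum):
    INFEASIBLE = "infeasible"
    SAFE = "safe"
    VULNERABLE = "vulnerable"
    OVERFLOW_USER_INDEPENDENT = "overflow-user-independent"
    DONT_KNOW = "don't-know"


def classify(feasible, can_overflow, user_controlled):
    """Each argument is True, False, or None when it cannot be decided statically."""
    if feasible is False:
        return PathType.INFEASIBLE          # no input exercises the path
    if feasible is None or can_overflow is None:
        return PathType.DONT_KNOW
    if not can_overflow:
        return PathType.SAFE                # no input overflows the buffer
    if user_controlled is None:
        return PathType.DONT_KNOW
    if user_controlled:
        return PathType.VULNERABLE          # user decides what is written
    return PathType.OVERFLOW_USER_INDEPENDENT
```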

11

Demand-Driven Analysis for Buffer Overflow

• Two Steps:

– Find all potentially overflow statements in the program

– Working backwards, examine the paths from each potentially overflow statement to the entry to see if an overflow can occur

• Benefits: scalability and natural parallelism
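The two steps can be sketched on the five-node example graph from the earlier slides. This is a minimal illustration under stated assumptions: `WRITE_CALLS`, `find_pos`, and `backward_paths` are hypothetical names, statements are plain strings, and cycles are simply skipped rather than handled as a real analysis would.

```python
# Sketch only; WRITE_CALLS, find_pos and backward_paths are hypothetical names.
WRITE_CALLS = ("strcpy", "strcat", "strncpy")

STATEMENTS = {
    1: "i = strlen(a->q_user)",
    2: "i >= sizeof(buf0)",
    3: "buf = xalloc(i+1)",
    4: "buf = buf0",
    5: "strcpy(buf, a->q_user)",
}
PREDS = {2: [1], 3: [2], 4: [2], 5: [3, 4]}  # predecessor edges of the graph


def find_pos(statements):
    """Step 1: collect nodes whose statement calls a buffer-writing function."""
    return [n for n, text in sorted(statements.items())
            if text.startswith(WRITE_CALLS)]


def backward_paths(preds, pos, entry):
    """Step 2: walk predecessor edges from pos until the entry is reached."""
    stack = [(pos, (pos,))]
    while stack:
        node, suffix = stack.pop()
        if node == entry:
            yield suffix
            continue
        for p in preds.get(node, ()):
            if p not in suffix:  # ignore cycles in this sketch
                stack.append((p, (p,) + suffix))
```

In this sketch each call to `backward_paths` starts from a single statement and shares no state with other calls, so different potentially overflow statements could be processed independently.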

12

Vulnerability Model

5-tuple (POS, δ, UPS, γ, r), where POS and UPS are finite sets, and

• POS: set of potentially overflow statements

• δ: mapping POS → Q, where Q is the set of buffer queries

• UPS: set of statements where queries are updated

• γ: mapping UPS → E, where E is the set of equations

• r: general security policy to judge the termination of the search

13

Partial Vulnerability Model for Buffer Overflow

POS/UPS | Query | Equations
strcpy(a,b) | Size(a) > Len(b) | Len’(a) = Len(b)
strcat(a,b) | Size(a) > Len(a) + Len(b) | Len’(a) = Len(b) + Len(a)
strncpy(a,b,n) | Size(a) > Min(Len(b), n) | (Len’(a) = ∞ && Len(b) >= n) || (Len’(a) = Len(b) && Len(b) < n)
a[i] = ’t’ | Size(a) > i | Len’(a) = ∞

Security policy - after a write to the buffer, the declared buffer size must be no less than the length of the string stored in the buffer

Answers - infeasible, safe, vulnerable, overflow-input-independent, and don’t-know
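The rows of the partial model translate directly into small functions: a query is the safety condition raised at the statement, and an equation gives the new string length of `a` after the statement. The function names below are hypothetical and lengths are plain integers; the formulas themselves are copied from the table.

```python
import math

# Function names are hypothetical; the formulas follow the table above.


def query_strcpy(size_a, len_b):
    return size_a > len_b


def update_strcpy(len_b):
    return len_b


def query_strcat(size_a, len_a, len_b):
    return size_a > len_a + len_b


def update_strcat(len_a, len_b):
    return len_b + len_a


def query_strncpy(size_a, len_b, n):
    return size_a > min(len_b, n)


def update_strncpy(len_b, n):
    # strncpy writes no terminator when len_b >= n, so the length is unbounded.
    return math.inf if len_b >= n else len_b


def query_index_write(size_a, i):
    return size_a > i


def update_index_write():
    return math.inf
```

For example, `query_strcat(8, 3, 4)` holds because 8 > 3 + 4, while `query_strcat(8, 4, 4)` does not.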

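14

Demand-Driven Analysis: An Example

[Figure: control flow graph of the wu-ftpd realpath.c fragment, nodes 1–8, with yes/no branch labels, annotated with the queries propagated backwards from the strcat statements. Statements shown in the graph:]

…… char resolved[LEN]

rootd = 1 (one branch), rootd = 0 (other branch)

strlen(wbuf)+rootd+1+strlen(resolved) > LEN

rootd == 0

strcat(resolved, "/")

strcat(resolved, wbuf)

exit

Query annotations: Q(s<l, f), Q(s+1<l, f), Q(LEN-rootd<l, f), Q(LEN<l, f)

Outcomes marked on the graph: Solved, Infeasible

Legend: s: strlen(resolved)+strlen(wbuf); l: sizeof(resolved); f: wbuf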
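The outcomes for this fragment can be cross-checked with a brute-force forward enumeration. Note that this is not Marple's backward query propagation, only a small check of the same fragment. Assumptions: `LEN = 8` is an arbitrary hypothetical size, sizeof(resolved) equals LEN as the declaration on the slide suggests, the `"/"` is appended only when `rootd == 0`, and an overflow means the final string plus its terminator no longer fits.

```python
LEN = 8  # hypothetical sizeof(resolved)


def classify_path(rootd, takes_slash_branch, s):
    """Classify one path for a given s = strlen(resolved) + strlen(wbuf)."""
    if takes_slash_branch != (rootd == 0):
        return "infeasible"   # branch outcome contradicts the value of rootd
    if s + rootd + 1 > LEN:
        return "exit"         # the length check rejects this input
    final_len = s + (1 if takes_slash_branch else 0)
    # The string needs final_len + 1 bytes including the terminator.
    return "overflow" if final_len >= LEN else "safe"


def summarize(rootd, takes_slash_branch):
    """Collect the outcomes over all small values of s, ignoring rejected inputs."""
    results = {classify_path(rootd, takes_slash_branch, s) for s in range(2 * LEN)}
    results.discard("exit")
    return results
```

With `rootd == 0` the check admits `s + 1 == LEN`, which leaves no room for the terminator after the `"/"` is appended. This corresponds to the query Q(LEN<l, f) on the slide, which cannot hold when l is LEN.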

15

Marple Framework

[Figure: framework diagram. Components shown:]

Program Source

The Vulnerability Model: POS, Queries, Equations, Policy

Detect Infeasible Paths

The Demand-Driven Path-Sensitive Analyzer: Raise Queries, Propagate Queries, Update Queries, Evaluate Queries (yes/no), Propagate Answers

Assist Diagnosis

Path Classification

Root Cause Information

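16

User Scenario

[Figure: paths from Entry to a POS]

17

User Scenario

[Figure: paths from Entry to a POS, marked Vulnerable and Overflow User Independent]

18

User Scenario

[Figure: paths from Entry to a POS, marked Vulnerable and Overflow User Independent]

19

User Scenario

[Figure: paths from Entry to a POS, marked Vulnerable and Overflow User Independent, with the Root Cause highlighted]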

20

Experiments

• Goals

– More precisely find vulnerabilities

– False positives in vulnerable set

– Scalable

– Help in diagnosis

– Comparison with other tools

• Experimental Setup

– Microsoft Phoenix, Disolver

– BugBench, Buffer Overflow Benchmark, MechCommander2 (570.9K)

21

Results: Detection

Benchmark | POS | Detected Bugs: Reported | Detected Bugs: New
polymorph | 15 | 3 | 4
ncompress | 38 | 1 | 11
gzip | 38 | 1 | 9
bc | 245 | 3 | 3
wu-ftp | 13 | 4 | 0
sendmail | 21 | 2 | 2
BIND | 48 | 1/0 | 0
MechCommander2 | 1512 | 1/0 | 28/1

• Detect 14 out of 16 documented overflows

– 1 don’t-know: library call

– 1 missing: function pointers

• Report 57 new overflows

– Same path of different buffers

• Generate 1 false positive due to integer range analysis

22

Results: Path-Sensitivity

Benchmark | POS | Path Prioritization: V | O | U
polymorph | 15 | 6 | 1 | 2
ncompress | 38 | 8 | 4 | 12
gzip | 38 | 7 | 3 | 18
bc | 245 | 2 | 3 | 108
wu-ftp | 13 | 3 | 1 | 4
sendmail | 21 | 3 | 1 | 6
BIND | 48 | 0 | 0 | 22
MechCommander2 | 1512 | 28 | 0 | 487

• All types of paths occur

• 108 don’t-knows from bc

– 43 complex pointers

– 28 recursive procedures

– 15 loops

– 12 non-linear operations

– 8 library calls

23

Results: Root Cause

Benchmark | POS | Root Cause Info: Stmt | Ave. No
polymorph | 15 | 2.9 | 1.7
ncompress | 38 | 3.9 | 1.0
gzip | 38 | 4.2 | 1.7
bc | 245 | 7.1 | 1.0
wu-ftp | 13 | 6.8 | 1.0
sendmail | 21 | 6.5 | 1.2
BIND | 48 | N/A | N/A
MechCommander2 | 1512 | 9.4 | 1.0

• Highlight statements that update the query during analysis as root cause information

• On average, fewer than 10 statements highlighted

• Path-sensitive root cause exists

24

Marple with static tools

• Used Buffer Overflow Benchmark – 14 programs

– “Bad” version – several overflows marked

– “Good” version – overflows fixed

• Static Tools: Archer, Boon, UNO, Splint and Polyspace (commercial tool)

• Criteria: probability of detection and probability of false alarms

25

Marple with static tools

[Figure: ROC Curve. x-axis: P(f), probability of false alarms, 0 to 1; y-axis: P(d), probability of detection, 0 to 1. Points are given as (P(f), P(d)):]

BOON

ARCHER, UNO

Splint (0.43, 0.57)

PolySpace (0.5, 0.87)

Marple-A (0.04, 0.49)

Marple-B (0.42, 0.88)

Ideal Tool (0, 1)

Marple A – using only Vulnerable/overflow

Marple B – Marple A + Don’t know

Zitser, Lippmann and Leek, FSE

26

Performance

• Visited: 43% of nodes; 52% of procedures

• Memory – 2.5GB

• Time

– MechCommander2 (575K lines) – 35.4 minutes

– Archer – 121 lines/sec

– IPSSA – 155 lines/sec

– Marple – 254 lines/sec

27

Related Work

• Static Detection for Buffer Overflow

– ARCHER[03xie] BOON[00wagner] ESPx[06hackett] Prefast[ms] Prefix[00bush] Splint[96evans]

• Path-Sensitive Analysis for Defects

– ARCHER[03xie] ESPx[06hackett] ESP[02das] IPSSA[03livshits] MOPS[02check] Prefix[00bush]

• Demand-Driven Approach

– A general framework[96Duesterwald]

– Application for dataflow computation[96Duesterwald], infeasible detection[97bodik], memory leak[06Orlovich], postmortem analysis[04Manevich]

28

Conclusions

• An interprocedural, demand-driven, path-sensitive buffer overflow detector for large software

• A categorization of paths to assist diagnosis

• The identification of vulnerable path segments and the statements relevant to the root cause

• Our results demonstrate that Marple is scalable and can report buffer overflows with low false positive rates and rich diagnostic information

29

Thank you and Questions?