Refining Buffer Overflow Detection via Demand-Driven Path-Sensitive Analysis

Wei Le and Mary Lou Soffa

University of Virginia

sotesty.cs.virginia.edu

Motivation

• Buffer overflow: 20 years since Morris Worm, still the most common exploit

• Challenge: eliminate exploitable buffer overflows

– Detect where buffer overflow can occur

– Determine cause and remove it

Problems of Static Approaches

• Detection precision: false positives

• Error reports do not provide much information for diagnosis

– they report only an overflow point in the program

• Not fully automatic: manual annotation is required

Our Goals and Approaches

• Goal: automatically identify paths on which a buffer overflow can occur and report the path segment that causes the overflow

• Challenge: huge number of paths

• Approach:

– interprocedural path-sensitive analysis for precision and to help diagnosis

– demand-driven analysis for scalability

Five Types of Paths

• Infeasible: no input can exercise the path

• Safe: no input can overflow the buffer

• Vulnerable: users can write any content to the buffer

• Overflow-user-independent: the buffer content is statically determinable

• Don’t-know: the buffer status cannot be judged statically
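The five path types above can be read as the outcome of three questions about a path: is it feasible, does it overflow, and does the user control the buffer content. The sketch below is only an illustration of that reading. The names `PathType` and `classify_path`, and the order in which the questions are asked, are assumptions and are not taken from the slides.

```python
from enum import Enum


class PathType(Enum):
    INFEASIBLE = "infeasible"
    SAFE = "safe"
    VULNERABLE = "vulnerable"
    OVERFLOW_USER_INDEPENDENT = "overflow-user-independent"
    DONT_KNOW = "don't-know"


def classify_path(feasible, overflow, user_controlled):
    """Classify one path from three facts about it.

    `feasible` and `overflow` are True, False, or None, where None means
    the fact could not be decided statically. `user_controlled` says whether
    the user can choose the content written to the buffer.
    The decision order is an assumption made for this sketch.
    """
    if feasible is False:
        return PathType.INFEASIBLE   # no input can exercise the path
    if overflow is False:
        return PathType.SAFE         # no input can overflow the buffer
    if feasible is None or overflow is None:
        return PathType.DONT_KNOW    # status cannot be judged statically
    if user_controlled:
        return PathType.VULNERABLE
    return PathType.OVERFLOW_USER_INDEPENDENT
```

For instance, `classify_path(True, True, False)` yields `PathType.OVERFLOW_USER_INDEPENDENT`: the path overflows, but with content the user does not choose.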

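An Example

[Figure: control-flow graph of a code fragment from wu-ftpd 2.6.2 realpath.c, with LEN = 6.

Nodes: 1 branch (y/n); 2 rootd = 1; 3 rootd = 0; 4 strlen(wbuf)+rootd+1+strlen(resolved) > LEN (y/n); 5 rootd == 0 (y/n); 6 strcat(resolved, "/"); 7 strcat(resolved, wbuf); 8 exit.

Paths through the graph are labeled Safe, Overflow, and Infeasible. An inset shows the buffers wbuf and resolved, each terminated by \0.]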
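To see why the example contains a safe, an overflowing, and an infeasible path, the fragment can be run on concrete string lengths. The sketch below is a brute-force illustration, not the analysis presented in the talk. It assumes the node numbering 1 to 8 read off the figure, treats `rootd` as a free choice because the condition at node 1 is not given, and uses the hypothetical helper names `run_fragment` and `worst_case_bytes`.

```python
from itertools import product

LEN = 6  # buffer size used on the slide: char resolved[LEN]


def run_fragment(rootd, resolved_len, wbuf_len):
    """Follow one concrete input through nodes 1-8.

    Returns (path, bytes_needed). bytes_needed is None when node 4
    takes its "y" edge and the fragment exits without copying.
    """
    path = [1, 2 if rootd == 1 else 3, 4]
    if wbuf_len + rootd + 1 + resolved_len > LEN:  # node 4, "y" edge
        return tuple(path + [8]), None
    path.append(5)
    if rootd == 0:                                 # node 5, "y" edge
        path.append(6)
        resolved_len += 1                          # node 6: strcat(resolved, "/")
    path.append(7)
    resolved_len += wbuf_len                       # node 7: strcat(resolved, wbuf)
    path.append(8)
    return tuple(path), resolved_len + 1           # +1 for the terminating NUL


def worst_case_bytes():
    """Map each exercised path through node 7 to the most bytes it ever needs."""
    worst = {}
    for rootd, r, w in product((0, 1), range(LEN), range(LEN)):
        path, needed = run_fragment(rootd, r, w)
        if needed is not None:
            worst[path] = max(worst.get(path, 0), needed)
    return worst
```

Under these assumptions, the path 1, 2, 4, 5, 7, 8 never needs more than 5 bytes (safe), the path 1, 3, 4, 5, 6, 7, 8 can need 7 bytes in a 6-byte buffer (overflow), and the remaining two paths through node 7 are never exercised, which matches the Infeasible label.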

Demand-Driven Analysis

[Figure: the control-flow graph of the example (nodes 1 to 8), preceded by the declaration "……char resolved [LEN ]" and annotated with the queries below. Annotations in the figure mark queries as Solved or Infeasible.]

s: strlen(resolved)+strlen(wbuf)

l: sizeof(resolved)

f: wbuf

Q0 (s<l, f)

Q05 (LEN-1-rootd<l, f)

Q052 (LEN-1<l, f)

Q053 (LEN-1<l, f)

Q1 (s+1<l, f)

Q15 (LEN-rootd<l, f)

Q153 (LEN<l, f)
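The queries above are raised at the string copy and pushed backward through the graph until enough is known to answer them. A minimal sketch of that backward walk on the example graph follows. It is an illustration under stated assumptions, not the authors' implementation: the edge table `PREDS`, the `Query` fields, and the function names are hypothetical, and the bound on s is taken from the "n" edge of node 4 (s + rootd + 1 <= LEN).

```python
from dataclasses import dataclass, replace
from typing import Optional

LEN = 6  # sizeof(resolved) on the slide

# Backward edges of the example graph: node -> [(predecessor, branch label)].
PREDS = {
    7: [(6, None), (5, "n")],
    6: [(5, "y")],
    5: [(4, "n")],
    4: [(2, None), (3, None)],
}
ROOTD_AT = {2: 1, 3: 0}  # node 2: rootd = 1, node 3: rootd = 0


@dataclass(frozen=True)
class Query:
    extra: int = 0                          # chars appended between here and node 7
    need_rootd_zero: Optional[bool] = None  # outcome of node 5 that the path requires


def resolve(query, rootd):
    """Answer the query once the value of rootd is known."""
    if query.need_rootd_zero is not None and (rootd == 0) != query.need_rootd_zero:
        return "infeasible"
    worst_s = LEN - 1 - rootd  # node 4, "n" edge: s + rootd + 1 <= LEN
    return "safe" if worst_s + query.extra < LEN else "overflow"


def analyze(node=7, query=Query(), suffix=(7,)):
    """Propagate the query backward from node 7; return {path segment: verdict}."""
    results = {}
    for pred, label in PREDS[node]:
        path = (pred,) + suffix
        if pred in ROOTD_AT:
            # rootd is assigned here, so the query can be resolved
            # without walking further back to node 1.
            results[path] = resolve(query, ROOTD_AT[pred])
            continue
        updated = query
        if pred == 6:
            updated = replace(query, extra=query.extra + 1)       # strcat(resolved, "/")
        elif pred == 5:
            updated = replace(query, need_rootd_zero=(label == "y"))
        results.update(analyze(pred, updated, path))
    return results
```

Calling `analyze()` stops at nodes 2 and 3, where `rootd` becomes known, so only the path segment relevant to the copy at node 7 is explored. That segment is also what a diagnosis would report as the cause of the overflow.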

The Demand-Driven Model

• PVS (potentially vulnerable statement): strcpy(a,b)

• Query: sizeof(a) > strlen(b), flag

• Information for updating queries: char a[9]

• Propagation rules: interprocedural, loop, join point, infeasible

• Resolving the query: false, flag = user input
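The elements of the model can be illustrated on a straight-line program: raise a query at the PVS, update it with information from earlier statements, and stop as soon as it resolves. The sketch below assumes a toy statement encoding (`"decl"`, `"input"`, `"const"`, `"strcpy"` tuples) and the names `Query` and `analyze`, all of which are hypothetical. It models unbounded user input as an infinite worst-case length and omits the interprocedural, loop, and join-point rules.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Query:
    buf_size: Optional[float] = None  # sizeof(a), unknown until a declaration is seen
    src_len: Optional[float] = None   # worst-case strlen(b), unknown until b's origin is seen
    user_input: bool = False          # the flag: does b come from the user?

    def resolve(self):
        """Return True/False for sizeof(a) > strlen(b), or None if still unresolved."""
        if self.buf_size is None or self.src_len is None:
            return None
        return self.buf_size > self.src_len


def analyze(statements):
    """Raise a query at the final strcpy and update it walking backward."""
    *before, pvs = statements
    kind, dst, src = pvs
    assert kind == "strcpy"
    query = Query()
    for stmt in reversed(before):
        if stmt[0] == "decl" and stmt[1] == dst:
            query.buf_size = stmt[2]                  # e.g. char a[9]
        elif stmt[0] == "input" and stmt[1] == src:
            query.src_len, query.user_input = math.inf, True
        elif stmt[0] == "const" and stmt[1] == src:
            query.src_len = len(stmt[2])
        if query.resolve() is not None:
            break                                     # resolved: stop propagating
    return query.resolve(), query.user_input
```

With `[("decl", "a", 9), ("input", "b"), ("strcpy", "a", "b")]` the result is `(False, True)`, which corresponds to the slide's resolution "false, flag = user input".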

Approach

[Figure: workflow of the analysis. Labels in the diagram: Program, Overflow Properties, PVS, Node Information, Raise Query, Propagate Query, Update Query, Resolve Query (with Yes/No branches), Propagate Results, Label Paths, Feasibility Detection, Infeasible Paths.]

Experiments

• Purpose

− Existence of the 5 types of paths

− Benefit of demand-driven analysis

• Implementation: Microsoft Phoenix APIs [phoenix]

• Benchmarks

− 9 programs, size 0.4-97.3K LOC

− the BugBench [06lu] and Buffer Overflow Benchmark [03Zitser]

Experimental Results

Benchmark          Path Types
                   Vul     CNST       UnK    Safe
polymorph-0.4.0    966     0          0      0
ncompress-4.2.4    288     0          0      0
man-1.5h1          16      0          0      24
gzip-1.2.4         1       0          0      0
bc-1.06            0       >50,000    0      >30,000
squid-2.3          0       0          4      2
wu-ftp             4320    0          0      18,624
sendmail           48      0          0      648
BIND               0       0          2      0

Experimental Results

• All defined types of paths exist

• Problematic paths manifest certain complexity

• Memory usage: 9-65MB

• Time cost: 0.24-102.6s

User Scenario

[Figure, built up over four slides: paths from the program Entry to a PVS, marked as Vulnerable or Overflow User Independent, with the Root Cause highlighted on the final slide.]

Benchmark          Average Path Size
                   #P     #B
polymorph-0.4.0    2.5    25.9
ncompress-4.2.4    2.0    27.8
man-1.5h1          1.8    14.3
gzip-1.2.4         3.0    5
squid-2.3          1.0    6.8
wu-ftp             3.8    33.6
sendmail           2.0    35.5
BIND               2.0    23.5

Related Work

• Static Detection for Buffer Overflow

− ARCHER[03xie], BOON[00wagner], ESPx[06hackett], Prefast[ms], Prefix[00bush], Splint[96evans]

• Path-Sensitive Analysis for Defects

− ARCHER[03xie], ESPx[06hackett], ESP[02das], IPSSA[03livshits], MOPS[02check], Prefix[00bush]

• Demand-Driven Approach

− A general framework[96Duesterwald]

− Application for dataflow computation[96Duesterwald], infeasible detection[97bodik], memory leak[06Orlovich], postmortem analysis[04Manevich]

Conclusions

• A categorization of five types of paths for buffer overflow

• An interprocedural demand-driven path-sensitive diagnosis tool for identifying the types of paths through a potential overflow

• Experimental results that demonstrate that these path types exist in real programs


Thank you and Questions?