
Transcript of “Automated Whitebox Fuzz Testing” (NDSS 2008) by Patrice Godefroid, Michael Y. Levin, and David Molnar.

Page 1:

Automated Whitebox Fuzz Testing
Network and Distributed System Security (NDSS) 2008
by Patrice Godefroid, Michael Y. Levin, and David Molnar

Presented by Diego Velasquez

Page 2:

Acknowledgments

Figures are copied from the paper.

Some slides were taken from the authors’ original presentation.

Page 3:

Outline

Summary: Goals, Motivations, Methods, Experiments, Results, Conclusions

Review: Strengths, Weaknesses, Extensions

References

Page 4:

Goals

Propose a novel methodology that performs fuzz testing efficiently.

Introduce a new search algorithm for systematic test generation.

Showcase their system, SAGE (Scalable, Automated, Guided Execution).

Page 5:

Methods

Fuzz testing feeds random data to the inputs of an application in order to find defects in a software system. It is heavily used in security testing.

Pros: cost effective and can find most known bugs.
Cons: limited by certain kinds of branches. For example, on project 2, triggering bug #10 requires executing the if statement below:

if (address == 613 && value >= 128 && value < 255) // Bug #10
    printf("BUG 10 TRIGGERED");

This has only a (1 in 5000) * (127 in 2^32) chance of executing, given that there are just 5000 possible addresses and value is a random 32-bit input.
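To make those odds concrete, here is a minimal blackbox-fuzzing sketch (my own illustration, not from the paper); the trial count and names are hypothetical:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy target: the guarded statement from the slide. Returns 1 when the
   bug branch fires. */
static int target(uint32_t address, uint32_t value) {
    if (address == 613 && value >= 128 && value < 255) /* Bug #10 */
        return 1;
    return 0;
}

int main(void) {
    const long trials = 10 * 1000 * 1000;
    long hits = 0;
    srand(42);
    for (long i = 0; i < trials; i++) {
        uint32_t address = (uint32_t)(rand() % 5000);  /* 5000 possible addresses */
        uint32_t value = ((uint32_t)rand() << 16) ^ (uint32_t)rand(); /* ~random 32-bit value */
        hits += target(address, value);
    }
    /* Expected hit rate is about (1/5000) * (127/2^32): effectively zero. */
    printf("%ld hits in %ld random trials\n", hits, trials);
    return 0;
}

Running this typically prints zero hits, which is exactly the limitation whitebox fuzzing targets.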

Page 6:

Methods Cont. Whitebox Fuzz Testing

Combines fuzz testing with dynamic test generation [2]:
Run the code with some initial input.
Collect constraints on the inputs with symbolic execution.
Generate new constraints by negating branch conditions.
Solve the constraints with a constraint solver.
Synthesize new inputs. (A minimal sketch of one round follows below.)
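A minimal, self-contained sketch of one round of this loop (my own toy, not the authors' implementation), using the 4-byte example discussed on the next two slides; "solving" is trivial here because every constraint is a single byte comparison:

#include <stdio.h>
#include <string.h>

/* Program under test, modeled on the paper's 4-byte example: it "crashes"
   when at least three input bytes match "bad!". */
static const char KEY[4] = {'b', 'a', 'd', '!'};

static int crashes(const char in[4]) {
    int cnt = 0;
    for (int i = 0; i < 4; i++)
        if (in[i] == KEY[i]) cnt++;
    return cnt >= 3;
}

/* "Symbolic execution" of one trace: in this toy, the path constraint is
   just, for each branch i, whether in[i] == KEY[i] held on the run. */
static void collect_path_constraint(const char in[4], int taken[4]) {
    for (int i = 0; i < 4; i++)
        taken[i] = (in[i] == KEY[i]);
}

int main(void) {
    char seed[5] = "good";
    int taken[4];
    collect_path_constraint(seed, taken);

    /* Negate the i-th branch constraint, keep the ones before it, and
       "solve" by construction: each solution is a new test input. */
    for (int i = 0; i < 4; i++) {
        char child[5];
        memcpy(child, seed, 5);
        if (!taken[i])
            child[i] = KEY[i];   /* force branch i to be taken instead */
        printf("child %d: %s%s\n", i, child, crashes(child) ? "  <-- crash" : "");
    }
    return 0;
}

For the seed 'good' this prints the children bood, gaod, godd, and goo!, the first generation of the search space shown in Figure 2.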

Page 7:

Methods Cont. The Search Algorithm

Figure 1 from [1]

Blackbox fuzzing will do poorly in this case; dynamic test generation can do better.

Page 8:

Methods Cont. Dynamic Approach

Using the input ‘good’ as an example: collect the constraints from the trace, then create new path constraints.

Figure 2 from [1]
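Concretely, if the toy program from the earlier sketch stands in for Figure 1, running on 'good' yields the path constraint ⟨i0 ≠ 'b', i1 ≠ 'a', i2 ≠ 'd', i3 ≠ '!'⟩. Negating one conjunct at a time while keeping the ones before it produces four new path constraints, and solving them gives the inputs bood, gaod, godd, and goo!.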

Page 9:

Methods Cont. Limitations of Dynamic Testing

Path explosion: exhaustive path search doesn’t scale to large, realistic programs. It can be mitigated by modifying the search algorithm.

Imperfect symbolic execution: it can be imprecise due to complex program statements (arithmetic, pointer manipulation), and calls into the OS are expensive to model precisely. The fragments below illustrate the problem.
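For illustration (these fragments are mine, not from the paper), the kind of code that makes the collected constraints imprecise:

#include <stdint.h>
#include <stdio.h>

static uint32_t hard_cases(const uint8_t buf[4], uint32_t x) {
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++)
        h = (h ^ buf[i]) * 16777619u;  /* FNV-style hash: the solver must
                                          effectively invert it to satisfy a
                                          check like h == 0xdeadbeef */
    uint8_t t = buf[buf[0] & 3];       /* symbolic index: which byte is read
                                          depends on the input itself */
    return h + t + x;
}

int main(void) {
    uint8_t buf[4] = {1, 2, 3, 4};
    printf("%u\n", hard_cases(buf, 7));
    return 0;
}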

Page 10:

Methods Cont. New Generational Search Algorithm

Figures 3 and 4 from [1]

A kind of breadth-first search with a heuristic that yields more input test cases. The score of a test case is the number of new code blocks its run covers.

Page 11:

Methods Cont. Summary of the Generational Search Algorithm

Push the seed input onto the worklist.
Run&Check(input) checks for bugs on that input.
Traverse the worklist, selecting the next item by score.
Expand child paths and add them to the childlist.
Traverse the childlist: Run&Check each child, assign its score, and add it to the worklist.

ExpandExecution generates the path constraint, then attempts to expand (negate) its constraints and saves the resulting inputs. input.bound limits the backtracking of each sub-search above the branch. A toy sketch of the whole loop follows below.
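Below is a toy sketch of the generational search (my own reconstruction under stated assumptions, not Figures 3 and 4 themselves): the four branches of the earlier toy program stand in for basic blocks, the score is the number of previously unseen blocks a run covers, and bound keeps a child from re-negating constraints its parent already expanded.

#include <stdio.h>

static const char KEY[4] = {'b', 'a', 'd', '!'};
static int seen[4];                     /* blocks covered by any run so far */

typedef struct { char in[5]; int bound; int score; } Item;

/* Run&Check plus scoring: returns 1 if the run "crashes" (>= 3 matches). */
static int run_and_score(Item *it) {
    int cnt = 0;
    it->score = 0;
    for (int i = 0; i < 4; i++) {
        if (it->in[i] == KEY[i]) {
            cnt++;
            if (!seen[i]) { seen[i] = 1; it->score++; }
        }
    }
    return cnt >= 3;
}

int main(void) {
    Item work[64];
    int n = 0;
    Item seed = { "good", 0, 0 };
    run_and_score(&seed);
    work[n++] = seed;

    while (n > 0) {
        int best = 0;                   /* pick the highest score (a priority
                                           queue in the real algorithm) */
        for (int i = 1; i < n; i++)
            if (work[i].score > work[best].score) best = i;
        Item cur = work[best];
        work[best] = work[--n];

        /* ExpandExecution: negate each branch constraint at or after bound;
           bound limits backtracking into the parent's expanded prefix. */
        for (int j = cur.bound; j < 4 && n < 64; j++) {
            Item child = cur;
            child.in[j] = KEY[j];       /* flip branch j to "taken" */
            child.bound = j + 1;
            if (run_and_score(&child)) {
                printf("crash found: %s\n", child.in);
                return 0;
            }
            work[n++] = child;
        }
    }
    return 0;
}

Starting from 'good', this finds a crashing input after a couple of generations, mirroring the search space of Figure 2.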

Page 12:

Experiments

Can test any file-reading program running on Windows by treating bytes read from files as symbolic input.

Another key novelty of SAGE is that it performs symbolic execution of program traces at the x86 binary level.

Figure from [2]
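A rough illustration of that idea (mine, not SAGE's actual machinery; the file name and constraint format are hypothetical): every byte read from the input file is tagged with its file offset, and a branch on tagged data contributes a constraint over that offset rather than over a concrete byte.

#include <stdio.h>

/* A tagged byte: value plus the file offset it came from (-1 = concrete). */
typedef struct { unsigned char value; long tag; } TaggedByte;

static TaggedByte read_tagged(FILE *f) {
    TaggedByte b = { 0, -1 };
    b.tag = ftell(f);                  /* offset of the byte about to be read */
    int c = fgetc(f);
    if (c != EOF) b.value = (unsigned char)c;
    return b;
}

/* In a real tracer this would be emitted while replaying the x86 trace,
   e.g. for a compare instruction whose operand is tagged. */
static void record_constraint(TaggedByte b, unsigned char k, int taken) {
    printf("input[%ld] %s %u\n", b.tag, taken ? "==" : "!=", (unsigned)k);
}

int main(void) {
    FILE *f = fopen("seed.bin", "rb"); /* hypothetical seed file */
    if (!f) return 1;
    TaggedByte b = read_tagged(f);
    int taken = (b.value == 0x42);     /* the program branches on input data */
    record_constraint(b, 0x42, taken);
    fclose(f);
    return 0;
}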

Page 13:

Experiments Cont. SAGE Advantages

Not source-based: SAGE is machine-code-based, so it can handle programs written in different languages.

Expensive to build at the beginning, but less expensive over time.

Tests after shipping: since it performs symbolic execution on binary code, SAGE can detect bugs after the production phase. No source is needed, unlike in other systems.

SAGE doesn’t even need knowledge of specific data types or structures, which are not easily visible in machine code anyway.

Page 14:

Experiments Cont. MS07-017: Vulnerabilities in Graphics Device Interface (GDI) Could Allow Remote Code Execution

Tested on different apps such as image processors, media players, and file decoders. [2]

Many bugs found were rated “security critical, severity 1, priority 1”. [2]

Now used regularly by several teams as part of the QA process. [2]

Page 15:

Experiments Cont. More on MS07-017. The hex dumps below are from [2]: the first is the seed input, the second is the crashing test case.


RIFF...ACONLISTB...INFOINAM....3D Blue Alternate v1.1..IART....................1996..anih$...$...................................rate....................seq ....................LIST....framicon......... ..

RIFF...ACONBB...INFOINAM....3D Blue Alternate v1.1..IART....................1996..anih$...$...................................rate....................seq ....................anih....framicon......... ..

Only a 1 in 2^32 chance of hitting this at random!

Page 16:

Results

Statistics from 10-hour searches on seven test applications, each seeded with a well-formed input file.

Page 17:

Results

Focused on the Media 1 and Media 2 parsers. Ran a SAGE search for the Media 1 parser with five “well-formed” media files and five bogus files.

Figure 7 from [1]

Page 18:

Results Compared with the Depth-First Search Method

DFS ran for 10 hours on Media 2 with wff-2 and wff-3 and didn’t find anything; the generational search found 15 crashes.

Symbolic execution is slow.

Well-formed inputs work better than bogus files.

Coverage results are non-deterministic.

The heuristic scoring didn’t have much impact.

Divergences are common.

Page 19:

Results Most bugs found are “shallow”

Figure from [2]: bar chart of the number of unique first-found bugs (y-axis, 0 to 3.5) per generation (x-axis, 1 to 7).

Page 20:

Conclusions: Blackbox vs. Whitebox Fuzzing

Cost/precision tradeoffs: blackbox is lightweight, easy, and fast, but has poor coverage; whitebox is smarter, but complex and slower.

Recent “semi-whitebox” approaches are less smart but more lightweight: Flayer (taint-flow analysis, may generate false alarms), Bunny-the-fuzzer (taint-flow, source-based, heuristics to fuzz based on input usage), autodafe, etc.

Which is more effective at finding bugs? It depends… Many apps are so buggy, any form of fuzzing finds bugs! Once the low-hanging bugs are gone, fuzzing must become smarter: use whitebox and/or user-provided guidance (grammars, etc.). Bottom line: in practice, use both!

*Slide from [2]

Page 21:

Strengths

Novel approach to fuzz testing.

Introduced a new search algorithm that uses a code-coverage-maximizing heuristic.

Can be applied as a black box: no source code is needed, since symbolic execution is performed on the program at the x86 binary level.

Shows results compared against prior work; testing large, previously tested applications found more bugs.

Introduced a full system and applied the paper's novel ideas in it.

Page 22:

Weaknesses

The results were non-deterministic: the same input, program, and approach gave different results.

Only focuses on specific areas: x86 Windows applications and file-manipulating applications.

Well-formed inputs still amount to a type of regular fuzz testing.

SAGE needs help from different tools.

In my opinion the paper dwells too much on the implementation of SAGE, and the system could be too specific to Microsoft.

Page 23:

Extensions

Make SAGE more general: easier to port to other architectures and usable for other types of applications, such as Linux-based ones.

Find a better way to create input files, perhaps by using a grammar.

Make the system deterministic: getting different results makes me think it may not be reliable.

Page 24:

References

[1] P. Godefroid, M. Y. Levin, and D. Molnar, “Automated Whitebox Fuzz Testing,” NDSS, 2008.

[2] Original presentation slides: www.truststc.org/pubs/366/15%20-%20Molnar.ppt

[3] Wikipedia, “Fuzz testing”: http://en.wikipedia.org/wiki/Fuzz_testing

Page 25:

Questions, Comments or Suggestions?