Template-Guided Concolic Testing via Online Learning - Korea...

Post on 08-Mar-2021

5 views 0 download

Transcript of Template-Guided Concolic Testing via Online Learning - Korea...

Template-Guided Concolic Testing via Online Learning

Sooyoung Cha

Korea University

ASE'18 @Montpellier, France

(co-work with Seonho Lee and Hakjoo Oh)

2

Concolic Testing

● Concolic testing (Concrete and Symbolic executions)– An effective software testing method.– SAGE : Find 30% of all Windows 7 WEX security bugs.

SAGE

3

Concolic Testing

● Concolic testing (Concrete and Symbolic executions)– An effective software testing method.– SAGE : Find 30% of all Windows 7 WEX security bugs.

● Open Challenge: Path Explosion– # of execution paths: 2

● ex) grep-2.2(3,836): 2 paths (worst case) – Exploring all paths is impossible.

3,836

SAGE

# of branches

4

My Research Area

● Motivation. Search Heuristic

Path-ExplosionSearch Space Reduction

Seed Input, Constraint Solver, …Experts

(Manually)

5

My Research Area

● Motivation. Search Heuristic

Path-ExplosionSearch Space Reduction

Seed Input, Constraint Solver, …Experts

(Manually)

● Data-Driven Concolic Testing.

Search Heuristic (ICSE’18)

Search Space Reduction (ASE’18) Machine

(Automatically)

6

Search Heuristic

● Selecting branches that are likely to maximize code coverage.● Having its own criteria to pick a branch.

– DFS, BFS, Random, Generational, CFDS, CGS, ParaDySE, ...

b1

b2

b3

Solve(b1∧b2∧¬b3)

b1

b2

b3b4

b5

Solve(b1∧¬b2)

1st execution 2nd execution

...

Motivation

● Using search heuristics alone is not sufficient.– Code coverage converges in practical settings.

7h and 10 cores in parallel !!

8

Goal: Draw red graph line

● Improving branch coverage in practical settings.– Branch coverage Bug-finding↑ → Bug-finding → Bug-finding ↑ → Bug-finding

9

Key Observation

● Tracks all input values as symbolic. – Conventional Concolic Testing.

α1

α2

α3

α4

α5

α6

α7

α8Input :

search heuristic

...

...

......

......

......

10

Key Observation

...

α1

α2

α3

α4

α5

α6

α7

α8Input :

search heuristic

...

Reducing the search space of concolic testing !

search heuristic

...

...

......

......

......

● Tracks all input values as symbolic. – Conventional Concolic Testing.

11

Key Ideas

● Template-Guided Concolic Testing.– Template: a partially symbolized input.

ex)

● Selectively treat input values as symbolic.● Replace unselected input values with concrete inputs.

● Online Learning.– Automatically generating, using, and refining templates.

‘-’ ‘g’ α3

α4 ‘1’ ‘d’ α

8

12

Effectiveness

● Achieve greater branch coverage in practical settings.

● Find real bugs in the latest versions of C programs. – grep-3.1, sed-4.4, gawk-4.21

13

Template-Guided Concolic Testing

● Template– A set of concrete values and their positions.

(ex) Template1: {(3, ‘-’), (4, ‘S’), (6, ‘A’), (8,‘L’)}

● Challenge: Finding effective templates ! – Choosing input values to track symbolically.– Replacing the remaining inputs with appropriate concrete values.

α1

α2

α3

α4

α5

α6

α7

α8

α1

α2 ‘-’ ‘S’ α

5 ‘A’ α7 ‘L’

Conventional Concolic Testing Template-Guided Concolic Testing

14

Template-Guided Concolic Testing with Online Learning

1. ConventionalConcolic Testing

2. SequentialPatternMining

3. PatternRanking

4. Pattern to Template

5. Template-GuidedConcolic Testing

pgm

● Goal – Perform concolic testing while learns useful templates online.

15

1. Conventional Concolic Testing

● Collect effective test-cases during concolic testing. – “Effective test-cases” :– Collecting all test-cases can cause serious performance degradation.

ConventionalConcolic Testing

Effective Test-Cases− X * *− 2 R L2 X ? #

...− Y − 5

− − P −− − s y− − c l

...3 − s h

Input2 :Input1 :

α1 α2 α3 α4 α5 α6 α7 α8

pgm SequentialPatternMining

16

2. Sequential Pattern Mining

● Extract common patterns from the effective test-cases.– Call “sequential pattern mining” in data mining community.– Use a recent algorithm: CloFast(1).

ex) 14,604 effective test-cases → Bug-finding 6,176 patterns (in 5 min)

(1). Fabio Fumarola, Pasqua Fabiana Lanotte, Michelangelo Ceci, and Donato Malerba. 2016. CloFAST: closed sequential pattern mining using sparse and vertical id-lists. Knowledge and Information Systems.

− − s

− − − − X − − − X −

P1 :

P2 : P3 :P4 :

Candidate PatternsEffective Test-Cases− X * *− 2 R L2 X ? #

...− Y − 5

− − P −− − s y− − c l

...3 − s h Sequential

PatternMining

PatternRanking

17

3. Pattern Ranking

● Choose the top-k patterns from the candidates.● The idea for ranking.

– Reflect the experience of previously evaluated patterns.

− − −

− − s − X − − − X −

P1 :

P2 : P3 :P4 : Pattern

Ranking

1. Candidate Patterns

2. Good, Bad Pattern sets

The top-K Patterns − X − −

− X − − − − − − s

Top 1 :

Top 2 : Top 3 : Top 4 :

Good P P1 : − X X −

Bad P P2 : − s −

Pattern to Template

18

4. Pattern to Template

● Transform the top-k patterns into templates.

(1). Collect the test-cases containing the pattern.

(2). Identify the positions where each value appears most frequently.

− −

Effective Test-Cases− X * *− 2 R X− X ? #

− − P −− − s y− c − l

P1: − X − − + − XT1:

Templates− X T1 :

− X − T2 :

The top-K Patterns − X − −

− X − − − − − − s

Top 1 :

Top 2 : Top 3 : Top 4 : Pattern to

TemplateTemplate-GuidedConcolic Testing

− −

19

5. Template-Guided Concolic Testing

● Run concolic testing with templates.– Evaluate the quality of each template. – Accumulate in good or bad patterns.

● Good P : # of branches covered by T1 > threshold● Bad P: # of branches covered by T2 ≤ 1

Templates− X

− X −Template-GuidedConcolic Testing

− − PatternRanking

ConventionalConcolic Testing

Good P P1 : − X X − +P3: − X − −

Bad P P2 : − s − +P4: − X −

T1 :

T2 :

20

Template-Guided Concolic Testing with Online Learning

● Can select more useful patterns based on increased knowledge.

1. ConventionalConcolic Testing

2. SequentialPatternMining

3. PatternRanking

4. Pattern to Template

5. Template-GuidedConcolic Testing

pgm

Good P

P1: − X X −

P3: − X − −

+ P19: − d u

Bad P

P2 : − s −

P4: − X −

+ P20: − k p

Knowledge

21

Experiments

● Implemented in CREST.● Used 5 open-source C programs.

● Compared with conventional concolic testing.– CGS, CFDS, Random, Generational, DFS, ParaDySE.

● Applied our technique on the best search heuristic.– T-CGS, T-CFDS.

Program # Total branches LOC

vim-5.7 35,464 165K

gawk-3.0.3 8,038 30K

grep-2.2 3,836 15K

sed-1.17 2,650 9K

tree-1.6.0 1,440 4K

22

The Same Evaluation Settings

● The same testing budget.– vim: 70h

– gawk, grep, sed, tree: 7h

● The same cores.– using 10 cores in parallel.

● The same initial inputs.

23

Effectiveness

● Accumulated branch coverage (5 red lines )

T-CGS exclusively covered 833 branches

24

Effectiveness

● Bug-finding.– The five bug inputs for the latest versions of C programs.– our technique(all inputs) > conventional(2/5 inputs)

25

Effectiveness

● Bug-finding.– The five bug inputs for the latest versions of C programs.– our technique(all inputs) > conventional(2/5 inputs)

● Demo: attacking our lab server. – All the memory of our lab server will be exhausted.

26

Learned Patterns

● Top 5 good and bad patterns for increasing branch coverage.– Good and bad patterns are hardly distinguishable.

27

Learned Patterns

● Top 5 good and bad patterns for increasing branch coverage.– Good and bad patterns are hardly distinguishable.

Manually selecting good patterns is highly tricky.

>

Tool

● Make our tool publicly available.● Name our tool via template-guided approach.

– Challenge : Choosing input values to track symbolically.

C o n c o l i c T e s t i n g

Tool: ConTest

● Make our tool publicly available.● Name our tool via template-guided approach.

– Challenge : Choosing input values to track symbolically.

– Learned Template:● {(0, ‘C’), (1, ‘o’), (2, ‘n’), (9,‘T’), (10, ‘e’), (11, ‘s’), (12, ‘t)}

C o n T e s t

Thank You

URL: https://github.com/kupl/ConTest