Template-Guided Concolic Testing via Online Learning Sooyoung Cha Korea University ASE'18 @Montpellier, France (co-work with Seonho Lee and Hakjoo Oh)

Page 1: Template-Guided Concolic Testing via Online Learning - Korea …prl.korea.ac.kr/~sooyoung/slides/ASE18-slides.pdf · 2020. 1. 31.

Template-Guided Concolic Testing via Online Learning

Sooyoung Cha

Korea University

ASE'18 @Montpellier, France

(co-work with Seonho Lee and Hakjoo Oh)

Page 2: Concolic Testing

● Concolic testing (concrete and symbolic execution)
– An effective software testing method.
– SAGE: found 30% of all Windows 7 WEX security bugs.

Page 3: Concolic Testing

● Concolic testing (concrete and symbolic execution)
– An effective software testing method.
– SAGE: found 30% of all Windows 7 WEX security bugs.

● Open challenge: path explosion
– # of execution paths: 2^(# of branches)
– ex) grep-2.2 (3,836 branches): 2^3,836 paths (worst case)
– Exploring all paths is impossible.
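To get a feel for the worst-case bound on this slide, the count can be evaluated directly. This is only an illustrative sketch: the function names are made up here, and the bound 2^(# of branches) is taken from the slide as stated.

```python
def worst_case_paths(num_branches: int) -> int:
    """Worst-case number of execution paths, using the slide's bound 2^(# of branches)."""
    return 2 ** num_branches


def decimal_digits(n: int) -> int:
    """Number of decimal digits of a positive integer."""
    return len(str(n))


if __name__ == "__main__":
    grep_branches = 3836  # grep-2.2, as listed on the slide
    paths = worst_case_paths(grep_branches)
    print(f"2^{grep_branches} has {decimal_digits(paths)} decimal digits")
```

Even for a 15K-LOC program the bound is a number with over a thousand digits, which is why exhaustive exploration is ruled out.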

Page 4: My Research Area

● Motivation: path explosion is tackled by components that experts design manually.
– Search heuristic
– Search space reduction
– Seed input, constraint solver, …

Page 5: My Research Area

● Motivation: path explosion is tackled by components that experts design manually.
– Search heuristic
– Search space reduction
– Seed input, constraint solver, …

● Data-driven concolic testing: a machine generates these components automatically.
– Search heuristic (ICSE’18)
– Search space reduction (ASE’18)

Page 6: Search Heuristic

● Selects branches that are likely to maximize code coverage.
● Each heuristic has its own criteria for picking a branch.
– DFS, BFS, Random, Generational, CFDS, CGS, ParaDySE, ...

[Figure: 1st execution with branches b1, b2, b3 and the query Solve(b1∧b2∧¬b3); 2nd execution with branches b1, b2, b3, b4, b5 and the query Solve(b1∧¬b2); ...]
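The branch-selection step in the figure can be sketched as follows. This is a minimal illustration, not CREST's implementation: branches are plain strings, no real constraint solver is called, and `negate_branch`, `to_query`, `deepest_branch` and `second_branch` are hypothetical names introduced here.

```python
from typing import Callable, List


def negate_branch(path: List[str], index: int) -> List[str]:
    """Keep the branches before `index`, negate the one at `index`, drop the rest."""
    return path[:index] + ["¬" + path[index]]


def to_query(constraints: List[str]) -> str:
    """Render the constraint that would be handed to a solver."""
    return "Solve(" + "∧".join(constraints) + ")"


def deepest_branch(path: List[str]) -> int:
    """A DFS-like choice: pick the last branch on the path."""
    return len(path) - 1


def second_branch(path: List[str]) -> int:
    """A stand-in for some other heuristic: pick the branch at index 1."""
    return 1


def next_query(path: List[str], heuristic: Callable[[List[str]], int]) -> str:
    """Let the heuristic pick a branch, then build the query that negates it."""
    return to_query(negate_branch(path, heuristic(path)))


if __name__ == "__main__":
    print(next_query(["b1", "b2", "b3"], deepest_branch))
    print(next_query(["b1", "b2", "b3", "b4", "b5"], second_branch))
```

The only thing that distinguishes one search heuristic from another in this sketch is the function that returns the index of the branch to negate.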

Page 7: Motivation

● Using search heuristics alone is not sufficient.
– Code coverage converges in practical settings, even with 7h and 10 cores in parallel.

Page 8: Goal: Draw the red graph line

● Improving branch coverage in practical settings.
– Branch coverage ↑ → Bug-finding ↑

Page 9: Key Observation

● Conventional concolic testing tracks all input values as symbolic.

Input : α1 α2 α3 α4 α5 α6 α7 α8

[Figure: execution tree explored by the search heuristic]

Page 10: Key Observation

● Conventional concolic testing tracks all input values as symbolic.

Input : α1 α2 α3 α4 α5 α6 α7 α8

[Figure: execution trees explored by the search heuristic]

Reducing the search space of concolic testing !

Page 11: Key Ideas

● Template-guided concolic testing.
– Template: a partially symbolized input.
  ex) ‘-’ ‘g’ α3 α4 ‘1’ ‘d’ α8
● Selectively treat input values as symbolic.
● Replace unselected input values with concrete inputs.

● Online learning.
– Automatically generating, using, and refining templates.

Page 12: Effectiveness

● Achieves greater branch coverage in practical settings.

● Finds real bugs in the latest versions of C programs.
– grep-3.1, sed-4.4, gawk-4.2.1

Page 13: Template-Guided Concolic Testing

● Template
– A set of concrete values and their positions.
  (ex) Template1: {(3, ‘-’), (4, ‘S’), (6, ‘A’), (8, ‘L’)}

● Challenge: finding effective templates !
– Choosing input values to track symbolically.
– Replacing the remaining inputs with appropriate concrete values.

Conventional concolic testing:    α1 α2 α3 α4 α5 α6 α7 α8
Template-guided concolic testing: α1 α2 ‘-’ ‘S’ α5 ‘A’ α7 ‘L’
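The Template1 example can be reproduced with a few lines of code. This is a sketch under simple assumptions: the input is a fixed-length sequence, positions are 1-based as on the slide, and `apply_template` is a name invented for this illustration rather than part of the tool.

```python
from typing import Dict, List


def apply_template(length: int, template: Dict[int, str]) -> List[str]:
    """Fix the templated positions to concrete values; leave the rest symbolic.

    Positions are 1-based. Symbolic inputs are rendered as 'α<position>'.
    """
    return [template.get(pos, f"α{pos}") for pos in range(1, length + 1)]


if __name__ == "__main__":
    template1 = {3: "-", 4: "S", 6: "A", 8: "L"}
    print(" ".join(apply_template(8, template1)))
```

With Template1, only four of the eight inputs remain symbolic, so the search heuristic has far fewer input-dependent branches to choose from.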

Page 14: Template-Guided Concolic Testing with Online Learning

● Goal
– Perform concolic testing while learning useful templates online.

Input: pgm (the program under test)

1. Conventional Concolic Testing
2. Sequential Pattern Mining
3. Pattern Ranking
4. Pattern to Template
5. Template-Guided Concolic Testing

Page 15: 1. Conventional Concolic Testing

● Collect effective test-cases during concolic testing.
– “Effective test-cases” :
– Collecting all test-cases can cause serious performance degradation.

pgm → Conventional Concolic Testing → Effective Test-Cases → Sequential Pattern Mining

Effective Test-Cases
Input1 (α1 α2 α3 α4)    Input2 (α5 α6 α7 α8)
− X * *                 − − P −
− 2 R L                 − − s y
2 X ? #                 − − c l
...                     ...
− Y − 5                 3 − s h

Page 16: 2. Sequential Pattern Mining

● Extract common patterns from the effective test-cases.
– Called “sequential pattern mining” in the data mining community.
– Uses a recent algorithm: CloFAST (1).

ex) 14,604 effective test-cases → 6,176 patterns (in 5 min)

(1). Fabio Fumarola, Pasqua Fabiana Lanotte, Michelangelo Ceci, and Donato Malerba. 2016. CloFAST: closed sequential pattern mining using sparse and vertical id-lists. Knowledge and Information Systems.

Effective Test-Cases → Sequential Pattern Mining → Candidate Patterns → Pattern Ranking

Candidate Patterns: − − −, − − s, − X − −, − X −
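To make the mining step concrete, here is a deliberately naive frequent-subsequence miner. It is not CloFAST: it enumerates every subsequence up to a small length and keeps those that occur in enough test-cases, which only works for tiny inputs. The function name, the `min_support` and `max_len` parameters, and the sample test-cases (written with ASCII `-`) are assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def mine_frequent_subsequences(
    test_cases: List[str], min_support: int, max_len: int = 3
) -> Dict[Tuple[str, ...], int]:
    """Return subsequences (order-preserving, gaps allowed) that occur in at
    least `min_support` test-cases, mapped to their support count."""
    support: Counter = Counter()
    for tc in test_cases:
        seen = set()  # count each pattern at most once per test-case
        for n in range(1, max_len + 1):
            for idx in combinations(range(len(tc)), n):
                seen.add(tuple(tc[i] for i in idx))
        support.update(seen)
    return {p: c for p, c in support.items() if c >= min_support}


if __name__ == "__main__":
    cases = ["-X**", "-XRL", "2X?#", "-Y-5"]
    for pattern, count in mine_frequent_subsequences(cases, min_support=2, max_len=2).items():
        print(" ".join(pattern), "support =", count)
```

A dedicated algorithm such as CloFAST is needed in practice because the number of subsequences grows combinatorially with input length; the slide's figure of 14,604 test-cases mined in 5 minutes is out of reach for this brute-force version.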

Page 17: 3. Pattern Ranking

● Choose the top-k patterns from the candidates.
● The idea for ranking:
– Reflect the experience of previously evaluated patterns.

Inputs to Pattern Ranking:
1. Candidate Patterns: − − −, − − s, − X − −, − X −
2. Good, Bad Pattern sets
   Good P: P1 : − X X −
   Bad P:  P2 : − s −

Output: the top-k patterns
Top 1 : − X − −
Top 2 : − X −
Top 3 : − − −
Top 4 : − − s

Candidate Patterns → Pattern Ranking → the top-k patterns → Pattern to Template
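One way to "reflect the experience of previously evaluated patterns" is to score each candidate by how much it resembles known good patterns versus known bad ones. The scoring rule below is an assumption made for illustration (longest-common-subsequence similarity, best match to good minus best match to bad); the slides do not give the actual formula, and all names are hypothetical. Patterns are written with ASCII `-`.

```python
from typing import List


def lcs_len(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """LCS length normalized by the longer pattern (assumed measure)."""
    return lcs_len(a, b) / max(len(a), len(b))


def score(pattern: str, good: List[str], bad: List[str]) -> float:
    """Closeness to the best-matching good pattern minus closeness to the
    best-matching bad pattern."""
    best_good = max((similarity(pattern, g) for g in good), default=0.0)
    best_bad = max((similarity(pattern, b) for b in bad), default=0.0)
    return best_good - best_bad


def rank(candidates: List[str], good: List[str], bad: List[str], k: int) -> List[str]:
    """Return the k highest-scoring candidates (ties keep input order)."""
    ordered = sorted(candidates, key=lambda p: score(p, good, bad), reverse=True)
    return ordered[:k]


if __name__ == "__main__":
    print(rank(["---", "--s", "-X--", "-X-"], good=["-XX-"], bad=["-s-"], k=2))
```

With the slide's knowledge (good: − X X −, bad: − s −), this particular rule prefers the two candidates containing X, but any scoring function with the same inputs could be substituted.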

Page 18: 4. Pattern to Template

● Transform the top-k patterns into templates.
(1). Collect the test-cases containing the pattern.
(2). Identify the positions where each value appears most frequently.

Effective Test-Cases
Input1     Input2
− X * *    − − P −
− 2 R X    − − s y
− X ? #    − c − l

The top-k patterns
Top 1 : − X − −
Top 2 : − X −
Top 3 : − − −
Top 4 : − − s

ex) P1: − X − − plus the test-cases containing it gives T1: − X (Input1), − − (Input2)

Templates
T1 : − X | − −
T2 : − X −

The top-k patterns → Pattern to Template → Templates → Template-Guided Concolic Testing
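Steps (1) and (2) can be sketched as below. Several details are assumptions of this illustration rather than the authors' procedure: each test-case is the concatenation of Input1 and Input2, a pattern is matched as a subsequence using its leftmost embedding, positions are 0-based, and if two pattern values vote for the same position the later one wins. Values use ASCII `-`.

```python
from collections import Counter
from typing import Dict, List, Optional


def leftmost_embedding(pattern: str, test_case: str) -> Optional[List[int]]:
    """Positions where `pattern` first occurs as a subsequence, or None."""
    positions, start = [], 0
    for value in pattern:
        i = test_case.find(value, start)
        if i < 0:
            return None
        positions.append(i)
        start = i + 1
    return positions


def pattern_to_template(pattern: str, test_cases: List[str]) -> Dict[int, str]:
    """Map each pattern value to the position where it appears most often."""
    # (1) Collect the test-cases containing the pattern.
    embeddings = [e for e in (leftmost_embedding(pattern, tc) for tc in test_cases) if e]
    if not embeddings:
        return {}
    # (2) For each pattern value, take its most frequent position.
    template: Dict[int, str] = {}
    for k, value in enumerate(pattern):
        position, _ = Counter(e[k] for e in embeddings).most_common(1)[0]
        template[position] = value
    return template


if __name__ == "__main__":
    cases = ["-X**--P-", "-2RX--sy", "-X?#-c-l"]  # rows of the slide, Input1 + Input2
    print(pattern_to_template("-X--", cases))
```

On the slide's three test-cases, the pattern − X − − is fixed at the first two positions of each input, which agrees with the T1 shown in the figure.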

Page 19: 5. Template-Guided Concolic Testing

● Run concolic testing with templates.
– Evaluate the quality of each template.
– Accumulate the evaluated patterns in the good or bad pattern sets.

● Good P: # of branches covered by T1 > threshold
● Bad P: # of branches covered by T2 ≤ 1

Good P: P1 : − X X −, + P3 : − X − −
Bad P:  P2 : − s −, + P4 : − X −

Templates (T1, T2) → Template-Guided Concolic Testing → Good P, Bad P → Pattern Ranking
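The two rules on this slide amount to a small classification step. The sketch below follows them literally; the function name and the sample threshold are made up, and it assumes the threshold is at least 1 so the two cases cannot overlap. A pattern whose template lands between the two bounds is left unclassified here, which is a choice of this sketch.

```python
from typing import List


def update_knowledge(
    pattern: str, branches_covered: int, threshold: int,
    good: List[str], bad: List[str],
) -> None:
    """Record `pattern` as good or bad based on its template's coverage."""
    if branches_covered > threshold:
        good.append(pattern)   # Good P: coverage above the threshold
    elif branches_covered <= 1:
        bad.append(pattern)    # Bad P: template covered at most one branch
    # Otherwise the pattern is neither clearly good nor clearly bad.


if __name__ == "__main__":
    good, bad = ["-XX-"], ["-s-"]
    update_knowledge("-X--", branches_covered=40, threshold=10, good=good, bad=bad)
    update_knowledge("-X-", branches_covered=1, threshold=10, good=good, bad=bad)
    print("Good P:", good)
    print("Bad P:", bad)
```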

Page 20: Template-Guided Concolic Testing with Online Learning

● Can select more useful patterns based on increased knowledge.

Input: pgm (the program under test)

1. Conventional Concolic Testing
2. Sequential Pattern Mining
3. Pattern Ranking
4. Pattern to Template
5. Template-Guided Concolic Testing

Knowledge
Good P: P1: − X X −, P3: − X − −, + P19: − d u
Bad P:  P2 : − s −, P4: − X −, + P20: − k p
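Putting the five steps together, the loop can be outlined as a function that takes each step as a callable. This is a structural sketch only: every parameter name is hypothetical, the real steps run a concolic tester, and how the actual tool interleaves or budgets the phases is not stated on the slides.

```python
from typing import Callable, List, Optional, Tuple


def online_learning(
    run_conventional: Callable[[], List[str]],           # step 1: effective test-cases
    mine: Callable[[List[str]], List[str]],              # step 2: candidate patterns
    rank: Callable[[List[str], List[str], List[str]], List[str]],  # step 3
    to_template: Callable[[str, List[str]], object],     # step 4
    run_guided: Callable[[object], int],                 # step 5: branches covered
    judge: Callable[[int], Optional[str]],               # "good", "bad", or None
    rounds: int,
    k: int,
) -> Tuple[List[str], List[str]]:
    """Repeat steps 1-5, growing the good/bad knowledge after each round."""
    good: List[str] = []
    bad: List[str] = []
    for _ in range(rounds):
        test_cases = run_conventional()
        candidates = mine(test_cases)
        for pattern in rank(candidates, good, bad)[:k]:
            covered = run_guided(to_template(pattern, test_cases))
            verdict = judge(covered)
            if verdict == "good":
                good.append(pattern)
            elif verdict == "bad":
                bad.append(pattern)
    return good, bad


if __name__ == "__main__":
    # Toy stand-ins so the loop can run without a concolic tester.
    knowledge = online_learning(
        run_conventional=lambda: ["-X**"],
        mine=lambda cases: ["-X", "-s"],
        rank=lambda cands, good, bad: cands,
        to_template=lambda pattern, cases: pattern,
        run_guided=lambda template: 10 if "X" in template else 0,
        judge=lambda n: "good" if n > 5 else ("bad" if n <= 1 else None),
        rounds=1,
        k=2,
    )
    print(knowledge)
```

Because `rank` receives the current good and bad sets on every round, later rounds can pick patterns using more knowledge than earlier ones, which is the point of the slide.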

Page 21: Experiments

● Implemented in CREST.
● Used 5 open-source C programs.

Program       # Total branches   LOC
vim-5.7       35,464             165K
gawk-3.0.3    8,038              30K
grep-2.2      3,836              15K
sed-1.17      2,650              9K
tree-1.6.0    1,440              4K

● Compared with conventional concolic testing.
– CGS, CFDS, Random, Generational, DFS, ParaDySE.

● Applied our technique on the best search heuristics.
– T-CGS, T-CFDS.

Page 22: The Same Evaluation Settings

● The same testing budget.
– vim: 70h
– gawk, grep, sed, tree: 7h

● The same number of cores.
– Using 10 cores in parallel.

● The same initial inputs.

Page 23: Effectiveness

● Accumulated branch coverage (5 red lines)
– T-CGS exclusively covered 833 branches.

Page 24: Effectiveness

● Bug-finding.
– Five bug-triggering inputs for the latest versions of C programs.
– Our technique (all inputs) > conventional (2/5 inputs)

Page 25: Effectiveness

● Bug-finding.
– Five bug-triggering inputs for the latest versions of C programs.
– Our technique (all inputs) > conventional (2/5 inputs)

● Demo: attacking our lab server.
– All the memory of our lab server will be exhausted.

Page 26: Learned Patterns

● Top 5 good and bad patterns for increasing branch coverage.
– Good and bad patterns are hardly distinguishable.

Page 27: Learned Patterns

● Top 5 good and bad patterns for increasing branch coverage.
– Good and bad patterns are hardly distinguishable.

Manually selecting good patterns is highly tricky.

Page 28: Tool

● Make our tool publicly available.
● Name our tool via the template-guided approach.
– Challenge: choosing input values to track symbolically.

C o n c o l i c   T e s t i n g

Page 29: Tool: ConTest

● Make our tool publicly available.
● Name our tool via the template-guided approach.
– Challenge: choosing input values to track symbolically.
– Learned template: {(0, ‘C’), (1, ‘o’), (2, ‘n’), (9, ‘T’), (10, ‘e’), (11, ‘s’), (12, ‘t’)}

C o n T e s t
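The naming joke can be checked mechanically: reading the learned template's fixed characters in position order spells the tool name. The helper below is a hypothetical illustration that treats "Concolic Testing" as the input and uses the 0-based positions from the slide.

```python
from typing import Dict


def fixed_characters(text: str, template: Dict[int, str]) -> str:
    """Verify the template against `text`, then join its values by position."""
    for position, value in template.items():
        if text[position] != value:
            raise ValueError(f"template disagrees with input at position {position}")
    return "".join(template[p] for p in sorted(template))


if __name__ == "__main__":
    learned = {0: "C", 1: "o", 2: "n", 9: "T", 10: "e", 11: "s", 12: "t"}
    print(fixed_characters("Concolic Testing", learned))
```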

Thank You

URL: https://github.com/kupl/ConTest