Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.


Page 1: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

1

Predicting Bugs Using Antipatterns

Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan

Page 2: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

2

Page 3: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

3

Predict Bugs

Model

Code

Antipatterns

Past Defects, History of Churn (Zimmermann, Hassan et al.)

Topic Modeling (Chen et al.)

Page 4: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

4

Page 5: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

5

Antipatterns

Weaknesses in design

Not technically incorrect, and don't prevent a system from functioning

Page 6: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

6

Indicate a deeper problem in the system

Page 7: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

7

Motivation

Antipatterns indicate weaknesses in the design that may increase the risk of bugs in the future (Fowler 1999).

Page 8: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

9

Approach

CVS Repository

Mining Source Code Repositories

Detecting Antipatterns

Mining Bug Repositories

Bugzilla

Calculating Metrics

Analyzing

RQ1

RQ2

RQ3

Page 9: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

10

Mining Source Code Repositories

Studied Systems

Systems   Releases (#)         Churn     LOCs
Eclipse   2.0 - 3.3.1 (12)     148,454   26,209,669
ArgoUML   0.12 - 0.26.2 (9)    21,427    2,025,730

Page 10: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

12

Detecting Antipatterns

13 different antipatterns

DECOR (Moha et al.)

Figure axes: # of Antipatterns, # Files

Systems   # Antipatterns
Eclipse   273,766
ArgoUML   15,100

Page 11: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

14

Research Questions

RQ1: Do antipatterns affect the density of bugs in files?

RQ2: Do the proposed antipattern based metrics provide additional explanatory power over traditional metrics?

RQ3: Can we improve traditional bug prediction models with antipatterns information?

Page 12: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

15

RQ1: Do antipatterns affect the density of bugs in files?

Null Hypothesis

Density of bugs in the files with antipatterns and the other files without antipatterns is the same.

Wilcoxon rank sum test
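
A minimal sketch of how the null hypothesis above could be tested. It assumes bug density is the number of bugs divided by the lines of code of a file (the slide does not define density), uses made-up (bugs, LOC) pairs rather than data from Eclipse or ArgoUML, and implements the rank sum test with the usual normal approximation and average ranks for ties. The function name rank_sum_test is hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, no continuity correction)."""
    # Pool both samples, remembering which group (0 = x, 1 = y) each value came from.
    combined = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0      # average of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t            # tie correction for the variance
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    w = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)   # rank sum of x
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean_w) / math.sqrt(var_w)   # undefined if every value is identical
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

# Hypothetical (bugs, LOC) pairs; density = bugs / LOC is an assumption.
with_antipatterns = [(5, 400), (3, 150), (8, 600), (2, 90), (6, 310)]
without_antipatterns = [(0, 200), (1, 500), (0, 120), (1, 350), (2, 800)]

density_with = [bugs / loc for bugs, loc in with_antipatterns]
density_without = [bugs / loc for bugs, loc in without_antipatterns]

z, p = rank_sum_test(density_with, density_without)
reject_null = p < 0.05
print(f"z = {z:.3f}, p = {p:.4f}, reject null: {reject_null}")
```

With samples this small an exact test would be preferable; the normal approximation is used here only to keep the sketch dependency-free.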

Page 13: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

16

RQ1: Do antipatterns affect the density of bugs in files?

Systems   Releases (#)   DA - DNA > 0   p-value < 0.05
Eclipse   12             8              8
ArgoUML   9              6              6

Figure labels: Density of Bugs in Files with Antipatterns; Density of Bugs in Files without Antipatterns

Page 14: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

17

Research Questions

RQ1: Do antipatterns affect the density of bugs in files?

RQ2: Do the proposed antipattern based metrics provide additional explanatory power over traditional metrics?

RQ3: Can we improve traditional bug prediction models with antipatterns information?

Page 15: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

18

RQ2: Metrics

Average Number of Antipatterns (ANA)

Antipattern Cumulative Pairwise Differences (ACPD)

Antipattern Recurrence Length (ARL)

Antipattern Complexity Metric (ACM)

Page 16: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

19

RQ2: Example

File      1.0   2.0   3.0   4.0   5.0   6.0
a.java    3     4     0     2     1     3
b.java    4     5     1     0     0     3
c.java    0     6     5     4     5     4

ANA(a.java) = 2.16, ARL(a.java) = 18.76, ACPD(a.java) = 0
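
The slides report the metric values for a.java but not the formulas. The sketch below uses definitions that are assumptions, chosen because they reproduce the three reported values from the a.java counts (3 4 0 2 1 3): ANA as the mean count per release, ACPD as the sum of differences between consecutive releases, and ARL as rle * exp((c + b) / n), where n is the number of releases, c the number of releases with at least one antipattern, rle the length of the longest consecutive run of such releases, and b the index of the release where that run ends. Function names and the tie-breaking rule are hypothetical.

```python
import math

def ana(counts):
    """Average Number of Antipatterns: mean count across releases (assumed definition)."""
    return sum(counts) / len(counts)

def acpd(counts):
    """Antipattern Cumulative Pairwise Differences: sum of consecutive differences (assumed)."""
    return sum(later - earlier for earlier, later in zip(counts, counts[1:]))

def arl(counts):
    """Antipattern Recurrence Length: rle * exp((c + b) / n) (assumed definition)."""
    n = len(counts)
    c = sum(1 for k in counts if k > 0)   # releases with at least one antipattern
    best_len, best_end = 0, 0             # longest run and the 1-based index where it ends
    run = 0
    for idx, k in enumerate(counts, start=1):
        run = run + 1 if k > 0 else 0
        if run > 0 and run >= best_len:   # on ties, keep the latest run (assumption)
            best_len, best_end = run, idx
    return best_len * math.exp((c + best_end) / n)

a_java = [3, 4, 0, 2, 1, 3]   # antipattern counts per release, from the slide

print(f"ANA  = {ana(a_java):.4f}")   # the slide shows this truncated to 2.16
print(f"ACPD = {acpd(a_java)}")
print(f"ARL  = {arl(a_java):.2f}")
```

For a.java the longest run covers releases 4.0 to 6.0, so rle = 3, b = 6, c = 5 and n = 6, which gives 3 * exp(11/6), matching the reported 18.76.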

Page 17: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

20

Page 18: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

21

Provide additional explanatory power over traditional metrics

ARL shows the biggest improvement

Page 19: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

22

Research Questions

RQ1: Do antipatterns affect the density of bugs in files?

RQ2: Do the proposed antipattern based metrics provide additional explanatory power over traditional metrics?

RQ3: Can we improve traditional bug prediction models with antipatterns information?

Page 20: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

23

RQ3: Can we improve traditional bug prediction models with antipatterns information?

Intra System Models

Step-wise analysis: 1) Removing Independent Variables 2) Collinearity Analysis

Metric name   Description
LOC           Source lines of code
MLOC          Executable lines of code
PAR           Number of parameters
NOF           Number of attributes
NOM           Number of methods
NOC           Number of children
VG            Cyclomatic complexity
DIT           Depth of inheritance tree
LCOM          Lack of cohesion of methods
NOT           Number of classes
WMC           Number of weighted methods per class
PRE           Number of pre-release bugs
Churn         Number of lines of code added, modified, or deleted
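
The slide names a collinearity analysis step without saying how it is performed. One common way to do it, shown here purely as an assumption, is to compute pairwise Pearson correlations between metrics and drop any metric that correlates too strongly with one already kept. The threshold of 0.8, the function names, and the per-file metric values below are all hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def drop_collinear(metrics, threshold=0.8):
    """Keep metrics in order, skipping any that is highly correlated with a kept one."""
    kept = []
    for name, values in metrics.items():
        if all(abs(pearson(values, metrics[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Made-up values for five files; not taken from Eclipse or ArgoUML.
metrics = {
    "LOC":  [120, 300, 80, 450, 200],
    "MLOC": [90, 240, 60, 380, 150],
    "ARL":  [3.0, 1.2, 5.5, 2.0, 0.0],
}

kept = drop_collinear(metrics)
print(kept)
```

In this toy data MLOC tracks LOC almost perfectly, so only one of the two survives, while ARL is retained because it is only moderately correlated with LOC.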

Page 21: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

24

Figure panels: ArgoUML, Eclipse. x-axis: Churn, PRE, LOC, MLOC, NOT, NOF, NOM, ACM, ACPD, ARL; y-axis: # Versions

ARL remained statistically significant and had a low collinearity with other metrics

Page 22: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

25

RQ3: Can we improve traditional bug prediction models with antipatterns information?

Figure axis: F-measure

ARL can improve cross-system bug prediction on the two studied systems
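
For reference, the F-measure used to compare prediction models is conventionally the harmonic mean of precision and recall. The sketch below assumes the balanced F1 variant (the slide does not say which variant was used) and uses made-up confusion counts, not results from the study.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)

# Hypothetical outcome of a bug prediction model on one release:
# 30 buggy files correctly flagged, 10 clean files wrongly flagged, 20 buggy files missed.
score = f_measure(true_pos=30, false_pos=10, false_neg=20)
print(f"F-measure = {score:.3f}")
```

Here precision is 0.75 and recall is 0.60, so the F-measure is 2/3.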

Page 23: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.
Page 24: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

27

Backup Slides

Page 25: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

28


Page 26: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

29

Anti Singleton, Blob, Class Data Should be Private, Complex Class, Large Class

Lazy Class, LPL (Long Parameter List), Long Method, Message Chain, RPB (Refused Parent Bequest)

Spaghetti Code, SG (Speculative Generality), SwissArmyKnife

Page 27: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

30

Page 28: Ehsan Salamati Taba, Foutse Khomh, Ying Zou, Meiyappan Nagappan, Ahmed E. Hassan 1.

31

RQ1) Do antipatterns affect the density of bugs in files?

Hypothesis: There is no difference between the density of future bugs of the files with antipatterns and the other files without antipatterns.

Wilcoxon rank sum test

Findings: In general, the density of bugs in a file with antipatterns is higher than the density of bugs in a file without antipatterns.