Analyzing Code Evolution to Uncover Relations between Bad Smells

45
Analyzing Code Evolution to Uncover Relations between Bad Smells Angela Lozano, Kim Mens, Jawira Portugal @ang_loza #SANER15 PPAP 2015 More info: [email protected]

Transcript of Analyzing Code Evolution to Uncover Relations between Bad Smells

Analyzing Code Evolution to Uncover Relations between Bad Smells

Angela Lozano, Kim Mens, Jawira Portugal @ang_loza #SANER15

PPAP 2015

More info: [email protected]

Agenda• Bad smells

• Inter-smell relations

• Experiment

• Results

• Conclusions

Bad smells

Bad smells• Bad smell?: indicator that points to parts of the code

that may need to be refactored

• Bad smells analyzed:

• Feature Envy (FE)

• Long Method (LM)

• Type Checking (TC)

• God Class (GC)

Inter-smell relations

Why are relations between smells interesting?

• Easier to detect

• More complex to eliminate

• May indicate “high-level” smells

Inter-smell relations [Pietrzak & Walter, XP 06]

Plain support a —> b a = Data Class b = Feature Envy

Mutual support a <—> b ?

Rejection a -/-> b a = Lazy Class b = Large Class

Common refactoring !a —> !b ∨ !a —> b a = Feature Envy b = Lazy Class

Inclusion a ==> b a = Parallel inheritance b = Shotgun surgery

+Aggregate, +Transitive

Experiment

Data collectedr i r i+

1

Log4j 1.2.1 -1.2.17 16 releases Apr 04 - Feb 13 106 months

JMol 1.0 - 11.0 34 releases May 01 - Mar 07 70 months

JFreeChart 0.5.6 - 1.0.14 45 releases Dec 00- Nov 11 131 months

r N

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

• disappear(BS): number of SCEs (Classes orMethods) where the bad smell was present in theprevious analyzed version of the source code but isno longer present in the currently analyzed version

• co-disappear(BS1, BS2): number of SCEs(Classes or Methods) where both both bad smells BS1and BS2 disappeared from the analyzed entity withrespect to the previous analyzed version of the sourcecode

1) Indirect Metrics: Using on these basic metrics wederive additional metrics:

exist� overlap(BS1, BS2) =co� exist(BS1, BS2)

exist(BS1)

disappear � overlap(BS1, BS2) =co� disappear(BS1, BS2)

disappear(BS1)

Note that these metrics are not symmetric e.g., exist-overlap(BS1, BS2) is not necessarily the same as exist-overlap(BS2, BS1)

2) Metrics Used to Identify Relations between Bad Smells:

Relation between metrics and inter-smell relations

• Plain Support (A ! B)): if

exist-overlap(A, B) > exist-overlap(B, A)

• Mutual Support (A $ B): if

exist-overlap(A, B) ⇡ exist-overlap(B, A)

• Rejection (A 6! B): if

co-exist(A, B) ⇡ 0

• Common Refactoring (!A !!B): if

disappear-overlap(A, B) >disappear-overlap(B, A)

• Inclusion (A ) B) : if

exist-overlap(A, B) ⇡ 1

B. Bad Smells Analyzed

JDeodorant can detect bad smells by trying to identifypossible refactorings. We decided to use JDeodorant’s badsmells [7]–[9] because its detection approach is reliable foranalyses over time1. The bad smells that JDeodorant can detectare Feature Envy (FE), God Class (GC), Long Method (LM),and Type Checking (TC).

C. Case Studies

We analyzed three open source Java applications over sev-eral releases. Log4j, a logging library that is part of the ApacheFoundation. Log4j was analyzed from version 1.2.1 to 1.2.17(i.e., sixteen releases). Jmol is a viewer for chemical structures(including crystals, bio-molecules and DNA sequences) in 3D.Versions 1 until 11.0 (34 releases) were analyzed for Jmol.Finally, JFreeChart, a chart library (bar charts, pie charts, linecharts, etc.). We analyzed forty-five releases of JFreeChart(from version 0.5.6 to 1.0.14).

1Note that the source code entities that are above/below a threshold maychange from one version to the other. Also note that avoiding metrics makethe results more likely to be replicable across different case studies.

III. RESULTS

Figure 1 shows the amount of bad smells accumulatedtrough all versions analyzed. The most common bad smells inall case studies were long method (LM) and god class (GC).The least frequent bad smell is type checking (TC).

Fig. 1: Cumulative number of bad smells found per case study

Table I shows that the number of instances per type of badsmell are similar across case studies, and also that they tendto increase over time.

TABLE I: Number of bad smells per case study in the firstand last versions analyzed

App Version TC FE GC LM

Log4j 1st. 8 16 45 96Last 8 17 53 114

Jmol 1st. 10 16 47 79Last 19 35 130 200

JFreechart 1st. 17 6 21 38Last 13 35 153 199

However when looking at both (Figure 1 and Table I)we can see that the growth of bad smells varies acrossapplications. While Log4j seems to have a controlled growthof their bad smells, Jmol and JFreechart degrade over time.In particular, there seems to be an explosion of long methodsand god classes. The only case of bad smells being reducedin comparison with the first version are type checkings inJFreechart.2

A. Evidence for Inter-smell Relations

Table II shows the median result for the metrics that canidentify relations between all the pairs of bad smells analyzed.

Note that type checkings did not exist or disappear with anyother bad smell. Therefore it seems that having type checkingis a good indicator that the entity does not suffer from otherof the bad smells analyzed. That is, TC 6! FE, TC 6! LM ,and TC 6! GC.

In contrast, feature envy seems to have a very strongrelation with long methods. Results indicate that whenever afeature envy is found, that method is very likely to be a longmethod as well (i.e. FE ) LM ).

Feature envy existence overlaps more (30%, 58%, and60%) with god classes than on the other way around. Thereforewe conclude that the existence of feature envy increases thelikelihood of that class to being a god class i.e., FE ! GC.

2Although there are several versions in which the number of long methodsand god classes were reduced in JFreechart, the final count is higher than theinitial count for these two bad smells.

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

• disappear(BS): number of SCEs (Classes orMethods) where the bad smell was present in theprevious analyzed version of the source code but isno longer present in the currently analyzed version

• co-disappear(BS1, BS2): number of SCEs(Classes or Methods) where both both bad smells BS1and BS2 disappeared from the analyzed entity withrespect to the previous analyzed version of the sourcecode

1) Indirect Metrics: Using on these basic metrics wederive additional metrics:

exist� overlap(BS1, BS2) =co� exist(BS1, BS2)

exist(BS1)

disappear � overlap(BS1, BS2) =co� disappear(BS1, BS2)

disappear(BS1)

Note that these metrics are not symmetric e.g., exist-overlap(BS1, BS2) is not necessarily the same as exist-overlap(BS2, BS1)

2) Metrics Used to Identify Relations between Bad Smells:

Relation between metrics and inter-smell relations

• Plain Support (A ! B)): if

exist-overlap(A, B) > exist-overlap(B, A)

• Mutual Support (A $ B): if

exist-overlap(A, B) ⇡ exist-overlap(B, A)

• Rejection (A 6! B): if

co-exist(A, B) ⇡ 0

• Common Refactoring (!A !!B): if

disappear-overlap(A, B) >disappear-overlap(B, A)

• Inclusion (A ) B) : if

exist-overlap(A, B) ⇡ 1

B. Bad Smells Analyzed

JDeodorant can detect bad smells by trying to identifypossible refactorings. We decided to use JDeodorant’s badsmells [7]–[9] because its detection approach is reliable foranalyses over time1. The bad smells that JDeodorant can detectare Feature Envy (FE), God Class (GC), Long Method (LM),and Type Checking (TC).

C. Case Studies

We analyzed three open source Java applications over sev-eral releases. Log4j, a logging library that is part of the ApacheFoundation. Log4j was analyzed from version 1.2.1 to 1.2.17(i.e., sixteen releases). Jmol is a viewer for chemical structures(including crystals, bio-molecules and DNA sequences) in 3D.Versions 1 until 11.0 (34 releases) were analyzed for Jmol.Finally, JFreeChart, a chart library (bar charts, pie charts, linecharts, etc.). We analyzed forty-five releases of JFreeChart(from version 0.5.6 to 1.0.14).

1Note that the source code entities that are above/below a threshold maychange from one version to the other. Also note that avoiding metrics makethe results more likely to be replicable across different case studies.

III. RESULTS

Figure 1 shows the amount of bad smells accumulatedtrough all versions analyzed. The most common bad smells inall case studies were long method (LM) and god class (GC).The least frequent bad smell is type checking (TC).

Fig. 1: Cumulative number of bad smells found per case study

Table I shows that the number of instances per type of badsmell are similar across case studies, and also that they tendto increase over time.

TABLE I: Number of bad smells per case study in the firstand last versions analyzed

App Version TC FE GC LM

Log4j 1st. 8 16 45 96Last 8 17 53 114

Jmol 1st. 10 16 47 79Last 19 35 130 200

JFreechart 1st. 17 6 21 38Last 13 35 153 199

However when looking at both (Figure 1 and Table I)we can see that the growth of bad smells varies acrossapplications. While Log4j seems to have a controlled growthof their bad smells, Jmol and JFreechart degrade over time.In particular, there seems to be an explosion of long methodsand god classes. The only case of bad smells being reducedin comparison with the first version are type checkings inJFreechart.2

A. Evidence for Inter-smell Relations

Table II shows the median result for the metrics that canidentify relations between all the pairs of bad smells analyzed.

Note that type checkings did not exist or disappear with anyother bad smell. Therefore it seems that having type checkingis a good indicator that the entity does not suffer from otherof the bad smells analyzed. That is, TC 6! FE, TC 6! LM ,and TC 6! GC.

In contrast, feature envy seems to have a very strongrelation with long methods. Results indicate that whenever afeature envy is found, that method is very likely to be a longmethod as well (i.e. FE ) LM ).

Feature envy existence overlaps more (30%, 58%, and60%) with god classes than on the other way around. Thereforewe conclude that the existence of feature envy increases thelikelihood of that class to being a god class i.e., FE ! GC.

2Although there are several versions in which the number of long methodsand god classes were reduced in JFreechart, the final count is higher than theinitial count for these two bad smells.

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Data collected

r i r i+1

LM FE TC GCpackage1 Class1 method1() Xpackage1 Class1 method2(int) X

XXX

package1 Class1 method3(float,int) X Xpackage1 Class1 method4(String) X Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object) X X

LM FE TC GCpackage1 Class1 method1()package1 Class1 method2(int) X

XXpackage1 Class4 method3(float,int) Xpackage1 Class4 method4(String) Xpackage2 Class2 method5() Xpackage2 Class3 method6(Object)

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

• disappear(BS): number of SCEs (Classes orMethods) where the bad smell was present in theprevious analyzed version of the source code but isno longer present in the currently analyzed version

• co-disappear(BS1, BS2): number of SCEs(Classes or Methods) where both both bad smells BS1and BS2 disappeared from the analyzed entity withrespect to the previous analyzed version of the sourcecode

1) Indirect Metrics: Using on these basic metrics wederive additional metrics:

exist� overlap(BS1, BS2) =co� exist(BS1, BS2)

exist(BS1)

disappear � overlap(BS1, BS2) =co� disappear(BS1, BS2)

disappear(BS1)

Note that these metrics are not symmetric e.g., exist-overlap(BS1, BS2) is not necessarily the same as exist-overlap(BS2, BS1)

2) Metrics Used to Identify Relations between Bad Smells:

Relation between metrics and inter-smell relations

• Plain Support (A ! B)): if

exist-overlap(A, B) > exist-overlap(B, A)

• Mutual Support (A $ B): if

exist-overlap(A, B) ⇡ exist-overlap(B, A)

• Rejection (A 6! B): if

co-exist(A, B) ⇡ 0

• Common Refactoring (!A !!B): if

disappear-overlap(A, B) >disappear-overlap(B, A)

• Inclusion (A ) B) : if

exist-overlap(A, B) ⇡ 1

B. Bad Smells Analyzed

JDeodorant can detect bad smells by trying to identifypossible refactorings. We decided to use JDeodorant’s badsmells [7]–[9] because its detection approach is reliable foranalyses over time1. The bad smells that JDeodorant can detectare Feature Envy (FE), God Class (GC), Long Method (LM),and Type Checking (TC).

C. Case Studies

We analyzed three open source Java applications over sev-eral releases. Log4j, a logging library that is part of the ApacheFoundation. Log4j was analyzed from version 1.2.1 to 1.2.17(i.e., sixteen releases). Jmol is a viewer for chemical structures(including crystals, bio-molecules and DNA sequences) in 3D.Versions 1 until 11.0 (34 releases) were analyzed for Jmol.Finally, JFreeChart, a chart library (bar charts, pie charts, linecharts, etc.). We analyzed forty-five releases of JFreeChart(from version 0.5.6 to 1.0.14).

1Note that the source code entities that are above/below a threshold maychange from one version to the other. Also note that avoiding metrics makethe results more likely to be replicable across different case studies.

III. RESULTS

Figure 1 shows the amount of bad smells accumulatedtrough all versions analyzed. The most common bad smells inall case studies were long method (LM) and god class (GC).The least frequent bad smell is type checking (TC).

Fig. 1: Cumulative number of bad smells found per case study

Table I shows that the number of instances per type of badsmell are similar across case studies, and also that they tendto increase over time.

TABLE I: Number of bad smells per case study in the firstand last versions analyzed

App Version TC FE GC LM

Log4j 1st. 8 16 45 96Last 8 17 53 114

Jmol 1st. 10 16 47 79Last 19 35 130 200

JFreechart 1st. 17 6 21 38Last 13 35 153 199

However when looking at both (Figure 1 and Table I)we can see that the growth of bad smells varies acrossapplications. While Log4j seems to have a controlled growthof their bad smells, Jmol and JFreechart degrade over time.In particular, there seems to be an explosion of long methodsand god classes. The only case of bad smells being reducedin comparison with the first version are type checkings inJFreechart.2

A. Evidence for Inter-smell Relations

Table II shows the median result for the metrics that canidentify relations between all the pairs of bad smells analyzed.

Note that type checkings did not exist or disappear with anyother bad smell. Therefore it seems that having type checkingis a good indicator that the entity does not suffer from otherof the bad smells analyzed. That is, TC 6! FE, TC 6! LM ,and TC 6! GC.

In contrast, feature envy seems to have a very strongrelation with long methods. Results indicate that whenever afeature envy is found, that method is very likely to be a longmethod as well (i.e. FE ) LM ).

Feature envy existence overlaps more (30%, 58%, and60%) with god classes than on the other way around. Thereforewe conclude that the existence of feature envy increases thelikelihood of that class to being a god class i.e., FE ! GC.

2Although there are several versions in which the number of long methodsand god classes were reduced in JFreechart, the final count is higher than theinitial count for these two bad smells.

dissapear-overlap (LM,FE) = 1/1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

! (FE,LM) = 1/1

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

GC —> LM

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

co-exist (TC,GC) = 1 (LM,GC) = 1 (FE,GC) = 1 (LM,FE) = 1 (TC,LM) = 0 (TC,FE) = 0

TC -/-> LMTC -/-> FE

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

dissapear-overlap (LM,FE) = 1/1

! (FE,LM) = 1/1

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

GC ==> FE

GC ==> TC

LM ==> FE

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)Rejection a -/-> b co-exist(a, b) ≈ 0Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)Inclusion a ==> b exist-overlap(a, b) ≈ 1

Detecting inter-smell relations

exist (LM) = 3 (FE) = 2 (TC) = 1 (GC) = 1

dissapear (LM) = 1 (FE) = 1 (GC) = 1

co-dissapear (LM,FE) = 1

exist (LM) = 3+2=5 (FE) = 2+1=3 (TC) = 1+1=2 (GC) = 1+0=1

exist-overlap (TC,GC) = 1/2 (LM,GC) = 1/5 (FE,GC) = 1/2 (LM,FE) = 1/1

! (GC,TC) = 1/1 (GC,LM) = 1/1 (GC,FE) = 1/1 (FE,LM) = 1/3

GC ==> FE

GC ==> TC

LM ==> FE

Results

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? LM

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? LM

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

=>

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

-/->

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? GCTABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: FE ??? GCTABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

—>

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: GC ??? LM

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: GC ??? LM

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

<—>

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: GC ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: GC ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

-/->

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: LM ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

Plain support a —> b exist-overlap(a, b) >> exist-overlap(b, a)

Mutual support a <—> b exist-overlap(a, b) ≈ exist-overlap(b, a)

Rejection a -/-> b co-exist(a, b) ≈ 0

Common refactoring !a —> !b disappear-overlap(a, b) >> dissappear-overlap(b, a)

Inclusion a ==> b exist-overlap(a, b) ≈ 1

Results: LM ??? TC

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

TABLE II: Median of the metrics used to identify relations

Bad Smells App exist-overlap(A, B) exist-overlap(B, A) co-exist(A, B) disappear-overlap(A, B) disappear-overlap(B, A)

A = FE, B = LMLog4J 0.94 0. 15 15 1 0Jmol 1 0 27 0 0

JFreechart 0.92 0.1 22 0.14 0

A = FE, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = FE, B = GCLog4J 0.44 0.16 7 0 0Jmol 0.85 0.27 28 0.17 0

JFreechart 0.76 0.17 16 0 0

A = GC, B = LMLog4J 0.75 0.34 33 0 0Jmol 0.83 0.51 82 0.27 0.39

JFreechart 0.75 0.4 73 0 0.06

A = GC, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

A = LM, B = TCLog4J 0 0 0 0 0Jmol 0 0 0 0 0

JFreechart 0 0 0 0 0

This difference between exist-overlaps is lower (40%, 30%,and 35%) for god classes co-existing with long methods, so itis difficult to decide whether god classes support long methods(GC ! LM ) or if there is mutual support (GC $ LM ). Weconcluded that the relation between these two bad smells ismutual support because there is not enough evidence to con-clude a common refactoring (even though some refactoringseliminate both of them).

B. Discussion

This study ignores the analysis of bad smells that cannotbe found by JDeodorant (and their corresponding relations).That is, bad smells for which it could not find an automaticrefactoring, or whose refactoring does not ’improve’ the sourcecode entity analyzed.

Moreover, the fact that these bad smells are located byproposing a refactoring alternative must be taken into accountto understand the results. Therefore, a method with featureenvy should not be interpreted as a method that uses moremethods of other classes than its own, but rather a methodthat should be moved to a specific class. JDeodorant’s longmethods can be divided, while the control and data flow oflong methods in the general case may be tangled so that theamount of long methods analyzed might be just a fractionthan those that have issues. Similarly, the god classes analyzedhere correspond to classes that have sets of methods that aredisjoint among themselves (i.e., a class whose methods canbe separated into cohesive subsets). However, the definitioncommonly used is more vague: classes that represent mul-tiple abstractions or have too many responsibilities. Finally,methods with type checkings are those methods that can beimplemented with a strategy or template pattern. In this case,JDeodorant’s definition is again more precise than using toomany switches or checks against the type of objects beforecalling the appropriate methods.

Providing explanations for the results requires to considerthe specific definitions of these bad smells. It is easy to conjec-ture possible relations between two bad smells. For instance,if a method contains type-checks it is likely to be longerthan the average method, therefore, type-checks support longmethods. Nevertheless, where the specific definitions are takeninto account the results obtained seem evident. For instance,

there is no reason for methods that should be implemented withthe template or the strategy pattern to be longer, require moreservices of classes other than its own, or be located in classesthat handle multiple abstractions/responsibilities (TC 6! FE,TC 6! LM , and TC 6! GC). Whenever a method islocated in the wrong class it is likely to be dealing withseparable data/control flow paths (FE ) LM ). Methods thatare wrongly located are likely to be a part of a disjunct setof methods within their class (FE ! GC). Finally, separablemethods are likely to belong to classes with disjunct sets ofmethods and classes with disjunct sets of methods are likelyto have methods that are handling more than one data/controlflow path (GC $ LM ).

C. Threats to Validity

The following aspects may affect the validity of theseresults:

1) Internal Validity: As mentioned in the discussion sec-tion, the bad smells detected are just a subset of those that adeveloper (or another tool) may have encounter. Therefore, it islikely that with another experiment setup the relations amongbad smells analyzed are found to be stronger. Notice thatJDeodorant, relies on eclipse’s AST which implies that onlythe revisions that compiled were analyzed. However, giventhan the variability among the metrics analyzed was low, weare confident that the revisions missing do not have a signifi-cant impact in the results. This study does not perform originanalysis which means that for each version we count whichentities had each bad smell. In order, to identify which badsmells were removed, a simple name comparison is performedagainst the entities with bad smells from the previous revision.Therefore, if there were source code entities that were renamedor moved and their bad smells remained unchanged, they areidentified twice as ‘deleted’ and ‘added’ bad smells.

2) External Validity: Although the applications analyzedbelong to different domains these results may only holdfor small to medium sized, Java, open source applications.Analyzing the limitations of these results for similar projects,larger projects, in other programming languages, or within anindustrial setting remains as future work.

3) Construct Validity: We think this study has little con-cerns regarding the relation between the evidence gathered and

-/->

Conclusions

Conclusions

• If TC then neither LM, FE or GC. If LM, FE or GC then there is not TC.

• GC indicates LM and FE.

• FE only occurs in LM

LM FE TC GCLM <== <-/- <—>FE <-/- <—TC -/->GC

a —> b Plain supporta <—> b Mutual supporta -/-> b Rejection

!a —> !b Common refactoringa ==> b Inclusion

FE: Feature Envy

LM: Long Method

TC: Type Checking

GC: God Class

Conventions

Read the table

in this direction

Conclusions

• If TC then neither LM, FE or GC. If LM, FE or GC then there is not TC.

• GC indicates LM and FE.

• FE only occurs in LM

LM FE TC GCLM <== <-/- <—>FE <-/- <—TC -/->GC

GC

LM

FE

TC

Detection

a —> b Plain supporta <—> b Mutual supporta -/-> b Rejection

!a —> !b Common refactoringa ==> b Inclusion

FE: Feature Envy

LM: Long Method

TC: Type Checking

GC: God Class

Conventions

Read the table

in this direction

Conclusions

• If TC then neither LM, FE or GC. If LM, FE or GC then there is not TC.

• GC indicates LM and FE.

• FE only occurs in LM

LM FE TC GCLM <== <-/- <—>FE <-/- <—TC -/->GC

GC

LM

FE

TC

Detection FE

LM

GC

TCRemoval

a —> b Plain supporta <—> b Mutual supporta -/-> b Rejection

!a —> !b Common refactoringa ==> b Inclusion

FE: Feature Envy

LM: Long Method

TC: Type Checking

GC: God Class

Conventions

Read the table

in this direction