CSMR12.ppt

CSMR 2012

Salima Hassaine,Yann-Gael

Gueheneuc, SylvieHamel, Giuliano

Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

ADvISE: Architectural Decay InSoftware Evolution

Salima Hassaine, Yann-Gael Gueheneuc, Sylvie Hamel,Giuliano Antoniol

SOCCER Lab. and Ptidej Team – DGIGL, Ecole Polytechnique deMontreal, Quebec, Canada

March 30, 2012

Pattern Trace Identification, Detection, and Enhancement in JavaSOftware Cost-effective Change and Evolution Research Lab

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Context and Motivation

I Software evolution is the on-going enhancement ofexisting software systems, involving both developmentand maintenance.

I As these systems are adapted to changing requirements,they may suffer from signs of aging.

I Their architecture can deviate substantially from thearchitecture originally designed i.e., their architecturetends to decay with time and it becomes lessadaptable to new, emerging requirements.

2 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Architectural Decay

I Architectural Decay is defined as“the deviation ofactual architecture from planned or original design”.[1,2]

[1] L. Hochstein and M. Lindvall, Combating architectural degeneration: asurvey. Information Software Technology, vol.47, pp.643-656, 2005.

[2] B. Williams and J. Carver, Characterizing software architecture changes:

An initial study. Empirical software Engineering and Measurement,

pp.410-419, 2007.3 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Existing approaches

I Software StabilityI Several authors suggested stability or resilience as a

primary criterion for evaluating an architecture.I However, none of these approaches have been used to

study the signs of software aging or architectural decay.

I Architectural DecayI Other authors suggested that decayed architectures

make their systems more prone to defects and that,I in some not-so-rare cases, a system architecture and its

implementation code must be thrown away because it istoo hard to maintain, unless the decay can be stoppedbefore the architecture is completely unworkable

4 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Our goal

I Propose a novel approach ADvISE and a set ofmeasures to evaluate architectural decay.

I Propose a diagram matching technique to identifystructural changes among versions of architectures.

I Propose a set of metrics to identify the signs ofarchitectural decay.

5 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Approach overview

6 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Step 1: Pre-processing

I Reverse-engineer class diagrams from the source code ofobject oriented programs using PADL tool.

I Convert the program model into a string representationusing EPI tool.

C

B

A

D

F

E

G

cr cr cr

co

in

ag as

in

in

in

(b) UML-like model

C D

A

B

E

F G

dm dm

cr

dmdm

cr cr

co

dm

in

ag as

in

in

in

(c) Eulerian model

(d) String representation of the Eulerian model

Figure: The conversion of a class diagram into string.

7 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Step 2: Class Renaming Detection (1/3)

I Structure-based Similarity

Let S(CA) and S(CB) to be the set of methods, attributes,and relationships of class CA (respectively CB). Thestructure-based similarity of CA and CB is given by:

StrS(CA,CB) =2× |S(CA) ∩ S(CB)||S(CA) ∪ S(CB)|

∈ [0, 1]

Our algorithm reports, given CA, the CB with the highestStrS similarity score as the class renamed from CA.

8 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion


I Camel-based Similarity

Let T (CA) and T (CB) to be the set of tokens of CA

(respectively CB) names, using a Camel Case Splitter.

CamelS(CA,CB) =2× |T (CA) ∩ T (CB)||T (CA) ∪ T (CB)|

∈ [0, 1]

I Levenstein-based Similarity

We use the normalized edit distance (ND), given by:

ND(CA,CB) =LEV (CA,CB)

sum(length(CA), length(CB)∈ [0, 1]

where LEV computes the Levenshtein

9 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion


I Combination of Similarities

1. First, we compare the structure-based similarities(StrS) of a candidate renamed class CA to many targetclasses {CB1 , ...,CBn}.

2. Second, we select the set of target classes having thehighest StrS value. Then, we compute their textualsimilarities (ND and CamelS).

3. Third, if we do not find a target class that has thelowest ND and the highest CamelS . Then, for eachtarget class {CB1 , ...,CBn}, we compare ND andCamelS similarities to given thresholds.

4. Finally, if none of the target classes has ND lower thanthe 0.40 threshold and CamelS higher than the 0.50threshold. Then, we can consider that class CA wasdeleted and not renamed.

10 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Step 3: Architectural Diagram Matching

Bit-Vector AlgorithmI We build the characteristic vectors of each token in the

string representation of version 1.I We read the string representation of version 2, triplets

by triplets. For each triplets T in version 2, we verifywhether it exists in version 1, as follows:

11 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Step 4: Architectural Diagram Clustering

12 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Step 5: Architectural Decay Evaluation

I We perform a pairwise matching of subsequent programarchitectures.

I We use the sets of stable triplets as indicators of thearchitectural stability, as follows:

I Stability of the original design: we compute thenumber of triplets that has a match in all the versions.These triplets are considered to be part of a tunnel, thebackbone part of the system.

I Stability of the architecture with an enrichedfunctionality: we compute the number of triplets thathave not changed since their first appearance in a givenversion of a system.

13 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Empirical Study Design

I Goal: to show the applicability and usefulness of ourapproach

I Purpose: to provide an approach for identifying classrenaming and evaluating architectural decay

I Quality focus: the accuracy of the architectural decayevaluation

I Perspective: both researchers, who want to study classrenaming, and practitioners who analyse softwareevolution to estimate the effort required for futuremaintenance tasks.

I Context: three open source systems: JFreeChart, Rhinoand Xerces-J.

14 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Research Questions (1/2)

RQ1: What are signs of architectural decay and how canthey be tracked down?

I We investigate whether the numbers of stable tripletsare good indicators to measure the architectural decay,and if they provide useful insights to developersregarding the signs of software aging.

15 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Research Questions (2/2)

RQ2: Do stable and unstable micro-architectures have thesame risk to be fault prone?This question leads to the following null hypothesis:

I H0: The proportions of faults carried by stable andunstable micro-architectures are the same.

16 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Analysis Methods (1/2)

RQ1:

I We perform a pair by pair matching of subsequentprogram architectures to identify stable triplets inJFreeChart and Xerces-J

I We study the graph of architectures evolution for eachsystem to assess whether, these indicators provide ususeful insights regarding the signs of software aging.

17 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Analysis Methods (2/2)

RQ2:

I To attempt rejecting H0, we test whether theproportion of classes that compose unstable(respectively stable) micro-architectures take part (ornot) in significantly more faults than those in stable(respectively unstable) micro-architectures.

I We use the contingency tables to assess the direction ofthe difference, if any.

I We use Fisher’s exact test, to check whether thedifference is significative.

18 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Study Results (1/3)


(a) The evolution of JFreeChart architecture

19 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Study Results (2/3)


(b) The evolution of XercesJ architecture

20 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Study Results (3/3)

RQ2: Do stable and unstable micro-architectures have thesame risk to be fault prone?

Faulty classes Clean classes

Unstable classes 105 14Stable classes 39 17

Fisher’s test 0.005Odd-ratio 3.244

Table: Contingency table and Fisher test results for unstableclasses with at least one fault.

I We can answer to RQ2 as follows: we showed thatstable micro-architectures, belonging to the originaldesign, are significantly less bug-prone than unstablemicro-architectures.

21 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Conclusion (1/2)

I Architectural decay is defined as the deviation from anoriginal design, i.e., the violation of architecture causedby the process of evolution.

I When evolution occurs in an uncontrolled manner, thesystems become more complex over time and thus,harder to maintain.

I Thus, decayed architectures make their systems moreprone to defects We proposed an approach to evaluatethe architectural decay.

22 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Conclusion (2/2)

I We proposed a set of structure-based and text-basedsimilarities to identify class renamings in evolvingarchitectures.

I We proposed a bit-vector and incremental clusteringalgorithms to perform the matching between severalversions of an architecture and find stablemicro-architectures, which exist in all versions.

I We proposed the numbers of common triplets betweenseveral versions of an architecture as indicators of decay,and thus, predictors of fault proneness.

I We also perform a quantitative and two qualitativestudies, to show the applicability and usefulness of ourapproach.

I Future work: Apply our approach to other programs toconfirm our observations.

23 / 23

CSMR 2012



Antoniol

Introduction

Approach

Empirical Study

Study Results

Conclusion

Questions?

24 / 23

CSMR12.ppt

Technology

Transcript of CSMR12.ppt