Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces

29 Sept 2011 Challenge the future Delft University of Technology Using Source Code Metrics to Predict Change- Prone Java Interfaces Daniele Romano and Martin Pinzger Williamsburg, ICSM 2011

description

Paper title: Using Source Code Metrics to Predict Change-Prone Java Interfaces
Authors: Daniele Romano and Martin Pinzger
Session: Research Track Session 11: Metrics

Transcript of Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces

Page 1: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces

29 Sept 2011

Challenge the future Delft University of Technology

Using Source Code Metrics to Predict Change-Prone Java Interfaces

Daniele Romano and Martin Pinzger Williamsburg, ICSM 2011

Page 2: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Contributions

•  Correlation between source code metrics and the number of changes in interfaces:

•  C&K metrics

•  complexity and usage metrics

•  interface usage cohesion metric

•  Predictive power of source code metrics for interfaces:

•  prediction models

•  10 open source projects

•  8 Eclipse projects

•  Hibernate 2 and Hibernate 3

Page 3: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Motivations

•  Changes in interfaces are not desirable

•  changes can have stronger impact

•  interfaces define contracts

•  existing object-oriented metrics are not sound for interfaces

•  Related work on metrics as quality predictors

•  no distinction is made among the kinds of classes

Page 4: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Hypotheses

•  H1

•  InterfaceUsageCohesion (IUC) has a stronger correlation with number of Source Code Changes (#SCC) of interfaces than the C&K metrics

•  H2

•  IUC can improve the performance of prediction models to classify Java interfaces into change- and not-change-prone

Page 5: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


The Approach

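[Diagram: the source code repository feeds two steps, metrics computation and changes retrieval. The resulting metrics and changes feed two analyses:]

•  Correlation analysis: Spearman rank correlation (H1)

•  Prediction analysis: train models on metrics and changes, then classify interfaces (H2)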
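The correlation step for H1 can be illustrated with a small, dependency-free sketch of Spearman rank correlation: rank both series (tied values share the mean rank) and take the Pearson correlation of the ranks. The function names and the toy IUC and #SCC values are hypothetical and do not come from the study.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-interface values: a metric (e.g. IUC) and the number of changes.
iuc_values = [0.9, 0.5, 0.7, 0.2]
scc_counts = [1, 10, 4, 25]
rho = spearman(iuc_values, scc_counts)  # strongly negative for this toy data
```

A value near -1 means that interfaces with a higher metric value tend to change less, while a value near +1 means the opposite.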

Page 6: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Metrics Computation

[Diagram: source code repository → Evolizer Model Importer → Famix Model → metrics computation (Understand) → metrics values]

Page 7: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Changes Computation

[Diagram: source code repository → Evolizer Version Control Connector → revisions info and subsequent files → Evolizer Change Distiller (AST comparison) → fine-grained source code changes (SCC)]

Page 8: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Why SCC?

•  Filtering out useless changes due to modification of:

•  licenses

•  comments

•  More precise measurement

#Revision=1 #LineModified=1 #SCC=2
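The filtering argument can be illustrated with a rough sketch that ignores comment and license edits when deciding whether two versions of a file differ. This is not the AST differencing performed by Change Distiller. The helper names are hypothetical, and the regular expression is a simplification that would misread comment markers inside string literals.

```python
import re

# Matches // line comments and /* ... */ block comments (simplified; ignores string literals).
_JAVA_COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)


def normalize(java_source):
    """Drop comments and collapse whitespace so only code tokens remain."""
    return _JAVA_COMMENT.sub(" ", java_source).split()


def is_code_change(before, after):
    """True only if the two versions differ in something other than comments or layout."""
    return normalize(before) != normalize(after)


old = "/* License A */\ninterface Shape {\n    double area(); // in cm\n}\n"
comment_edit = "/* License B */\ninterface Shape {\n    double area(); // in square cm\n}\n"
signature_edit = old.replace("double area()", "float area(int scale)")

print(is_code_change(old, comment_edit))    # comment and license edits are filtered out
print(is_code_change(old, signature_edit))  # a signature edit is kept
```

The last case also hints at why line counts are coarse: a single modified line that changes both a return type and a parameter list could plausibly be counted as more than one fine-grained change.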

Page 9: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


C&K Correlation for Interfaces

Project            CBO      NOC      RFC      DIT      LCOM     WMC
Hibernate3         0.535**  0.029    0.592**  0.058    0.103    0.657**
Hibernate2         0.373**  0.065    0.325**  -0.01    0.006    0.522**
ecl.debug.core     0.484**  0.105    0.486**  0.232*   0.337    0.597**
ecl.debug.ui       0.216*   0.033    0.152    0.324**  0.214*   0.131
ecl.jface          0.239*   0.012    0.174**  0.103    0.320**  0.137
ecl.jdt.debug      0.512**  0.256**  0.349**  -0.049   0.238**  0.489**
ecl.team.core      0.367*   0.102    0.497**  0.243    0.400    0.451**
ecl.team.cvs.core  0.688**  -0.013   0.738**  0.618**  0.610**  0.744**
ecl.team.ui        0.301*   -0.003   0.299*   -0.103*  0.395**  0.299*
update.core        0.499**  -0.007   0.381**  0.146    0.482**  0.729**
Median             0.428    0.031    0.365    0.124    0.328    0.505

*= significant at α=0.05 **= significant at α=0.01

Page 10: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Weighted Methods per Class (WMC)

$\mathrm{WMC} = \sum_{i=1}^{n} c_i$

•  $c_i$: cyclomatic complexity of the i-th method

•  $n$: number of methods in a class

•  For interfaces, WMC reduces to the Number of Methods

Page 11: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Interface Segregation Principle

•  ISP

•  defined by Robert C. Martin

•  copes with fat interfaces

•  Fat interface

•  interfaces that serve different clients

•  each kind of client uses a different set of methods

•  the interface should be split into several interfaces, each one designed to serve a specific client

Page 12: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Interface Segregation Principle (I)

Different clients do not share any methods

ClusterClients(i): counts the number of clients that do not share any method of the interface i
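The slide's one-line definition leaves some room for interpretation. The sketch below takes one possible reading, which is an assumption and not the authors' stated algorithm: clients that share at least one method (directly or through a chain of other clients) fall into the same cluster, and the metric counts the clusters. The function name and the usage map are hypothetical.

```python
def cluster_clients(usage):
    """Count groups of clients such that clients in different groups share no method.

    usage maps a client name to the set of interface methods that client calls.
    """
    clusters = []  # each cluster: {"clients": set of names, "methods": set of methods}
    for client, methods in usage.items():
        methods = set(methods)
        touching = [c for c in clusters if c["methods"] & methods]
        untouched = [c for c in clusters if not (c["methods"] & methods)]
        merged = {"clients": {client}, "methods": methods}
        for cluster in touching:
            # Any cluster sharing a method with this client is merged into it.
            merged["clients"] |= cluster["clients"]
            merged["methods"] |= cluster["methods"]
        clusters = untouched + [merged]
    return len(clusters)


# Hypothetical clients of a stream-like interface.
usage = {
    "Parser": {"open", "close"},
    "Monitor": {"read"},
    "Exporter": {"close", "write"},
}
print(cluster_clients(usage))  # Parser and Exporter share close(); Monitor stands alone
```

Under this reading, a value above 1 suggests that the interface serves disjoint groups of clients, which is the situation the ISP warns about.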

Page 13: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Interface Usage Cohesion

Different clients share a method
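The transcript does not preserve the formula for this metric. The sketch below assumes that IUC is the average, over all clients, of the fraction of the interface's methods each client uses. That definition, the function name, and the example data are assumptions made for illustration.

```python
def interface_usage_cohesion(num_methods, usage):
    """Average fraction of the interface's methods used per client (assumed definition).

    num_methods: number of methods declared by the interface.
    usage: maps a client name to the set of interface methods that client calls.
    """
    if num_methods <= 0 or not usage:
        raise ValueError("need at least one method and one client")
    fractions = [len(set(methods)) / num_methods for methods in usage.values()]
    return sum(fractions) / len(fractions)


# Hypothetical interface with four methods.
cohesive = {"A": {"m1", "m2", "m3", "m4"}, "B": {"m1", "m2", "m3", "m4"}}
fat = {"A": {"m1"}, "B": {"m2"}}

print(interface_usage_cohesion(4, cohesive))  # every client uses every method
print(interface_usage_cohesion(4, fat))       # each client uses a quarter of the methods
```

Under this assumed definition a low value points to a fat interface, which would fit the negative correlations with #SCC reported for IUC in this deck.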

Page 14: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Other metrics for interfaces…

•  Number Of Methods (NOM)

•  Number Of Arguments (NOA)

•  Arguments Per Procedure (APP)

•  Number of Clients (Cli)

•  Number of Invocations (Inv)

•  Number of Implementing Classes (Impl)
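As a rough illustration, the six metrics above can be derived from a simple description of an interface and its usage. The data model below is hypothetical, and the exact counting rules used in the study are not given on the slide. APP is assumed here to be NOA divided by NOM.

```python
def interface_metrics(methods, invocations, implementers):
    """Compute simple size and usage metrics for one interface.

    methods: maps a method name to its list of parameter types.
    invocations: list of (client class, invoked method) pairs.
    implementers: names of the classes implementing the interface.
    """
    nom = len(methods)
    noa = sum(len(params) for params in methods.values())
    return {
        "NOM": nom,
        "NOA": noa,
        "APP": noa / nom if nom else 0.0,  # assumed: arguments averaged over methods
        "Cli": len({client for client, _ in invocations}),
        "Inv": len(invocations),
        "Impl": len(set(implementers)),
    }


# Hypothetical stream-like interface.
metrics = interface_metrics(
    methods={"open": ["String"], "read": ["byte[]", "int", "int"], "close": []},
    invocations=[("Parser", "open"), ("Parser", "read"), ("Parser", "read"), ("Logger", "close")],
    implementers=["FileStream", "SocketStream"],
)
print(metrics)
```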

Page 15: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Correlation for Interfaces

Project            Inv      Cli      NOM      Clust    IUC
Hibernate3         0.544**  0.433**  0.657**  0.302**  -0.601**
Hibernate2         0.165    0.104    0.522**  0.016    -0.373**
ecl.debug.core     0.317**  0.327**  0.597**  0.273**  -0.682**
ecl.debug.ui       0.497**  0.498**  0.131    0.418**  -0.508**
ecl.jface          0.205    0.099    0.137    0.106**  -0.363**
ecl.jdt.debug      0.495**  0.471    0.489**  0.474**  -0.605**
ecl.team.core      0.261    0.278    0.451**  0.328*   -0.475**
ecl.team.cvs.core  0.557**  0.608**  0.744**  0.369    -0.819**
ecl.team.ui        0.290    0.270    0.299    0.056    -0.618**
update.core        0.677**  0.656**  0.729**  0.606**  -0.656**
Median             0.317    0.327    0.505    0.328    -0.605

*= significant at α=0.05 **= significant at α=0.01

Page 16: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Prediction Analysis

•  Three machine learning algorithms:

•  Support Vector Machine

•  Naïve Bayes Network

•  Neural Nets

•  Interfaces classification:

•  Training using 10-fold cross-validation

•  {CBO, RFC, LCOM, WMC} = CK

•  {CBO, RFC, LCOM, WMC, IUC} = IUC
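The evaluation protocol above can be sketched without any machine learning library: shuffle the interfaces, split them into ten folds, and score each interface with a model trained on the other nine folds. The function names, the `fit` callback, and the seed are hypothetical. The actual classifiers from the slide are not reimplemented here.

```python
import random

CK = ["CBO", "RFC", "LCOM", "WMC"]  # feature set named CK on the slide
CK_IUC = CK + ["IUC"]               # feature set named IUC on the slide


def select_features(table, names):
    """Turn a list of {metric: value} rows into plain feature vectors."""
    return [[row[name] for name in names] for row in table]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train indices, test indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]


def cross_val_scores(rows, labels, fit, k=10, seed=0):
    """Score every sample with a model that never saw it during training.

    fit(train_rows, train_labels) must return a function mapping a row to a score.
    """
    scores = [None] * len(rows)
    for train, test in k_fold_splits(len(rows), k, seed):
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        for i in test:
            scores[i] = model(rows[i])
    return scores
```

Running `cross_val_scores` once on `select_features(table, CK)` and once on `select_features(table, CK_IUC)` with the same classifier gives two sets of out-of-fold scores that can then be compared.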

Page 17: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Prediction – AUC values

                   NBayes         LibSVM         NN
Project            CK     IUC     CK     IUC     CK     IUC
ecl.team.cvs.core  0.55   0.75    0.692  0.811   0.8    0.8
ecl.debug.core     0.75   0.79    0.806  0.828   0.85   0.875
ecl.debug.ui       0.66   0.72    0.71   0.742   0.748  0.766
Hibernate2         0.745  0.807   0.735  0.708   0.702  0.747
Hibernate3         0.835  0.862   0.64   0.856   0.874  0.843
ecl.jdt.debug      0.79   0.738   0.741  0.82    0.77   0.762
ecl.jface          0.639  0.734   0.607  0.778   0.553  0.542
ecl.team.core      0.708  0.792   0.617  0.608   0.725  0.85
ecl.team.ui        0.88   0.8     0.74   0.884   0.65   0.75
update.core        0.782  0.811   0.794  0.817   0.675  0.744
Median             0.747  0.791   0.722  0.814   0.736  0.764
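For readers unfamiliar with the measure, AUC can be read as the probability that a randomly chosen change-prone interface receives a higher score than a randomly chosen not-change-prone one, with ties counting as one half. The sketch below computes it directly from that definition. The function name and the toy labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for change-prone, 0 for not-change-prone.
    scores: classifier scores, higher meaning more likely change-prone.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.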

Page 18: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Results

•  H1 ACCEPTED

•  IUC has a stronger correlation with #SCC of interfaces than the C&K metrics

•  IUC shows the best correlation

•  H2 PARTIALLY ACCEPTED

•  IUC can improve the performance of prediction models to classify Java interfaces into change- and not-change-prone

•  Despite the improvements, the Wilcoxon test showed a significant difference only for LibSVM
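The significance check can be sketched as an exact two-sided Wilcoxon signed-rank test on paired AUC values. Two assumptions are made here: the test pairs the CK and IUC results per project, and the second CK/IUC column pair on the AUC slide belongs to LibSVM. The function names are hypothetical, and the exact enumeration is only practical for small samples.

```python
from itertools import product


def _average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W+, W-, exact two-sided p-value) for paired samples."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # zero differences are dropped
    ranks = _average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    observed, total = min(w_plus, w_minus), sum(ranks)
    # Under the null hypothesis every sign pattern is equally likely: enumerate them all.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        plus = sum(r for r, s in zip(ranks, signs) if s)
        if min(plus, total - plus) <= observed:
            extreme += 1
    return w_plus, w_minus, extreme / 2 ** len(ranks)


# Per-project AUC values from the slide (assumed to be the LibSVM columns).
svm_ck = [0.692, 0.806, 0.71, 0.735, 0.64, 0.741, 0.607, 0.617, 0.74, 0.794]
svm_iuc = [0.811, 0.828, 0.742, 0.708, 0.856, 0.82, 0.778, 0.608, 0.884, 0.817]
w_plus, w_minus, p_value = wilcoxon_signed_rank(svm_ck, svm_iuc)
```

On these ten pairs the sketch gives W- = 5 and an exact two-sided p-value of about 0.02, which is in line with the significant difference reported for LibSVM.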

Page 19: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Implications

•  Researchers

•  take into account the nature of the measured entities

•  Quality Engineers

•  enlarge metrics suites

•  Developers and Architects

•  measure ISP violations

Page 20: Metrics - Using Source Code Metrics to Predict Change-Prone Java Interfaces


Future Work

•  Metrics measurement over time

•  Further validation

•  Are the shared methods the problem?

•  Component-based and service-oriented systems
