Software Measurement Pitfalls & Best Practices
@EricBouwers @avandeursen @jstvssr

Transcript of “Software Measurement Pitfalls & Best Practices” (84 pages)
ericbouwers.github.io/slides/2013icse_SoftwareMeasurement_Pitfall...

Page 1

Software Measurement Pitfalls & Best Practices

@EricBouwers @avandeursen @jstvssr

Page 2

Introductions

Page 3

It takes a village …

Tiago Alves, José Pedro Correia, Christiaan Ypma, Miguel Ferreira

Ilja Heitlager

Tobias Kuipers

Bart Luijten

Dennis Bijlsma

Page 4

And what about you?

Page 5

Why talk about software measurement?

‘You can’t control what you can’t measure.’

DeMarco, Tom. Controlling Software Projects: Management, Measurement and Estimation. ISBN 0-13-171711-1.

Page 6

Why talk about software measurement?

‘You can’t improve what you can’t measure.’

Page 7

Software measurement is used for:

•  Cost and effort estimation
•  Productivity measures and models
•  Data collection
•  Reliability models
•  Performance evaluation and models
•  Structural and complexity metrics
•  Capability-maturity assessments
•  Management by metrics
•  Evaluation of methods and tools
•  Quality models and measures

Fenton et al., Software Metrics: A Rigorous and Practical Approach. ISBN 0534954251

Page 8

Which software measures do you use?

Page 9

(Software) Measurement

Page 10

What is measurement?

‘Formally, we define measurement as a mapping from the empirical world to the formal, relational world.’

‘A measure is the number or symbol assigned to an entity by this mapping in order to characterize an attribute.’

Fenton et al., Software Metrics: A Rigorous and Practical Approach. ISBN 0534954251

Page 11

Entity

Attribute

Mapping

Measure

Page 12

Entities

Product:
•  Specifications, Architecture diagrams, Designs, Code, Test Data, …

Process:
•  Constructing specification, Detailed design, Testing, …

Resources:
•  Personnel, Teams, Software, Hardware, Offices, …

Fenton et al., Software Metrics: A Rigorous and Practical Approach. ISBN 0534954251

Page 13

Attributes

Internal: Size, Structuredness, Functionality

External: Usability, Reliability

Page 14

Mapping

Park, Robert E., Software Size Measurement: A Framework for Counting Source Statements, Technical Report CMU/SEI-92-TR-020

Page 15

Representation Condition (Attribute: Size)

          Foo      Bar
LOC       500      100
Files     1        5
Label     Large    Small

Three different measures of the same size attribute.

Page 16

Measure - scale types

Scale type   Allowed operations    Example
Nominal      =, ≠                  A, B, C, D, E
Ordinal      =, ≠, <, >            Small, large
Interval     =, ≠, <, >, +, -      Start date
Ratio        All                   LOC
Absolute     All                   -
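The scale type determines which computations are meaningful. A small Python sketch with invented values: averaging is valid for a ratio-scale measure like LOC, while an ordinal measure such as a risk label only supports ordering by rank.

```python
# Illustrative sketch: which operations each scale type supports.
# All values are made up for demonstration.

loc = [100, 500, 40]                # ratio scale: sums, ratios, averages are valid
average_loc = sum(loc) / len(loc)   # meaningful: ~213.3 LOC

risk = ["low", "high", "moderate"]  # ordinal scale: only ordering is defined
rank = {"low": 0, "moderate": 1, "high": 2}
worst = max(risk, key=rank.get)     # meaningful: comparison by rank

print(average_loc, worst)
# Averaging the ranks (e.g. sum(rank[r] for r in risk) / 3) would impose
# arithmetic that the ordinal scale does not support.
```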

Page 17

Concept summary

Entity (Child)

Attribute (Length)

Measure (cm)

Mapping (feet on ground)

Page 18

Why does this matter?

Page 19

It determines what you want …

Alves. Categories of Source Code in Industrial Systems, ESEM 2011: 335-338

Page 20

It determines who wants it …

System

Component

Module

Unit

Page 21

Aggregation exercise: from Unit to System

Page 22

Unit measurement: T. McCabe, IEEE Transactions on Software Engineering, 1976

•  Academic: number of independent paths per method
•  Intuitive: number of decisions made in a method
•  Reality: the number of if statements (and while, for, ...)

Example: a method with McCabe complexity 4
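The "reality" definition lends itself to a short sketch. This is a rough approximation with an assumed keyword set; real McCabe tools work on the parse tree rather than on raw text.

```python
import re

# Rough sketch of the "reality" definition: complexity = 1 + number of
# decision keywords in the method body. The keyword list is a
# simplification; real analyzers parse the code instead of matching text.
DECISION = re.compile(r"\b(?:if|while|for|case|catch)\b")

def approx_mccabe(method_source: str) -> int:
    return 1 + len(DECISION.findall(method_source))

java_method = """
int classify(int x) {
    if (x < 0) return -1;                 // +1
    for (int i = 0; i < x; i++) {         // +1
        if (i % 2 == 0) continue;         // +1
    }
    while (x > 10) x /= 2;                // +1
    return 0;
}
"""
print(approx_mccabe(java_method))  # 5
```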

Page 23

Available data

For four projects, per unit:
•  Lines of Code
•  McCabe complexity

In which system is unit testing a given method the most challenging?

Page 24

Option 1: Summing

              Crawljax   GOAL    Checkstyle   Springframework
Total McCabe  1814       6560    4611         22937
Total LOC     6972       25312   15994        79474
Ratio         0.260      0.259   0.288        0.288

Page 25

Option 2: Average

                Crawljax   GOAL   Checkstyle   Springframework
Average McCabe  1.87       2.45   2.46         1.99

[Histogram: number of units per cyclomatic-complexity value (1 to 32), counts up to ~1800]

Page 26

Option 3: Quality profile

Cyclomatic complexity   Risk category
1 - 5                   Low
6 - 10                  Moderate
11 - 25                 High
> 25                    Very high

Sum lines of code per category:

Lines of code per risk category
Low    Moderate   High   Very high
70 %   12 %       13 %   5 %

[Stacked bar chart: quality profiles (0% to 100%) for Crawljax, Goal, Checkstyle, Springframework]
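The quality profile above can be computed with a short sketch; the (LOC, McCabe) pairs are invented for illustration.

```python
# Sketch of Option 3: a quality profile. Each unit contributes its LOC
# to the risk category of its McCabe value, following the slide's bands
# (1-5 low, 6-10 moderate, 11-25 high, >25 very high).
BANDS = [(5, "low"), (10, "moderate"), (25, "high"), (float("inf"), "very high")]

def category(mccabe: int) -> str:
    return next(name for upper, name in BANDS if mccabe <= upper)

def quality_profile(units):
    """units: iterable of (loc, mccabe) pairs, one per method."""
    total = sum(loc for loc, _ in units)
    loc_per_cat = {name: 0 for _, name in BANDS}
    for loc, mccabe in units:
        loc_per_cat[category(mccabe)] += loc
    return {name: round(100.0 * loc / total, 1) for name, loc in loc_per_cat.items()}

units = [(10, 2), (40, 3), (12, 8), (20, 14), (8, 30)]  # invented (LOC, McCabe)
print(quality_profile(units))
# {'low': 55.6, 'moderate': 13.3, 'high': 22.2, 'very high': 8.9}
```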

Page 27

Always take into account

Volume

Explanation

Distribution

Page 28

Pitfalls in using measurements: what are they and how to counter them?

Page 29

One-track metric

Page 30

When only looking at size …

Page 31

Combination shows problems on different levels

Equals

HashCode

Page 32

Metrics Galore

Page 33

So what should we measure?

Page 34

GQM

[Diagram: a Goal is refined into Questions; each Question is answered by one or more Metrics]

Basili et al., The Goal Question Metric Approach. In Encyclopedia of Software Engineering, Wiley, 1994.

Page 35

GQM - Example

http://www.cs.umd.edu/users/mvz/handouts/gqm.pdf

Page 36

Treating the metric

Page 37

Metric in a bubble

[Scatter plot: metric values (0.0 to 1.0) across releases 1.0 through 5.1, with periods I to IV marked]

Page 38

Software Quality as a goal

Page 39

Software Quality waves

1970s: Higher-order languages
1980s: Design methods and tools (CASE)
1990s: Process
2000s: Product
2010s: ?

“Focus on the product to improve quality”

“The CMM [..] is not a quality standard”
Bill Curtis, co-author of the Capability Maturity Model (CMM), the People CMM, and the Business Process Maturity Model.
In “CMMI: Good process doesn't always lead to good quality”, interview for TechTarget, June 2008. http://tinyurl.com/process-vs-quality

Page 40

The Bermuda Triangle of Software Quality

Process (organizational)

Project (individual)

People (individual)

Product

CMMI (Scampi)

Prince2 J2EE (IBM)

MCP (Microsoft)

RUP (IBM)

ISO 25010 ISO 14598

Page 41

ISO 25010

Page 42

Software product quality standards

Previously: separate standards
Currently: the ISO 25000 series, a coherent family of standards

Page 43

ISO/IEC 25010 Quality perspectives

Quality perspective   Concerns                         Phase
Product quality       the software product             development (build and test)
Quality in use        effect of the software product   after deploy

Page 44

ISO/IEC 25010 Software product quality characteristics

•  Functional Suitability
•  Performance Efficiency
•  Compatibility
•  Usability
•  Reliability
•  Security
•  Maintainability
•  Portability

Page 45

The sub-characteristics of maintainability in ISO 25010

Sub-characteristics: Modularity, Reusability, Analysability, Modifiability, Testability

Page 46

A Practical Model for Measuring Maintainability

Page 47

Some requirements

•  Simple to understand
•  Allow root-cause analyses
•  Technology independent
•  Easy to compute

Heitlager et al., A Practical Model for Measuring Maintainability, QUATIC 2007

Page 48

Suggestions for metrics?

Page 49

Measuring ISO 25010 maintainability using the SIG model

Properties: Volume, Duplication, Unit size, Unit complexity, Unit interfacing, Module coupling, Component balance, Component independence

Analysability:  Volume, Duplication, Unit size, Component balance
Modifiability:  Duplication, Unit complexity, Module coupling
Testability:    Unit size, Unit complexity, Component independence
Modularity:     Module coupling, Component balance, Component independence
Reusability:    Unit size, Unit interfacing

Page 50

From measurement to rating: a benchmark-based approach

Page 51

Embedding: a benchmark-based approach

Measurements: 0.14, 0.96, 0.09, 0.84, …
Sorted: 0.09, 0.14, …, 0.84, 0.96

Rating bands (note: example thresholds):
 5%  HHHHH
30%  HHHHI
30%  HHHII
30%  HHIII
 5%  HIIII

Example: a value of 0.34 falls in the HHIII band.
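A minimal sketch of this embedding step, assuming an invented benchmark and that higher values are better (invert the comparison for metrics where lower is better): rank the system's value inside the sorted benchmark and assign stars from the 5% / 30% / 30% / 30% / 5% bands.

```python
# Sketch: benchmark-based star rating with 5/30/30/30/5 bands.
# Benchmark values are invented; assumes higher values are better.
def star_rating(value, benchmark):
    ranked = sorted(benchmark)
    pct_below = sum(v < value for v in ranked) / len(ranked)
    # cumulative bands from the top: best 5% -> 5 stars, next 30% -> 4, ...
    for cutoff, stars in [(0.95, 5), (0.65, 4), (0.35, 3), (0.05, 2)]:
        if pct_below >= cutoff:
            return stars
    return 1

benchmark = [i / 20 for i in range(20)]  # invented benchmark of 20 systems
print(star_rating(0.99, benchmark))  # 5: better than 95% of the benchmark
print(star_rating(0.34, benchmark))  # 3: mid-band
print(star_rating(0.0, benchmark))   # 1: bottom 5%
```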

Page 52

From measurements to ratings

Properties: Volume, Duplication, Unit size, Unit complexity, Unit interfacing, Module coupling, Component balance, Component independence

Analysability:  Volume, Duplication, Unit size, Component balance
Modifiability:  Duplication, Unit complexity, Module coupling
Testability:    Unit size, Unit complexity, Component independence
Modularity:     Module coupling, Component balance, Component independence
Reusability:    Unit size, Unit interfacing

Page 53

Remember the quality profiles?

1.  Compute source code metrics per method / file / module
2.  Summarize distribution of measurement values in “Quality Profiles”

Cyclomatic complexity   Risk category
1 - 5                   Low
6 - 10                  Moderate
11 - 25                 High
> 25                    Very high

Lines of code per risk category (sum of lines of code per category):
Low    Moderate   High   Very high
70 %   12 %       13 %   5 %

Heitlager et al., A Practical Model for Measuring Maintainability, QUATIC 2007

Page 54

‘First level’ calibration

Page 55

The formal six-step process

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010

Page 56

Visualizing the calculated metrics

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010

Page 57

Choosing a weight metric

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010

Page 58

Calculate for a benchmark of systems

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010

Quantiles: 70%, 80%, 90%

Page 59

SIG Maintainability Model: deriving metric thresholds

1.  Measure systems in benchmark
2.  Summarize all measurements
3.  Derive thresholds that bring out the metric’s variability
4.  Round the thresholds

Cyclomatic complexity   Risk category
1 - 5                   Low
6 - 10                  Moderate
11 - 25                 High
> 25                    Very high

(a) Metric distribution (b) Box plot per risk category

Fig. 10: Unit size (method size in LOC)

(a) Metric distribution (b) Box plot per risk category

Fig. 11: Unit interfacing (number of parameters)

computed thresholds. For instance, the existence of unit test code, which contains very little complexity, will result in lower threshold values. On the other hand, the existence of generated code, which normally has very high complexity, will result in higher threshold values. Hence, it is extremely important to know which data is used for calibration. As previously stated, for deriving thresholds we removed both generated code and test code from our analysis.

VIII. THRESHOLDS FOR SIG’S QUALITY MODEL METRICS

Throughout the paper, the McCabe metric was used as a case study. To investigate the applicability of our methodology to other metrics, we repeated the analysis for the SIG quality model metrics. We found that our methodology can be successfully applied to derive thresholds for all these metrics.

Figures 10, 11, 12, and 13 depict the distribution and the box plot per risk category for unit size (method size in LOC), unit interfacing (number of parameters per method), module inward coupling (file fan-in), and module interface size (number of methods per file), respectively.

From the distribution plots, we can observe, as for McCabe, that for all metrics both the highest values and the variability between systems are concentrated in the last quantiles.

Table IV summarizes the quantiles used and the derived thresholds for all the metrics from the SIG quality model. As for the McCabe metric, we derived quality profiles for each metric using our benchmark in order to verify that the thresholds are representative of the chosen quantiles. The

(a) Metric distribution (b) Box plot per risk category

Fig. 12: Module Inward Coupling (file fan-in)

(a) Metric distribution (b) Box plot per risk category

Fig. 13: Module Interface Size (number of methods per file)

results are again similar. Except for the unit interfacing metric, the low risk category is centered around 70% of the code and all others are centered around 10%. For the unit interfacing metric, since the variability is relatively small until the 80% quantile, we decided to use the 80%, 90% and 95% quantiles to derive thresholds. For this metric, the low risk category is around 80%, the moderate risk is near 10% and the other two around 5%. Hence, from the box plots we can observe that the thresholds are indeed recognizing code around the defined quantiles.

IX. CONCLUSION

A. Contributions
We proposed a novel methodology for deriving software metric thresholds and a calibration of previously introduced metrics. Our methodology improves over others by fulfilling three fundamental requirements: i) it respects the statistical properties of the metric, such as metric scale and distribution; ii) it is based on data analysis from a representative set of systems (benchmark); iii) it is repeatable, transparent and straightforward to carry out. These requirements were achieved by aggregating measurements from different systems using relative size weighting. Our methodology was applied to a large set of systems and thresholds were derived by choosing specific percentages of overall code of the benchmark.

B. Discussion
Using a benchmark of 100 object-oriented systems (C# and Java), both proprietary and open-source, we explained

Derive & Round

Alves et al., Deriving Metric Thresholds from Benchmark Data, ICSM 2010
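The derivation steps above can be sketched as a LOC-weighted quantile over pooled benchmark units. The unit data and the exact weighting scheme here are illustrative assumptions, not the paper's implementation.

```python
# Sketch of threshold derivation: sort pooled (metric_value, loc) units
# from the benchmark, weight by LOC, and read the metric values at the
# 70%, 80% and 90% weighted quantiles. Data is invented for illustration.
def weighted_quantile(units, q):
    units = sorted(units)                 # sort by metric value
    total = sum(loc for _, loc in units)
    cum = 0
    for value, loc in units:
        cum += loc
        if cum / total >= q:
            return value
    return units[-1][0]

units = [(1, 50), (2, 120), (3, 80), (5, 60), (8, 40), (12, 30), (25, 15), (40, 5)]
thresholds = [weighted_quantile(units, q) for q in (0.70, 0.80, 0.90)]
print(thresholds)  # [5, 8, 12]: then round into Low/Moderate/High/Very high bands
```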

Page 60

‘Second level’ calibration

Page 61

How to rank quality profiles? Unit complexity profiles for 20 random systems

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

HIIII

HHIII

HHHII

HHHHI

HHHHH

?

Page 62

Ordering by highest-category is not enough!

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

HIIII

HHIII

HHHII

HHHHI

HHHHH

?

Page 63

A better ranking algorithm

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

Order categories

Define thresholds of given systems

Find smallest possible thresholds
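One way to realize such an ordering, sketched with invented profiles: compare cumulative percentages starting from the riskiest category, so the highest category alone does not decide the ranking. This is a simplified stand-in for the algorithm of Alves et al., not a reimplementation.

```python
# Sketch: rank quality profiles (low, moderate, high, very-high percentages)
# by cumulative risk from the riskiest category downwards. Profiles invented.
def risk_key(profile):
    low, moderate, high, very_high = profile
    return (very_high, very_high + high, very_high + high + moderate)

profiles = {
    "A": (70, 12, 13, 5),
    "B": (60, 30, 5, 5),
    "C": (80, 10, 8, 2),
}
best_to_worst = sorted(profiles, key=lambda n: risk_key(profiles[n]))
print(best_to_worst)  # ['C', 'B', 'A']; B beats A despite equal very-high %
```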

Page 64

Which results in a more natural ordering

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

HIIII

HHIII

HHHII

HHHHI

HHHHH

Page 65

Second-level thresholds: unit size example

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

Page 66

SIG Maintainability Model: mapping quality profiles to ratings

1.  Calculate quality profiles for the systems in the benchmark
2.  Sort quality profiles
3.  Select thresholds based on 5% / 30% / 30% / 30% / 5% distribution

[Figure: quality profiles sorted, then thresholds selected to yield the HHHHH / HHHHI / HHHII / HHIII / HIIII bands]

Alves et al., Benchmark-based Aggregation of Metrics to Ratings, IWSM/Mensura 2011

Page 67

SIG measurement model: putting it all together

Measurements → Quality Profiles → Property Ratings → Quality Ratings → Overall Rating

Property Ratings (example): HHIII, HIIII, HHHII, HHHHI, HHHHH, HHHHI
Quality Ratings (example): HHIII, HHIII, HHHII, HHHHI
Overall Rating (example): HHHII

Page 68

Does it measure maintainability?

Page 69

SIG Maintainability Model: empirical validation

•  The Influence of Software Maintainability on Issue Handling. MSc thesis, Delft University of Technology
•  Indicators of Issue Handling Efficiency and their Relation to Software Maintainability. MSc thesis, University of Amsterdam

•  Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010

16 projects (2.5 MLOC)

150 versions

50K issues

Internal: maintainability
External: issue handling

Page 70

Empirical validation: the life-cycle of an issue

Page 71

Empirical validation: defect resolution time

Luijten et al., Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010

Page 72

Empirical validation: quantification

Luijten et al., Faster Defect Resolution with Higher Technical Quality of Software, SQM 2010

Page 73

Empirical validation: quantification

Bijlsma. Indicators of Issue Handling Efficiency and their Relation to Software Maintainability, MSc thesis, University of Amsterdam

Page 74

Does it measure maintainability?

Page 75

Is it useful?

Page 76

Software Risk Assessment

Analysis in SIG Laboratory

Kick-off session → Strategy session → Technical session → Technical validation session → Risk validation session → Final presentation → Final Report

Page 77

Example: which system to use?

HHHII HHIII

HHHHI

Page 78

Example: should we accept delay and cost overrun, or cancel the project?

User Interface

Business Layer

Data Layer

User Interface

Business Layer

Data Layer

Vendor framework

Custom

Page 79

Software Monitoring

Source code → Automated analysis → Updated Website → Interpret, Discuss, Manage → Act!

(the cycle repeats every week)

Page 80

Software Product Certification

Page 81

Summary

Page 82

Measurement challenges

Entity

Attribute

Mapping

Measure

[Stacked bar chart: quality profiles for Crawljax, Goal, Checkstyle, Springframework]

Pitfalls: Metric in a bubble, Treating the metric, One-track metric, Metrics galore

[Sketch: meaning versus number of metrics, from too little to too much]

Page 83

A benchmark-based model

Properties: Volume, Duplication, Unit size, Unit complexity, Unit interfacing, Module coupling, Component balance, Component independence

Analysability:  Volume, Duplication, Unit size, Component balance
Modifiability:  Duplication, Unit complexity, Module coupling
Testability:    Unit size, Unit complexity, Component independence
Modularity:     Module coupling, Component balance, Component independence
Reusability:    Unit size, Unit interfacing


HHHHH

HHHHI

HHHII

HHIII

HIIII

Analysis in SIG Laboratory

Kick-off session → Strategy session → Technical session → Technical validation session → Risk validation session → Final presentation → Final Report

Page 84

The most important things to remember

Goal

Entity – Attribute – Mapping

Context

Validate in theory and practice