MTAT.03.243 / Lecture 07 / © Dietmar Pfahl 2015
MTAT.03.243
Software Engineering Management
Lecture 07:
SPI & Measurement –
Part A
Dietmar Pfahl
email: [email protected] Spring 2015
Announcement – ATI Career Day
Friday
Announcement –
Industry Guest Lecture
“Increasing the predictability of software delivery with lean processes” by Marek Laasik (VP Engineering at Fortumo)
Monday 23 March
Project – Short Presentation on March 25
• Purpose:
– to present initial ideas about the improvement project you
intend to plan, and
– to get feedback regarding suitability
• Duration: 3-5 min / max. 3 slides
• Content:
– (1) Context of the proposed improvement project,
– (2) Issues to be addressed/resolved and corresponding
improvement goals,
– (3) Optional: Sketch of the process changes you suggest
to make in order to achieve the improvement goals
Structure of Lecture 7
• Motivation and Definitions (Measure, Measurement)
• Example Measures (Process, Product, Resource)
• Subjective Measurement
Project Planning and Control
All activities aim at matching the actual course of the SW project with its planned course.
Types of Process Models
[Figure 1: Types of processes]
• Processes
  – Engineering Processes
    • Product-Engineering Processes: Development Processes, Maintenance Processes, Project Mgmt Processes, Quality Mgmt Processes, Conf Mgmt Processes, Product Line Processes
    • Process-Engineering Processes: Improvement Processes, Process Modeling Processes, Measurement Processes
  – Non-Engineering Processes
    • Business Processes
    • Social Processes
[Figure 2: Process taxonomy (PROFES)]
• Software Knowledge: Process Models, Product Models, Quality Models, Life Cycle Models, . . .
• Process Models: Engineering Process Models (Technical Process Models, Managerial Process Models, Process Engineering Proc. Models), Business Process Models, Social Process Models, . . .
SPI Planning and Control
All activities aim at matching the actual course of the SPI project (SPIP) with its planned course.
[Figure: SPIP life cycle (SPIP Start, SPIP Planning, SPIP Enactment, SPIP Control, SPIP Steering, SPIP End), guided by a process improvement model and its context]
Measurement in PROFES
Definitions:
Measurement and Measure
Measurement:
• Measurement is the process through which values (e.g., numbers) are assigned to attributes of entities of the real world.
Measure:
• A measure is the result of the measurement process, so it is the assignment of a value to an entity with the goal of characterizing a specified attribute.
Source: Sandro Morasca, “Software Measurement”, in “Handbook of Software Engineering and Knowledge Engineering - Volume 1: Fundamentals” (refereed book), pp. 239 - 276, Knowledge Systems Institute, Skokie, IL, USA, 2001, ISBN: 981-02- 4973-X.
[Figure: measurement as a mapping: entities a–e of the real world (set A) are assigned values 0–4 (set B) by a Size Measure; Entity: Program, Attribute: Size, Scale & Unit: LOC (lines of code)]
Software Measurement Challenge
• Measuring physical properties (attributes):
entity | attribute   | unit* | scale (type) | value | range*
Human  | Height      | cm    | ratio        | 178   | (1, 300)
Human  | Temperature | °C    | interval     | 37    | (30, 45)
• Measuring non-physical properties (attributes):
entity  | attribute       | unit* | scale (type) | value | range*
Human   | Intelligence/IQ | index | ordinal      | 135   | [0, 200]
Program | Modifiability   | ?     | ?            | ?     | ?
• Software properties are usually non-physical:
size, complexity, functionality, reliability, maturity, portability, flexibility, understandability, maintainability, correctness, testability, coupling, coherence, interoperability, …
* ‘unit’ and ‘range’ are sometimes used synonymously with ‘scale’
Measurement – What is meaningful?
Some statements:
1. ‘I am twice as tall as you!’
2. ‘In Madrid it’s twice as hot (on average) as in Tartu during summer!’
3. ‘Software X is more complex than software Y!’
4. ‘Software X is twice as complex as software Y!’
5. ‘On average, our software has complexity 3.45!’
6. ‘On average, our software has ‘high’ complexity!’
• Which statements are meaningful?
• What statistics (e.g., mode, median, mean) and what statistical tests could be applied (e.g., parametric vs. non-parametric)?
Measurement Scale Types
Measurement Scale Types – cont’d
The classification of scales has an important impact on their practical use, in particular on the statistical techniques and indices that can be used. Example: Indicator of central tendency of a distribution of values (“Location”).
Mode = most frequent value of the distribution.
Median = the value such that not more than 50% of the values of the distribution are less than it, and not more than 50% of the values are greater than it.
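Both location indicators can be computed directly; a minimal Python sketch with illustrative ordinal ratings (the sample values are invented, not from the lecture):

```python
import statistics

# Ordinal complexity ratings, coded 1 (very low) .. 5 (very high) -- made-up sample
ratings = [2, 3, 3, 4, 5]

print(statistics.mode(ratings))    # 3 -> usable from nominal scales upward
print(statistics.median(ratings))  # 3 -> usable from ordinal scales upward
```

The arithmetic mean would additionally require at least an interval scale.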
Scale Types and Meaningful Measurement
• Scales are defined through their admissible transformations
• Scales (and their admissible transformations) help us decide
– whether a statement involving measures is meaningful
– what type of statistical analyses we can apply
• Definition of Meaningfulness:
A statement S with measurement values (i.e., measures m1,
…, mn) is meaningful iff its truth or falsity value is invariant
under admissible transformations Tr.
iff: “if and only if”
Tr(S[m1, …, mn]) is true iff S[Tr(m1), …, Tr(mn)] is true
Meaningfulness of Measurement-Based Statements
Definition:
A statement involving measures is
meaningful, if its truth value remains
unchanged under any admissible
transformation of its scale type
Example:
“In Madrid, during summer, it’s on average
twice as hot as in Tartu” (measured on the
Celsius scale: e.g., 40 C vs. 20 C)
-> Meaningful?
Meaningfulness of Measurement-Based Statements
Procedure to check for meaningfulness:
1. Apply the admissible transformation to
measures in a statement S and obtain a
transformed statement S’.
2. If S’ can be shown to be equivalent to S,
then the statement S is meaningful for the
scale associated with the admissible
transformation.
Meaningfulness of Measurement-Based Statements
Example:
“In Madrid, during summer, it’s on average twice as
hot as in Tartu” (measured on the Celsius scale: e.g.,
40 C vs. 20 C) -> Meaningful?
---
Statement: TM = 2*TT
The Celsius scale is of type ’interval’ (m’=a*m + b, a>0)
To check:
TM’ = 2*TT’ (?) under assumption that TM = 2*TT is true
Proof:
(1) TM’ = a*TM+b = a*(2*TT)+b
(2) 2*TT’ = 2*(a*TT+b) = a*(2*TT)+2*b
We see: (1) = (2) only if b = 0
-> easy to construct a counter-example with b ≠ 0
Thus: the statement is not meaningful
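The counter-example can be made concrete with the Celsius-to-Fahrenheit transformation (a = 9/5, b = 32, so b ≠ 0); a small sketch:

```python
# Celsius -> Fahrenheit is an admissible interval-scale transformation with b != 0
def c_to_f(c):
    return 9 / 5 * c + 32

t_madrid, t_tartu = 40.0, 20.0
print(t_madrid == 2 * t_tartu)                  # True:  40 = 2 * 20 (Celsius)
print(c_to_f(t_madrid) == 2 * c_to_f(t_tartu))  # False: 104 != 2 * 68 (Fahrenheit)
```

The truth value changes under an admissible transformation, so the "twice as hot" statement is not meaningful on an interval scale.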
Example Interval Scales: Fahrenheit & Celsius
Meaningfulness – Example 1
• Is statement (1) on the right meaningful, if X is measured on a ratio scale?
(1) (x1 + x2) / 2 > m
(2) (a·x1 + a·x2) / 2 > a·m    (Ratio Scale)
Meaningfulness – Example 1
• Is statement (1) on the right meaningful, if X is measured on a ratio scale?
• Apply any admissible transformation M’=aM (a>0) for ratio scales:
(1) (x1 + x2) / 2 > m
(2) (a·x1 + a·x2) / 2 > a·m    (Ratio Scale)
Meaningfulness – Example 1
• Is statement (1) on the right meaningful, if X is measured on a ratio scale?
• Apply any admissible transformation M’=aM (a>0) for ratio scales:
• By arithmetic manipulation, the transformed statement (2) can always be shown to be equivalent to (1), for any admissible transformation. Therefore, statement (1) is meaningful for a ratio scale.
(1) (x1 + x2) / 2 > m
(2) (a·x1 + a·x2) / 2 > a·m    (Ratio Scale)
Meaningfulness – Example 2
• Is statement (1) on the right meaningful, if X is measured on an interval scale?
(1) (x1 + x2) / 2 > m    (Interval Scale)
Meaningfulness – Example 2
• Is statement (1) on the right meaningful, if X is measured on an interval scale?
• Apply any admissible transformation M’=aM+b (a>0) for interval scales:
• By arithmetic manipulation, the transformed statement (2) can always be shown to be equivalent to (1), for any admissible transformation. Therefore, statement (1) is meaningful for an interval scale.
(1) (x1 + x2) / 2 > m
(2) ((a·x1 + b) + (a·x2 + b)) / 2 > a·m + b    (Interval Scale)
Meaningfulness – Example 3
• Is statement (1) on the right meaningful, if X is measured on an ordinal scale?
• Apply an admissible transformation for ordinal scales, e.g., x’ = x³:
• For any pair of measurements x1 and x2, there always exists an admissible transformation such that statement (2) is false while (1) is true. Therefore, statement (1) is not meaningful for an ordinal scale.
(1) (x1 + x2) / 2 > m
(2) (x1³ + x2³) / 2 > m³    (Ordinal Scale)
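A concrete counter-example for the cube transformation; the values below are chosen here for illustration (not from the slides; note one of them is negative):

```python
def cube(x):
    # x -> x**3 is strictly monotone, hence admissible for ordinal scales
    return x ** 3

x1, x2, m = -3, 1, -1.1  # illustrative values
print((x1 + x2) / 2 > m)                    # True:  -1.0 > -1.1
print((cube(x1) + cube(x2)) / 2 > cube(m))  # False: -13.0 > -1.331 does not hold
```

The truth value flips under an admissible transformation, so comparing means is not meaningful on an ordinal scale.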
Meaningfulness – Geometric Mean
• The geometric mean of a data set [a1, a2, ..., an] is given by (a1 · a2 · ... · an)^(1/n)
• On which scale type is the geometric mean meaningful?
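A minimal sketch of the geometric mean, illustrating why it behaves well on a ratio scale: rescaling all inputs by a factor a rescales the result by the same factor.

```python
import math

def geometric_mean(values):
    # n-th root of the product of n positive values
    return math.prod(values) ** (1 / len(values))

print(geometric_mean([2, 8]))            # 4.0
print(geometric_mean([10 * 2, 10 * 8]))  # 40.0 -- scaled by the same factor 10
```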
Structure of Lecture 7
• Motivation and Definitions (Measure, Measurement)
• Example Measures (Process, Product, Resource)
• Subjective Measurement
Measurable Entities in a SW Process (Model)
An entity can represent any of the following:
• Process/Activity: any activity (or set of
activities) related to software
development and/or maintenance (e.g.,
requirements analysis, design, testing) –
these can be defined at different levels
of granularity
• Product/Artifact: any artifact produced or
changed during software development
and/or maintenance (e.g., source code,
software design documents)
• Resources: people, time, money,
hardware or software needed to perform
the processes
[Figure: process model element: an Activity transforms Product_in into Product_out, using Resource_tool and Resource_role]
Examples of Software Product Attributes
• Size
–Length, Complexity,
Functionality
• Modularity
• Cohesion
• Coupling
• Quality
• Value (Price)
• ...
• Quality (-> ISO 9126)
–Functionality
–Reliability
–Usability
–Efficiency
–Maintainability
–Portability
Product Measure – Ex. 1: ’Code Size’
• Entity: Code module
• Attribute: Size (or better: Length)
• Unit: Net Lines of Code (NLOC)
• Scale Type: Ratio
• Range: (0, ∞)
• Who collects/reports the data? Developer
• When (how often) is the data collected? Once, at end of week
• How is the data collected? Using tool ’CoMeas’
• Who is responsible for data validity? Project Manager
Product Measure – Ex. 2: ’Code Quality 1’
• Entity: Code module (class file)
• Attribute: Quality (or better: Correctness)
• Unit: Defects (Def)
• Scale Type: Ratio
• Range: [0, ∞)
• Who collects/reports the data? Developer
• When (how often) is the data collected? Continuously during unit testing
• How is the data collected? Using defect reporting tool ’TRep’
• Who is responsible for data validity? Project Manager
Product Measure – Ex. 3: ’Code Quality 2’
• Entity: Code module (file)
• Attribute: Quality (or better: Defect Density)
• Unit: Def / NLOC
• Scale Type: Ratio
• Range: [0, ∞)
• Who collects/reports the data? Developer
• When (how often) is the data collected? Continuously during unit testing
• How is the data collected? Using tools ’TRep’ and ’CoMeas’
• Who is responsible for data validity? Project Manager
Common OO Code Measures
Measure                                | Desirable Value
Coupling                               | Lower
Cohesion                               | Higher
Cyclomatic Complexity                  | Lower
Method Hiding Factor                   | Higher
Attribute Hiding Factor                | Higher
Depth of Inheritance Tree              | Low (tradeoff)
Number of Children                     | Low (tradeoff)
Weighted Methods Per Class             | Low (tradeoff)
Number of Classes                      | Higher (with identical functionality)
Lines of Code (net and total; comment) | Lower (with identical functionality)
Churn (new + changed LoC)              | Lower (with identical functionality)
Complexity – McCabe
Measure: Cyclomatic Complexity (CC)
Desirable Value: Lower
Description: Defines the number of independent (simple) paths in a Control Flow Graph (CFG).
Draw the CFG, then calculate CC as follows:
CC = #(edges) – #(nodes) + 2
CC = #(decisions) + 1
Example (CFG in figure): CC = 5 + 1 = 6
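Both formulas can be checked on a small control-flow graph; the sketch below uses a hypothetical mini-CFG (a single if/else), not the graph from the slide:

```python
# CFG as an adjacency list: entry -> if -> (then | else) -> exit
cfg = {
    "entry": ["if"],
    "if": ["then", "else"],  # the only decision node (out-degree > 1)
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

nodes = len(cfg)
edges = sum(len(succ) for succ in cfg.values())
decisions = sum(1 for succ in cfg.values() if len(succ) > 1)

print(edges - nodes + 2)  # 2, via CC = #(edges) - #(nodes) + 2
print(decisions + 1)      # 2, via CC = #(decisions) + 1
```

Both formulas agree, as expected for a single-entry, single-exit graph.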
Direct vs. Indirect (Derived) Measures
• Direct measure: a measure that directly characterizes an empirical
property and does not require the prior measurement of some other
property
• Indirect measure: uses one or more (direct or indirect) measures of one
or more attributes in order to measure, indirectly, another supposedly
related attribute.
– Requires first the measurement of two or more attributes,
– then it combines them using a mathematical model.
speed = distance / time [km/h]
accuracy = ( |actual – estimate| / estimate ) * 100% [Percentage]
Is ’estimate’ a measure?
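The accuracy formula above in use; a minimal sketch with made-up effort numbers:

```python
def estimation_accuracy(actual, estimate):
    # Relative estimation error in percent, per the formula above
    return abs(actual - estimate) / estimate * 100

# Illustrative values: effort was estimated at 100 person-hours, actual was 120
print(estimation_accuracy(actual=120, estimate=100))  # 20.0 (percent)
```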
Indirect Measures
• Examples:
– Defect Density (DD)
– Reliability (Rel)
– Productivity (Prod)
• Scale type of an indirect measure M will generally be the
weakest of the scale types of the direct measures M1, …,
Mn
Indirect Measures
• Examples:
– DD = Quality 1 / Size [Unit: #Def/NLOC]
– Reliability = Quality 1 / Time [#Def/hour]
– Productivity 1 = Size / Time [NLOC/hour]
– Productivity 2 = Size / Effort [NLOC/person-hour]
– ...
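These indirect measures combine direct measures arithmetically; a sketch with invented raw values for one module:

```python
# Direct measures for one code module -- illustrative values, not real data
nloc = 500          # size [NLOC]
defects = 4         # Quality 1 [#Def]
hours = 25          # time [hours]
person_hours = 50   # effort (two developers) [person-hours]

print(defects / nloc)       # 0.008 -> Defect Density [#Def/NLOC]
print(nloc / hours)         # 20.0  -> Productivity 1 [NLOC/hour]
print(nloc / person_hours)  # 10.0  -> Productivity 2 [NLOC/person-hour]
```

Since all four direct measures are on ratio scales, the derived measures are on ratio scales as well.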
Subjective – Objective
Quantitative – Qualitative
                               | Subjective | Objective
Qualitative (nominal, ordinal) |     ?      |     ?
Quantitative (interval, ratio) |     ?      |     ?
Subjective – Objective
Quantitative – Qualitative
• Assume you measure 8 times the same attribute of the
same entity (A: size [LOC] – B: complexity [?])
1. A: 120 – A: 120 – A: 119 ---- B: 4 – B: 4 – B: high
2. A: 124 – A: 120 – A: 121 ---- B: 4 – B: 4 – B: high
3. A: 120 – A: 120 – A: 124 ---- B: 4 – B: 5 – B: very high
4. A: 120 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
5. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 3 – B: medium
6. A: 120 – A: 120 – A: 122 ---- B: 4 – B: 4 – B: high
7. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
8. A: 124 – A: 120 – A: 124 ---- B: 4 – B: 4 – B: high
Six different Measurement Series
Subjective – Objective
Quantitative – Qualitative
• Assume you measure 8 times the same attribute of the
same entity (A: size [LOC] – B: complexity [?])
1. A: 120 – A: 120 – A: 119 ---- B: 4 – B: 4 – B: high
2. A: 124 – A: 120 – A: 121 ---- B: 4 – B: 4 – B: high
3. A: 120 – A: 120 – A: 124 ---- B: 4 – B: 5 – B: very high
4. A: 120 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
5. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 3 – B: medium
6. A: 120 – A: 120 – A: 122 ---- B: 4 – B: 4 – B: high
7. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
8. A: 124 – A: 120 – A: 124 ---- B: 4 – B: 4 – B: high
Guess: Columns 1 to 5 are quantitative. BUT: Columns 4 & 5 might be labels (not numbers).
Subjective – Objective
Quantitative – Qualitative
• Assume you measure 8 times the same attribute of the
same entity (A: size [LOC] – B: complexity [?])
1. A: 120 – A: 120 – A: 119 ---- B: 4 – B: 4 – B: high
2. A: 124 – A: 120 – A: 121 ---- B: 4 – B: 4 – B: high
3. A: 120 – A: 120 – A: 124 ---- B: 4 – B: 5 – B: very high
4. A: 120 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
5. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 3 – B: medium
6. A: 120 – A: 120 – A: 122 ---- B: 4 – B: 4 – B: high
7. A: 124 – A: 120 – A: 120 ---- B: 4 – B: 4 – B: high
8. A: 124 – A: 120 – A: 124 ---- B: 4 – B: 4 – B: high
Guess: Columns 2 and 4 are objective. BUT: What if column 4 had the value ’high’?
Types and Uses of Measures
• Types of Measures
  – Direct vs. Indirect
  – Subjective vs. Objective
    • Has to do with the measurement process (human involvement, reliability)
  – Qualitative vs. Quantitative
    • Has to do with the scale type
• Uses of Measures
  – Assessment vs. Prediction
    • NB: Measurement for prediction requires a prediction model
Measurable Entities in a SW Process (Model)
An entity can represent any of the following:
• Process/Activity: any activity (or set of
activities) related to software
development and/or maintenance (e.g.,
requirements analysis, design, testing) –
these can be defined at different levels of
granularity
• Product/Artifact: any artifact produced or
changed during software development
and/or maintenance (e.g., source code,
software design documents)
• Resources: people, time, money,
hardware or software needed to perform
the processes
[Figure: process model element: an Activity transforms Product_in into Product_out, using Resource_tool and Resource_role]
Examples of Software Process and Resource
Attributes that can be measured
• Process-related:
  – Efficiency: How fast (time, duration), how much effort (effort, cost), how much quantity/quality per time or effort unit (velocity, productivity)?
  – Effectiveness: Do we get the results (quantity/quality) we want? – e.g., test coverage
  – Capability: CMMI level
• Resource-related:
  – People: Skill, knowledge, experience, learning, motivation, personality
  – Organisation: Maturity
  – Method/Technique/Tool: Effectiveness, efficiency, learnability, cost
Process Measure – Ex. 1: ’Acceptance Test Time’
• Entity: Acceptance Test
• Attribute: Time (or ’Duration’)
• Unit: Calendar Day
• Scale Type: Interval or Ratio
• Range: [0, ∞)
• Who collects/reports the data? Customer XYZ
• When (how often) is the data collected? At end of every test day
• How is the data collected? Using reporting template ’RT’
• Who is responsible for data validity? Product Owner
Process Measure – Ex. 2: ’Coding Effort’
• Entity: Coding
• Attribute: Effort
• Unit: Person-hour
• Scale Type: Ratio
• Range: [0, ∞)
• Who collects/reports the data? Developer
• When (how often) is the data collected? At end of every work day
• How is the data collected? Using reporting template ’RE’
• Who is responsible for data validity? Project Manager
‘Time’ versus ‘Effort’
Time:
• Entity: Some Activity (e.g., Test)
• Attribute: Time (or Duration)
• Unit: Year, Month, Week, (Work) Day, Hour, Minute, Second, ...
• Range: [0, ∞)
• Scale type: Ratio
• Characterisation: Direct, Quantitative, Objective/Subjective ???

Effort:
• Entity: Some Activity (e.g., Test)
• Attribute: Effort
• Unit: Person-Year, …, Person-Day, Person-Hour, …
• Range: [0, ∞)
• Scale type: Ratio
• Characterisation: Direct, Quantitative, Objective/Subjective ???
Effort vs. Time Trade-Off
[Figure: three schedules with identical total effort of 4 person-days (pd) but different staffing, e.g., 1 person for 4 days, 4 persons for 1 day, 2 persons for 2 days]
What does it mean when I say:
• ”This task takes 4 days”
• ”This task needs 4 person-days”
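The distinction can be sketched in code: duration depends on staffing while effort stays fixed (assuming perfectly parallelizable work, which real tasks rarely are):

```python
effort_pd = 4  # "this task needs 4 person-days"

for persons in (1, 2, 4):
    # Idealized assumption: the work parallelizes perfectly across persons
    duration_days = effort_pd / persons
    print(f"{persons} person(s) -> {duration_days} day(s)")
# 1 person(s) -> 4.0 day(s); 2 person(s) -> 2.0 day(s); 4 person(s) -> 1.0 day(s)
```

"Takes 4 days" is a statement about duration; "needs 4 person-days" is a statement about effort.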
Agile Measurement:
Sprint Burndown Chart – Example
[Figure: sprint burndown chart, derived from the sprint backlog (task list)]
Agile Measurement: Burn-Down & Burn-Up
Both can be used to calculate (average) team velocity = Story Points (or: Stories) per Team per Sprint
Agile Measurement: Velocity [Story Points / Sprint]
• Solid agile teams have consistent velocity (+/- 20%)
• Fluctuations? -> Look to stabilize team / environment
• Velocity trending up/down? -> Look at technical debt handling (rework) and team dynamics ...
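The consistency rule of thumb can be checked mechanically; a sketch with invented sprint data:

```python
import statistics

# Story points completed per sprint -- made-up numbers for illustration
velocities = [21, 19, 23, 20, 22]

avg = statistics.mean(velocities)
consistent = all(abs(v - avg) <= 0.2 * avg for v in velocities)

print(avg)         # 21.0 -> average team velocity [story points / sprint]
print(consistent)  # True -> every sprint within +/- 20% of the average
```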
Resource Measure – Ex. 1: ’Programming Skill’
• Entity: Team Member
• Attribute: Programming Skill
• Unit: Programming Test Score (PTS)
• Scale Type: Ordinal
• Range: {0, 1, 2, 3, 4, 5} – NB: Each number needs explanation!
• Who collects/reports the data? Skill Test Agency
• When (how often) is the data collected? Whenever a test is conducted
• How is the data collected? Programming Test
• Who is responsible for data validity? Certification Agency
Resource Measure – Ex. 2: ’Personality’
• Entity: Team Member
• Attribute: Personality
• Unit: Myers-Briggs Type Indicator (MBTI)
• Scale Type: Nominal
• Range: Set of 16 Types: ISTJ, ISFJ, INFJ, ..., ENTJ
• Who collects/reports the data? Test Agency
• When (how often) is the data collected? Whenever a test is conducted
• How is the data collected? MBTI Instrument
• Who is responsible for data validity? Certification Agency
Structure of Lecture 7
• Motivation and Definitions (Measure, Measurement)
• Example Measures (Process, Product, Resource)
• Subjective Measurement
Objective vs. Subjective Measurement
• Objective Measurement
  – Usually, the measurement process can be automated
  – (Almost) no random measurement error, i.e., the process is perfectly reliable
• Subjective Measurement
  – Human involvement in the measurement process
  – If we repeat the measurement of the same object(s) several times, we might not get exactly the same measured value every time, i.e., the measurement process is not perfectly reliable
Rule of Thumb:
• Subjective measures have proven to be useful – but if an objective measure is available, then it is (usually) preferable
Procedures for Subjective Measurement
• Subjective measures usually entail a well-defined Measurement Procedure that precisely describes:
  – How to collect the data (usually via questionnaires on paper or online)
  – How to conduct interviews
  – How to review documents (software artifacts)
  – In which order to assess the dimensions/items of the data collection instrument, etc.
• Examples: ISO 9000 Audit, CMMI/SPICE Assessment, Function Points
Objective vs. Subjective Measurement
Examples:
• Subjective Measurement
– Classification of defects into severity classes
– Function Points (when counted manually)
– Software Process Assessments
• Objective Measurement
– Lines of Code
– Cyclomatic Complexity
– Memory Size
– Test Coverage
Basic Concepts in Subjective Measurement
• Construct: A conceptual object that cannot be directly observed and therefore cannot be directly measured (i.e., we estimate the quantity we are interested in rather than directly measure it); for example:
  – User Satisfaction
  – Competence of a Software Engineer
  – Efficiency of a Process
  – Maturity of an Organization
• Item: A subjective measurement scale that is used to measure a construct
  – A question on a questionnaire is an item
[Figure: a Construct measured via a Measurement Instrument consisting of items Item1 … Itemn]
Dimensionality of Constructs
• Constructs can be one-dimensional or multi-dimensional
• If a construct is multidimensional, then each dimension covers a
different and distinct aspect of the construct
–e.g., the different dimensions of customer satisfaction
[Figure: a one-dimensional Construct measured by items Item1 … Itemn]
Likert Type Scales
• Evaluation-type
  Example: “Familiarity with and comprehension of the software development environment”
  Response options: Little – Unsatisfactory – Satisfactory – Excellent
• Frequency-type
  Example: “Customers provide information to the project team about the requirements”
  Response options: Never – Rarely – Occasionally – Most of the time
• Agreement-type
  Example: “The tasks supported by the software at the customer site change frequently”
  Response options: Strongly Agree – Agree – Disagree – Strongly Disagree
Semantic Differential Scale
• Items which include semantic opposites
• Example:
“Processing of change requests to existing systems or services:
the time that MIS staff takes until responding to change requests
received from users of existing computer-based information
systems or services.”
Slow □ □ □ □ □ □ □ Fast
Timely □ □ □ □ □ □ □ Untimely
Assigning numbers to scale responses
• Likert-Type Scales:
  Strongly Agree -> 1
  Agree -> 2
  Disagree -> 3
  Strongly Disagree -> 4
  – Ordinal scale
  – But: often the distances between the four response categories are approximately (conceptually) equidistant, so the values are treated like an approximate interval scale.
• Semantic Differential Scale:
  Slow □ □ □ □ □ □ □ Fast
       1 2 3 4 5 6 7
  – Ordinal scale, but again often treated as an interval scale
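The scale-type caveat matters as soon as responses are aggregated; a sketch with invented response data:

```python
import statistics

# Agreement-type Likert responses, coded as above -- made-up sample
codes = {"Strongly Agree": 1, "Agree": 2, "Disagree": 3, "Strongly Disagree": 4}
responses = ["Agree", "Agree", "Disagree", "Strongly Agree", "Agree"]
values = [codes[r] for r in responses]

print(statistics.median(values))  # 2   -- always defensible (ordinal scale)
print(statistics.mean(values))    # 2.0 -- only under the approximate-interval assumption
```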
Reliability versus Validity
Assume you measure several times the same attribute of an entity (say, complexity of a code module) and the centre point is the true (but unknown) value.
http://www.uni.edu/chfasoa/reliabilityandvalidity.htm
Reliability versus Validity
Assume you measure several times the same attribute of an entity (say, complexity of a code module) and the centre point is the true (but unknown) value.
Not reliable: too much random bias (noise).
Not valid: too much systematic bias.
Reliability Estimation Techniques – Classes
• Number of administrations: the number of times that the same object is measured (per observer / rater)
• Number of instruments: the number of different but equivalent instruments that would need to be administered

                        | One Instrument                     | Two Instruments
One Administration      | Inter-Rater; Internal Consistency  | Parallel Forms (immediate)
(per Observer / Rater)  |                                    |
Two Administrations     | Test-Retest                        | Parallel Forms (delayed)
http://www.socialresearchmethods.net/kb/reltypes.php
Inter-Rater Agreement vs.
Internal Consistency
• Example
[Figure: 4 books rated by 2 reviewers (R1, R2) with 1 instrument of 4 items: Readability (little … much), Suspense (bad … good), Length (short … long), Weight (light … heavy), each rated on a 1–5 scale; plus an overall Quality rating]
Inter-Rater Agreement vs.
Internal Consistency
Example – Data:
R1:
Book 1: Q: - R: 2 - S: 3 - L: 3 - W: 3
Book 2: Q: - R: 4 - S: 3 - L: 2 - W: 2
Book 3: Q: - R: 2 - S: 3 - L: 1 - W: 2
Book 4: Q: - R: 4 - S: 5 - L: 4 - W: 3
R2:
Book 1: Q: - R: 3 - S: 3 - L: 3 - W: 3
Book 2: Q: - R: 3 - S: 4 - L: 3 - W: 2
Book 3: Q: - R: 2 - S: 1 - L: 2 - W: 2
Book 4: Q: - R: 4 - S: 4 - L: 3 - W: 3

Inter-rater Agreement (Readability):
R1: R: 2 – 4 – 2 – 4
R2: R: 3 – 3 – 2 – 4
Fleiss’ Kappa = 0.33 (fair agreement)
Quality rating tallies (categories 1-2-3-4): Book 1: 0 1 1 0 | Book 2: 0 0 1 1 | Book 3: 0 0 2 0 | Book 4: 0 0 0 2

Average Inter-Item Correlation:
    R     S     L     W
R   1
S   0.66  1
L   0.51  0.64  1
W   0.29  0.46  0.73  1
Avg = 0.55
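For two raters, a chance-corrected agreement coefficient can be computed directly. The sketch below uses Cohen's two-rater formulation, which reproduces the 0.33 quoted on the slide for the readability ratings (an assumption about how that number was obtained; Fleiss' kappa generalizes this idea to more than two raters):

```python
from collections import Counter

def kappa_two_raters(r1, r2):
    # Chance-corrected agreement: (p_o - p_e) / (1 - p_e)
    n = len(r1)
    p_o = sum(a == b for a, b in zip(r1, r2)) / n  # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    # Agreement expected by chance, from each rater's category frequencies
    p_e = sum((c1[c] / n) * (c2[c] / n) for c in set(r1) | set(r2))
    return (p_o - p_e) / (1 - p_e)

r1 = [2, 4, 2, 4]  # R1's readability ratings of books 1-4
r2 = [3, 3, 2, 4]  # R2's readability ratings of books 1-4
print(round(kappa_two_raters(r1, r2), 2))  # 0.33 -> fair agreement
```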
Next Lecture
• Topic: SPI & Measurement – Part B
• For you to do:
– Have a look at the PROFES Quick Reference and
Manual -> What does it say about measurement?
– Finish and submit Homework 2
• Deadline: March 16, 20:00 (sharp!)
– Prepare your short presentation (March 25)
• Submit slides (max 3) at the latest by March 24 (23:59)