Pragmatic Design Quality Assessment - (Tutorial at ICSE 2008)
Pragmatic Design Quality Assessment
Tudor Gîrba, University of Bern, Switzerland
Michele Lanza, University of Lugano, Switzerland
Radu Marinescu, Politehnica University of Timisoara, Romania
1946
1951
2008
?
Software is complex.
The Standish Group, 2004
53% Challenged
18% Failed
29% Succeeded
How large is your project?
1’000’000 lines of code
* 2 seconds per line = 2’000’000 seconds
/ 3600 ≈ 560 hours
/ 8 hours per day ≈ 70 days
/ 20 working days per month ≈ 3.5 months
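The estimate can be reproduced in a few lines. This is only a sketch: the function name and parameter names are illustrative, and the defaults (2 seconds per line, 8-hour days, 20 working days per month) are the figures used on the slide.

```python
def reading_time(lines_of_code, seconds_per_line=2,
                 hours_per_day=8, days_per_month=20):
    """Return (hours, days, months) needed to read the code once."""
    seconds = lines_of_code * seconds_per_line
    hours = seconds / 3600
    days = hours / hours_per_day
    months = days / days_per_month
    return hours, days, months


hours, days, months = reading_time(1_000_000)
print(f"{hours:.0f} hours, {days:.0f} days, {months:.1f} months")
```

The slide rounds each intermediate result, which is why its figures are slightly coarser than the exact ones.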
But, code is for the computer.
Why would we ever read it?
forward engineering
actual development
What is the current state?
What should we do?
Where to start?
How to proceed?
reverse engineering
forward engineering
actual development
Reverse engineering is analyzing a subject system to:
identify components and their relationships, and
create more abstract representations.
Chikofsky & Cross, 1990
A large system contains lots of details.
How to judge its quality?
http://moose.unibe.ch
http://loose.upt.ro/incode
1 Software in numbers
2 Software in pictures
3 Software in time
4 Software in tools
1 Software in numbers
You cannot control what you cannot measure.
Tom DeMarco
Metrics are functions that assign numbers to
products, processes and resources.
Software metrics are measurements which
relate to software systems, processes or
related documents.
Metrics compress system traits into numbers.
Let’s see some examples...
Examples of size metrics
NOM - number of methods
NOA - number of attributes
LOC - number of lines of code
NOS - number of statements
NOC - number of children
Lorenz, Kidd, 1994
Chidamber, Kemerer, 1994
McCabe, 1977
McCabe cyclomatic complexity (CYCLO) counts the number of independent paths through the code of a function.
interpretation can’t directly lead to improvement action
it reveals the minimum number of tests to write
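As a rough illustration of how CYCLO can be counted, the sketch below walks a Python syntax tree and adds one for every branching construct. The choice of which node types count as decisions is an assumption of this sketch; real metric tools differ in the details.

```python
import ast

# Node types treated as decision points (an assumption of this sketch).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclo(source):
    """Approximate McCabe complexity of the code in `source`."""
    complexity = 1  # straight-line code has exactly one path
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two short-circuit branches
            complexity += len(node.values) - 1
    return complexity


example = '''
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for i in range(n):
        if i == 3:
            return "reached three"
    return "other"
'''
print(cyclo(example))
```

The function above has two `if` statements, one `for` loop and one `and`, so it needs at least five test cases to cover its independent paths.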
Chidamber, Kemerer, 1994
Weighted Method Count (WMC) sums up the complexity of a class’s methods (measured by the metric of your choice; usually CYCLO).
interpretation can’t directly lead to improvement action
it is configurable, thus adaptable to our precise needs
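The configurability can be made explicit by passing the weighting function as a parameter. In this minimal sketch the class, its method names and their CYCLO values are made up for illustration.

```python
def wmc(methods, weight):
    """Weighted Method Count: sum of `weight(name, cyclo)` over all methods."""
    return sum(weight(name, cyclo) for name, cyclo in methods.items())


# Hypothetical class: method name -> cyclomatic complexity
account = {"deposit": 2, "withdraw": 4, "balance": 1}

by_cyclo = wmc(account, lambda name, cyclo: cyclo)  # usual choice
by_count = wmc(account, lambda name, cyclo: 1)      # degenerates to NOM
print(by_cyclo, by_count)
```

With a constant weight of 1 the metric simply counts methods, which shows how much the result depends on the chosen weight.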
Chidamber, Kemerer, 1994
Depth of Inheritance Tree (DIT) is the (maximum) depth level of a class in a class hierarchy.
only the potential and not the real impact is quantified
inheritance is measured
Coupling between objects (CBO) shows the number of classes from which methods or attributes are used.
Chidamber, Kemerer, 1994
no differentiation of types and/or intensity of coupling
it takes into account real dependencies not just declared ones
Tight Class Cohesion (TCC) counts the relative number of method-pairs that access attributes of the class in common.
Bieman, Kang, 1995
TCC = 2 / 10 = 0.2
ratio values allow comparison between systems
interpretation can lead to improvement action
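The value TCC = 2 / 10 corresponds to a class with five methods (ten method pairs) of which two pairs share an attribute. A minimal sketch of the computation follows; the method and attribute names are invented, and returning 0.0 for classes with fewer than two methods is a convention chosen here, not part of the definition.

```python
from itertools import combinations


def tcc(attribute_use):
    """Share of method pairs that access at least one common attribute.

    `attribute_use` maps each method name to the set of attributes it uses.
    """
    pairs = list(combinations(attribute_use, 2))
    if not pairs:
        return 0.0  # convention of this sketch
    connected = sum(1 for a, b in pairs if attribute_use[a] & attribute_use[b])
    return connected / len(pairs)


# Hypothetical class: 5 methods, so 10 pairs; only m1-m2 and m3-m4 are connected.
usage = {
    "m1": {"x"},
    "m2": {"x"},
    "m3": {"y"},
    "m4": {"y"},
    "m5": set(),
}
print(tcc(usage))
```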
...
McCall, 1977
Metrics Assess and Improve Quality!
Really?
Problem 1: metrics granularity
capture symptoms, not causes of problems
in isolation, they don’t lead to improvement solutions
Problem 2: implicit mapping
we don’t reason in terms of metrics, but in terms of design principles
2 big obstacles in using metrics:
Thresholds make metrics hard to interpret
Granularity makes metrics hard to use in isolation
Can metrics help me in what I really care for? :)
forward engineering
actual development
How do I understand code?
How do I improve code?
How do I improve myself?
I want nothing to do with metrics!
How to get an initial understanding of a system?
Metric Value
LOC 35175
NOM 3618
NOC 384
CYCLO 5579
NOP 19
CALLS 15128
FANOUT 8590
AHH 0.12
ANDC 0.31
And now what?
We need means to compare.
hierarchies?
coupling?
The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006
Inheritance: ANDC 0.31, AHH 0.12
Size: NOP 19, NOC 384, NOM 3618, LOC 35175, CYCLO 5579
Size proportions: NOC/NOP 20.21, NOM/NOC 9.42, LOC/NOM 9.72, CYCLO/LOC 0.15
Communication: NOM 3618, CALLS 15128, FANOUT 8590
Communication proportions: CALLS/NOM 4.18, FANOUT/CALLS 0.56
...
Reference values
Metric       Java: LOW   AVG    HIGH     C++: LOW   AVG    HIGH
CYCLO/LOC          0.16  0.20   0.24          0.20  0.25   0.30
LOC/NOM            7     10     13            5     10     16
NOM/NOC            4     7      10            4     9      15
In the Overview Pyramid, each proportion is marked as close to high, close to average, or close to low. Lanza, Marinescu 2006
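The proportions in the pyramid are plain quotients of neighbouring direct metrics, and each can be compared against reference values. The sketch below recomputes them from the direct metrics of the example system and labels three of them using the Java reference values from the slides. Two points are assumptions of this sketch: the quotients are cut (not rounded) to two decimals, because that reproduces the slide's figures, and "close to" is taken to mean the nearest of the three reference values.

```python
import math

# Direct metrics of the example system, as listed on the slide.
m = {"NOP": 19, "NOC": 384, "NOM": 3618, "LOC": 35175,
     "CYCLO": 5579, "CALLS": 15128, "FANOUT": 8590}

# Java reference values (low, average, high) from the slides.
JAVA = {"CYCLO/LOC": (0.16, 0.20, 0.24),
        "LOC/NOM": (7, 10, 13),
        "NOM/NOC": (4, 7, 10)}


def ratio(numerator, denominator):
    """Quotient of two direct metrics, cut to two decimals."""
    return math.floor(100 * m[numerator] / m[denominator]) / 100


def closest(value, low, avg, high):
    """Label of the reference value nearest to `value`."""
    refs = {"low": low, "average": avg, "high": high}
    return min(refs, key=lambda label: abs(value - refs[label]))


for name in JAVA:
    num, den = name.split("/")
    value = ratio(num, den)
    print(name, value, "close to", closest(value, *JAVA[name]))
```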
How do I improve code?
Breaking design principles, rules and best practices
deteriorates the code;
it leads to design problems.
Quality is more than 0 bugs.
Imagine changing just a small design fragment
and 33% of all classes would require changes
Design problems are expensive, frequent, unavoidable
How to detect and eliminate them?
God Classes tend to centralize the intelligence of the system, to do everything and to use data from small data-classes.
Riel, 1996
God Classes centralize the intelligence of the system, do everything, and use data from small data-classes.
God Classes are complex, are not cohesive, access external data.
Compose metrics into queries using logical operators
Detection Strategies are metric-based queries to detect design flaws.
Rule 1: METRIC 1 > Threshold 1
Rule 2: METRIC 2 < Threshold 2
Rule 1 AND Rule 2 → Quality problem
Lanza, Marinescu 2006
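One way to read this schema is as small predicates over a table of metric values, combined with logical operators. The helper names (`above`, `below`, `all_of`, `any_of`), the metric names and the thresholds below are all placeholders invented for this sketch; they are not the API of any tool mentioned in the tutorial.

```python
def above(metric, threshold):
    """Rule that holds when the metric exceeds the threshold."""
    return lambda metrics: metrics[metric] > threshold


def below(metric, threshold):
    """Rule that holds when the metric is under the threshold."""
    return lambda metrics: metrics[metric] < threshold


def all_of(*rules):
    """Logical AND over rules."""
    return lambda metrics: all(rule(metrics) for rule in rules)


def any_of(*rules):
    """Logical OR over rules."""
    return lambda metrics: any(rule(metrics) for rule in rules)


# Placeholder strategy mirroring the two-rule schema on the slide.
quality_problem = all_of(above("METRIC_1", 10), below("METRIC_2", 0.5))

print(quality_problem({"METRIC_1": 12, "METRIC_2": 0.3}))
print(quality_problem({"METRIC_1": 12, "METRIC_2": 0.7}))
```

Keeping each rule as a separate function makes it possible to report which rule fired, which is what links a detected flaw back to a design principle.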
Identity Disharmonies: God Class, Brain Class, Feature Envy, Data Class, Brain Method, Significant Duplication
Collaboration Disharmonies: Intensive Coupling, Extensive Coupling, Shotgun Surgery
Classification Disharmonies: Tradition Breaker, Refused Parent Bequest, Futile Hierarchy
Relations shown between disharmonies: uses, has, is, has (partial), is partially, has (subclass)
Lanza, Marinescu 2006
A God Class centralizes too much intelligence in the system.
ATFD > FEW (class uses directly more than a few attributes of other classes)
AND WMC ≥ VERY HIGH (functional complexity of the class is very high)
AND TCC < ONE THIRD (class cohesion is low)
→ God Class
Lanza, Marinescu 2006
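Written as code, the strategy is a conjunction of three threshold checks. The slide only names the thresholds symbolically, so the numeric values for FEW and VERY HIGH below are assumptions chosen for illustration, as are the two example classes.

```python
from dataclasses import dataclass

FEW = 5              # assumed value; the slide only says "FEW"
WMC_VERY_HIGH = 47   # assumed value; the slide only says "VERY HIGH"
ONE_THIRD = 1 / 3


@dataclass
class ClassMetrics:
    name: str
    atfd: int    # access to foreign data
    wmc: int     # weighted method count
    tcc: float   # tight class cohesion


def is_god_class(c):
    """ATFD > FEW and WMC >= VERY HIGH and TCC < ONE THIRD."""
    return c.atfd > FEW and c.wmc >= WMC_VERY_HIGH and c.tcc < ONE_THIRD


# Hypothetical classes
manager = ClassMetrics("OrderManager", atfd=12, wmc=80, tcc=0.10)
point = ClassMetrics("Point", atfd=0, wmc=6, tcc=0.80)
print(is_god_class(manager), is_god_class(point))
```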
An Envious Method is more interested in data from a handful of classes.
ATFD > FEW (method uses directly more than a few attributes of other classes)
AND LAA < ONE THIRD (method uses far more attributes of other classes than its own)
AND FDP ≤ FEW (the used "foreign" attributes belong to very few other classes)
→ Feature Envy
Lanza, Marinescu 2006
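The three metrics can be derived from the set of attributes a method touches. The sketch below follows the descriptions on the slide: ATFD counts foreign attributes, LAA is the share of accessed attributes that belong to the method's own class, and FDP counts the classes providing the foreign attributes. The value of FEW and the example classes are assumptions for illustration.

```python
FEW = 5            # assumed value; the slide only says "FEW"
ONE_THIRD = 1 / 3


def envy_metrics(own_class, accesses):
    """Return (ATFD, LAA, FDP) for a method.

    `accesses` is a set of (owner_class, attribute) pairs used by the method.
    """
    foreign = {(owner, attr) for owner, attr in accesses if owner != own_class}
    atfd = len(foreign)
    laa = (len(accesses) - atfd) / len(accesses) if accesses else 1.0
    fdp = len({owner for owner, _ in foreign})
    return atfd, laa, fdp


def has_feature_envy(own_class, accesses):
    atfd, laa, fdp = envy_metrics(own_class, accesses)
    return atfd > FEW and laa < ONE_THIRD and fdp <= FEW


# Hypothetical method of Invoice that mostly reads Customer data.
accesses = {("Invoice", "total")}
accesses |= {("Customer", f"field{i}") for i in range(6)}
accesses |= {("Address", "zip")}
print(envy_metrics("Invoice", accesses), has_feature_envy("Invoice", accesses))
```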
Data Classes are dumb data holders.
WOC < ONE THIRD (interface of class reveals data rather than offering services)
AND class reveals many attributes and is not complex:
    NOAP + NOAM > FEW (more than a few public data) AND WMC < HIGH (complexity of class is not high)
    OR
    NOAP + NOAM > MANY (class has many public data) AND WMC < VERY HIGH (complexity of class is not very high)
→ Data Class
Lanza, Marinescu 2006
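This strategy nests an OR inside an AND, which is easy to get wrong when read from the diagram. A minimal sketch of the boolean structure follows; all numeric thresholds are assumptions for illustration, since the slide names them only symbolically.

```python
FEW, MANY = 5, 8                     # assumed values
WMC_HIGH, WMC_VERY_HIGH = 31, 47     # assumed values
ONE_THIRD = 1 / 3


def is_data_class(woc, noap, noam, wmc):
    """WOC < 1/3 AND ((public data > FEW AND WMC < HIGH)
                      OR (public data > MANY AND WMC < VERY HIGH))."""
    public_data = noap + noam  # public attributes + accessor methods
    reveals_data = ((public_data > FEW and wmc < WMC_HIGH)
                    or (public_data > MANY and wmc < WMC_VERY_HIGH))
    return woc < ONE_THIRD and reveals_data


print(is_data_class(woc=0.2, noap=4, noam=4, wmc=10))
print(is_data_class(woc=0.2, noap=3, noam=3, wmc=40))
```

The second branch lets a class with a lot of public data be flagged even when it carries somewhat more logic.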
How do I improve myself?
Follow a clear and repeatable process
Don’t reason about quality in terms of numbers!
QA is part of the Development Process
http://loose.upt.ro/incode
Can we understand the beauty of a painting by measuring its frame or counting its colors?
2 Software in pictures
Software is beautiful
1854, London, cholera epidemic
1812, Napoleon’s Campaign in Russia
Numbers...
[Figure: the Overview Pyramid]
Visualization compresses the system into pictures.
A picture is worth a thousand words... (anonymous)
...depends on the picture (Lanza)
Software visualization is more than UML
We are visual beings ...
... and we’re good at spotting patterns
How many groups do you see?
Gestalt principles: proximity, similarity, enclosure, connectivity
More Gestalt principles: closure, continuity
We do not see with our eyes, but with our brain.
Our brain works like a computer, with 3 types of memory
Iconic memory, the visual sensory register
Short-term memory, the working memory
Long-term memory
Sensation (Physical Process)
Perception (Cognitive Process)
Stimulus Sensory Organ Perceptual Organ
Brain
Iconic Memory - Short-term Memory - Long-term Memory
Iconic memory: < 1 second, very fast, automatic, subconscious, preattentive
Short-term memory: couple of seconds, 3-9 chunks
Categorizing Preattentive Attributes
Category: Attributes
Form: Orientation, Line length, Line width, Size, Shape, Curvature, Added marks, Enclosure
Color: Hue, Intensity
Spatial position: 2D position
Motion: Flicker, Direction
Attributes of Form
Orientation Line Length Line Width Size
Shape Curvature Added Marks Enclosure
Exemplifying Preattentive Processing
8789364082376403128764532984732984732094873290845389274-0329874-32874-23198475098340983409832409832049823-0984903281453209481-0839393947896587436598
70% of all external inputs come through the eyes
Software visualization is the use of the crafts of typography, graphic design, animation, and cinematography with modern human-computer interaction and computer graphics technology to facilitate both the human understanding and effective use of computer software.
Price, Baecker, Small
Static Visualization
Dynamic Visualization
no silver bullet
Software is complex
A picture is worth a thousand words.
UML took it literally
:)
Example: what is ?
Polymetric Views show up to 5 metrics.
Color metric
Width metric
Height metric
Position metrics
Lanza, 2003
A simple & powerful concept
LOC
NOS
lines
parameters
parameters
System Complexity shows class hierarchies.
lines
attributes
methods
Lanza, Ducasse, 2003
Class Blueprint shows class internals.
Initialize Interface Internal Accessor Attribute
invocation and access direction
Lanza, Ducasse, 2005
Class Blueprint has a rich vocabulary.
Regular
Overriding
Extending
Abstract
Constant
Delegating
Setter
Getter
Method
invocations
lines
Attribute
internal access
external access
Access
Invocation
Class Blueprint reveals patterns.
schizophrenic class, twin classes
Distribution Map shows properties over structure. Ducasse et al., 2006
31 parts, 394 elements and 9 properties
Softwarenaut explores the package structure. Lungu et al., 2006
Code City shows where your code lives. Wettel, Lanza, 2007
classes are buildings grouped in quarters of packages
Jmol - The Time Machine
Software is beautiful
3 Software in time
reverse engineering
forward engineering
actual development
A large system contains lots of details.
The history of a large system contains even more details.
Lehman et al., 2001
Most often time is put on the horizontal and a property on the vertical axis.
Spectrographs show change activity. Wu et al., 2004
commit
time
Evolution Matrix shows changes in classes.
Idle class
Pulsar class
Supernova class
White dwarf class
Lanza, Ducasse, 2002
History can be measured.
2 4 3 5 7
2 2 3 4 9
2 2 1 2 3
2 2 2 2 2
1 5 3 4 4
What changed? When did it change? ...
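NOM of class C over five versions: 1 5 3 4 4
Changes between consecutive versions: 4 2 1 0

Evolution of Number of Methods
$\mathrm{ENOM}(C) = \sum_{i=2}^{n} |\mathrm{NOM}_i(C) - \mathrm{NOM}_{i-1}(C)|$
$\mathrm{ENOM}(C) = 4 + 2 + 1 + 0 = 7$

Latest Evolution of Number of Methods
$\mathrm{LENOM}(C) = \sum_{i=2}^{n} |\mathrm{NOM}_i(C) - \mathrm{NOM}_{i-1}(C)| \cdot 2^{i-n}$
$\mathrm{LENOM}(C) = 4 \cdot 2^{-3} + 2 \cdot 2^{-2} + 1 \cdot 2^{-1} + 0 \cdot 2^{0} = 1.5$

Earliest Evolution of Number of Methods
$\mathrm{EENOM}(C) = \sum_{i=2}^{n} |\mathrm{NOM}_i(C) - \mathrm{NOM}_{i-1}(C)| \cdot 2^{2-i}$
$\mathrm{EENOM}(C) = 4 \cdot 2^{0} + 2 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} = 5.25$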
NOM per version    ENOM   LENOM   EENOM
2 4 3 5 7          7      3.5     3.25    balanced changer
2 2 3 4 9          7      5.75    1.37    late changer
2 2 1 2 3          3      1       2
2 2 2 2 2          0      0       0       dead stable
1 5 3 4 4          7      1.5     5.25    early changer
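The three history measurements differ only in how each version-to-version change is weighted. A minimal sketch follows, assuming a history is simply a list of NOM values, one per version; the function names are illustrative.

```python
def changes(history):
    """Absolute change between consecutive versions."""
    return [abs(b - a) for a, b in zip(history, history[1:])]


def enom(history):
    """Evolution: every change counts the same."""
    return sum(changes(history))


def lenom(history):
    """Latest evolution: the most recent change has weight 2**0."""
    n = len(history)
    return sum(d * 2 ** (i - n) for i, d in enumerate(changes(history), start=2))


def eenom(history):
    """Earliest evolution: the first change has weight 2**0."""
    return sum(d * 2 ** (2 - i) for i, d in enumerate(changes(history), start=2))


history = [1, 5, 3, 4, 4]
print(enom(history), lenom(history), eenom(history))  # 7 1.5 5.25
```

A high EENOM combined with a low LENOM, as in this history, is what marks an early changer.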
History can be measured in many ways.
Evolution, Stability, Historical Max, Growth Trend, ...
of
Number of Methods, Number of Lines of Code, Cyclomatic Complexity, Number of Modules, ...
The recently changed parts are likely to change in the near future.
Common wisdom
Are they really?
30% 90%
past, present, future
YesterdayWeatherHit(present):
    past := histories.topLENOM(start, present)
    future := histories.topEENOM(present, end)
    past.intersectWith(future).notEmpty()
prediction hit
Yesterday’s Weather shows the localization of changes in time. Gîrba et al., 2004
hit hit hit
YW = 3 / 8 = 37.5%
hit hit hit hit hit hit hit
YW = 7 / 8 = 87.5%
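The hit test and the resulting percentage can be sketched as follows. Several details are assumptions of this sketch rather than statements from the slides: a history is a list of NOM values per version, "top" means the `k` histories with the highest non-zero value, and every version that has both a past and a future is tried as the present.

```python
def changes(history):
    return [abs(b - a) for a, b in zip(history, history[1:])]


def lenom(history):
    n = len(history)
    return sum(d * 2 ** (i - n) for i, d in enumerate(changes(history), start=2))


def eenom(history):
    return sum(d * 2 ** (2 - i) for i, d in enumerate(changes(history), start=2))


def top(histories, metric, k):
    """Names of the k histories with the highest non-zero metric value."""
    ranked = sorted(histories, key=lambda name: metric(histories[name]), reverse=True)
    return {name for name in ranked[:k] if metric(histories[name]) > 0}


def yesterday_weather_hit(histories, present, k):
    """True if a recent top changer is also an early top changer afterwards."""
    past = top({n: h[:present + 1] for n, h in histories.items()}, lenom, k)
    future = top({n: h[present:] for n, h in histories.items()}, eenom, k)
    return bool(past & future)


def yesterdays_weather(histories, k):
    """Share of presents for which the prediction hits."""
    versions = len(next(iter(histories.values())))
    presents = range(1, versions - 1)  # indices with a past and a future
    hits = sum(yesterday_weather_hit(histories, p, k) for p in presents)
    return hits / len(presents)


# The five NOM histories from the earlier slide, with made-up class names.
histories = {
    "A": [2, 4, 3, 5, 7],
    "B": [2, 2, 3, 4, 9],
    "C": [2, 2, 1, 2, 3],
    "D": [2, 2, 2, 2, 2],
    "E": [1, 5, 3, 4, 4],
}
print(yesterdays_weather(histories, k=1), yesterdays_weather(histories, k=2))
```

The result depends strongly on `k`: a very small top list rarely overlaps, while a large one hits almost trivially.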
A God Class centralizes too much intelligence in the system.
ATFD > FEW (class uses directly more than a few attributes of other classes)
AND WMC ≥ VERY HIGH (functional complexity of the class is very high)
AND TCC < ONE THIRD (class cohesion is low)
→ God Class
Lanza, Marinescu, 2006
But, what if it is stable?
History-based Detection Strategies take evolution into account. Ratiu et al., 2004
isGodClass(last) (God Class in the last version)
AND Stability > 90% (stable throughout the history)
→ Harmless God Class
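A history-based strategy adds a condition over all versions to a check on the last one. The slide does not define Stability, so this sketch assumes it is the share of version transitions in which the measured value (here NOM) did not change; the example histories are made up.

```python
def stability(history):
    """Share of version transitions without change (assumed definition)."""
    transitions = list(zip(history, history[1:]))
    if not transitions:
        return 1.0
    return sum(a == b for a, b in transitions) / len(transitions)


def is_harmless_god_class(god_class_in_last_version, history, threshold=0.9):
    """God Class in the last version AND Stability above the threshold."""
    return god_class_in_last_version and stability(history) > threshold


stable_nom = [40] * 12                # never changed
volatile_nom = [40, 42, 45, 45, 50]   # changed in 3 of 4 transitions
print(is_harmless_god_class(True, stable_nom))
print(is_harmless_god_class(True, volatile_nom))
```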
What happens with inheritance?
ver. 1  ver. 2  ver. 3  ver. 4  ver. 5
A A A A A
B B B B B
C C C
D D D E
A is persistent, B is stable, C was removed, E is newborn ...
Hierarchy Evolution encapsulates time.
A
B
D
C
E
age
changed methods
changed lines
Removed
Gîrba et al., 2005
Hierarchy Evolution reveals patterns. Gîrba et al., 2005
Gall et al., 2003
Co-change analysis recovers hidden dependencies.
Time is the lines.
Evolution Radar shows co-change relationships. D’Ambros, Lanza 2006
one package and its co-change relationships
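At its simplest, co-change analysis counts how often two files appear in the same commit. The sketch below assumes each commit is available as a set of changed file names; the commit data is invented for illustration and no particular versioning system API is implied.

```python
from collections import Counter
from itertools import combinations


def co_changes(commits):
    """Count, for every pair of files, the commits that changed both."""
    counts = Counter()
    for files in commits:
        # sort so that each pair is always counted under the same key
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return counts


# Hypothetical commit log
commits = [
    {"Parser.java", "Lexer.java"},
    {"Parser.java", "Lexer.java", "README"},
    {"Ui.java"},
    {"Parser.java", "Lexer.java"},
]
print(co_changes(commits).most_common(1))
```

Pairs that change together often without any visible structural dependency are the hidden dependencies worth inspecting.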
Software is developed by people.
CVS shows activity.
Who is responsible for this?
Who is responsible for this?
Alphabetical order is no order.
Ownership Map reveals development patterns. Gîrba et al., 2006
JEdit
Ant
(john 23.06.03) public boolean stillValid (ToDoItem I, Designer dsgr) {
(bill 09.01.05) if (!isActive()) {
(bill 09.01.05) return false
(bill 09.01.05) }
(steve 16.02.05) List offs = i.getOffenders();
(john 23.06.03) Object dm = offs.firstElement();
(steve 16.02.05) ListSet newOffs = computeOffenders(dm);
(john 23.06.03) boolean res = offs.equals(newOffs);
(john 23.06.03) return res;

(george 13.02.05) public boolean stillValid (ToDoItem I, Designer dsgr) {
(bill 11.13.05) if (!isActive()) {
(bill 11.13.05) return false
(bill 11.13.05) }
(steve 16.02.05) List offs = i.getOffenders();
(george 13.02.05) Object dm = offs.firstElement();
(steve 16.02.05) ListSet newOffs = computeOffenders(dm);
(george 13.02.05) boolean res = offs.equals(newOffs);
(george 13.02.05) return res;
Who copied from whom?
What is useless?
13.02.05 public boolean stillValid (ToDoItem I, Designer dsgr) {
11.13.05 if (!isActive()) {
11.13.05 return false
11.13.05 }
16.02.05 List offs = i.getOffenders();
13.02.05 Object dm = offs.firstElement();
16.02.05 ListSet newOffs = computeOffenders(dm);
13.02.05 boolean res = offs.equals(newOffs);
13.02.05 return res;

23.06.03 public boolean stillValid (ToDoItem I, Designer dsgr) {
09.01.05 if (!isActive()) {
09.01.05 return false
09.01.05 }
16.02.05 List offs = i.getOffenders();
23.06.03 Object dm = offs.firstElement();
16.02.05 ListSet newOffs = computeOffenders(dm);
23.06.03 boolean res = offs.equals(newOffs);
23.06.03 return res;
When did changes happen?
Clone Evolution shows how developers copy. Balint et al., 2006
4 Software in tools
http://loose.upt.ro/incode
http://www.inf.unisi.ch/phd/wettel/codecity.html
http://moose.unibe.ch
Tudor Gîrba, Michele Lanza, Radu Marinescu
http://creativecommons.org/licenses/by/3.0/