Pragmatic Design Quality Assessment - (Tutorial at ICSE 2008)

Post on 22-Jun-2015

5.386 views 1 download

Tags:

description

This set of slides was used for the tutorial given by Tudor Girba, Michele Lanza and Radu Marinescu at International Conference on Software Engineering (ICSE) 2008.

Transcript of Pragmatic Design Quality Assessment - (Tutorial at ICSE 2008)

1946

1951

1951

1951

1951

1951 2008

1951 2008

1951 2008

?1951 2008

Software is complex.

The Standish Group, 2004

53% Challenged

18% Failed

29% Succeeded

How large is your project?

How large is your project?

1’000’000 lines of code

How large is your project?

1’000’000 lines of code

* 2 = 2’000’000 seconds

How large is your project?

1’000’000 lines of code

* 2 = 2’000’000 seconds

/ 3600 = 560 hours

How large is your project?

1’000’000 lines of code

* 2 = 2’000’000 seconds

/ 3600 = 560 hours

/ 8 = 70 days

How large is your project?

1’000’000 lines of code

* 2 = 2’000’000 seconds

/ 3600 = 560 hours

/ 8 = 70 days

/ 20 = 3 months

But, code is for the computer.

Why would we ever read it?

forward engineering

}

{

}

{

}

{

}

{

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

What is the current state?

What should we do?

Where to start?

How to proceed?

reve

rse

engin

eerin

gforward engineering

}

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

actual development

Reverse engineering is analyzing a subject system to:

identify components and their relationships, and

create more abstract representations.

Chikofky & Cross, 90

}

{

}

{

}

{}

{

}

{

A large system contains lots of details.

}

{

}

{

}

{}

{

}

{

A large system contains lots of details.

How to judge its quality?

1Software innumbers

2Software in

pictures

3Software in

time

4Software in

tools

1Software in numbers

You cannot controlwhat you cannot measure.

Tom de Marco

Metrics are functions that assign numbers to

products, processes and resources.

Software metrics are measurements which

relate to software systems, processes or

related documents.

Metrics compress system traits into numbers.

Let’s see some examples...

Examples of size metrics

NOM - number of methods

NOA - number of attributes

LOC - number of lines of code

NOS - number of statements

NOC - number of children

Lorenz, Kidd, 1994Chidamber, Kemerer, 1994

McCabe, 1977

McCabe cyclomatic complexity (CYCLO) counts the number of independent paths through the code of a function.

interpretation can’t directly lead to improvement action

it reveals the minimum number of tests to write

Chidamber, Kemerer, 1994

Weighted Method Count (WMC) sums up the complexity of class’ methods (measured by the metric of your choice; usually CYCLO).

interpretation can’t directly lead to improvement action

it is configurable, thus adaptable to our precise needs

Chidamber, Kemerer, 1994

Depth of Inheritance Tree (DIT) is the (maximum) depth level of a class in a class hierarchy.

only the potential and not the real impact is quantified

inheritance is measured

Coupling between objects (CBO) shows the number of classes from which methods or attributes are used.

Chidamber, Kemerer, 1994

no differentiation of types and/or intensity of coupling

it takes into account real dependencies not just declared ones

Tight Class Cohesion (TCC) counts the relative number of method-pairs that access attributes of the class in common.

Bieman, Kang, 1995

TCC = 2 / 10 = 0.2

ratio values allow comparison between systems

interpretation can lead to improvement action

...

McCall, 1977

Metrics Assess and Improve Quality!

Metrics Assess and Improve Quality!

Really?

McCall, 1977

?Problem 1: metrics granularity

capture symptoms, not causes of problems

in isolation,they don’t lead to improvement solutions

?Problem 2: implicit mapping

we don’t reason in terms of metrics, but in terms of design principles

Problem 1: metrics granularity

capture symptoms, not causes of problems

in isolation,they don’t lead to improvement solutions

2 big obstacles in using metrics:

Thresholds make metrics hard to interpret

Granularity make metrics hard to use in isolation

Can metrics help me in what I really care for? :)

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

How do I improve myself?

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

How do I improve myself?

I want nothing to do with metrics!

How to get an initial understanding of a system?

Metric ValueLOC 35175

NOM 3618

NOC 384

CYCLO 5579

NOP 19

CALLS 15128

FANOUT 8590

AHH 0.12

ANDC 0.31

Metric ValueLOC 35175

NOM 3618

NOC 384

CYCLO 5579

NOP 19

CALLS 15128

FANOUT 8590

AHH 0.12

ANDC 0.31

Metric ValueLOC 35175

NOM 3618

NOC 384

CYCLO 5579

NOP 19

CALLS 15128

FANOUT 8590

AHH 0.12

ANDC 0.31And now what?

We need means to compare.

hierarchies?

coupling?

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

Size Communication

Inheritance

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

Size

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

Communication

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

Inheritance

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

...

HIGH

0.30

16

15

10

9

0.25

AVG

C++

4

5

0.20

LOW

Java

AVGLOW HIGH

0.24

10

13

7

0.20

10

0.16

7

4NOM/NOC

LOC/NOM

CYCLO/LOC

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

close to high close to average close to low

The Overview Pyramid provides a metrics overview. Lanza, Marinescu 2006

close to high close to average close to low

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

How do I improve myself?

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

How do I improve myself?

I want nothing to do with metrics!

How do I improve code?

Breaking design principles, rules and best practices

deteriorates the code;

it leads to design problems.

Quality is more than 0 bugs.

Imagine changing just a small design fragment

Imagine changing just a small design fragment

and 33%of all classes would require changes

Imagine changing just a small design fragment

Design problems areexpensivefrequentunavoidable

Design problems areexpensivefrequentunavoidable

How to detect and eliminate them?

God Classes tend to centralize the intelligence of the system, to do everything and to use data from small data-classes.

Riel, 1996

God Classes tendto centralize the intelligence of the system,to do everything andto use data from small data-classes.

God Classescentralize the intelligence of the system,do everything anduse data from small data-classes.

God Classesare complex,are not cohesive,access external data.

God Classesare complex,are not cohesive,access external data.

Compose metrics into queries using

logical operato

rs

Detection Strategies are metric-based queries to detect design flaws.

METRIC 1 > Threshold 1

Rule 1

METRIC 2 < Threshold 2

Rule 2

AND Quality problem

Lanza, Marinescu 2006

God

Class

Brain

Class

Feature

Envy

Data

Class

Brain

Method

Significant

Duplication

Intensive

Coupling

Extensive

Coupling

Shotgun

Surgery

Tradition

Breaker

Refused

Parent

Bequest

uses

has

is

has

has

has (partial)

is partially

has

is

is

has

Futile

Hierarchy

uses

has

has

is

has (subclass)

Classification

Disharmonies

Identity

Disharmonies

Collaboration

Disharmonies

Lanza, Marinescu 2006

A God Class centralizes too much intelligence in the system.

ATFD > FEW

Class uses directly more than a

few attributes of other classes

WMC ! VERY HIGH

Functional complexity of the

class is very high

TCC < ONE THIRD

Class cohesion is low

AND GodClass

Lanza, Marinescu 2006

An Envious Method is more interested in data from a handful of classes.

ATFD > FEW

Method uses directly more than

a few attributes of other classes

LAA < ONE THIRD

Method uses far more attributes

of other classes than its own

FDP ! FEW

The used "foreign" attributes

belong to very few other classes

AND Feature Envy

Lanza, Marinescu 2006

Data Classes are dumb data holders.

WOC < ONE THIRD

Interface of class reveals data

rather than offering services

AND Data Class

Class reveals many attributes and is

not complex

Lanza, Marinescu 2006

Data Classes are dumb data holders.

AND

OR

Class reveals many

attributes and is not

complex

NOAP + NOAM > FEW

More than a few public

data

WMC < HIGH

Complexity of class is not

high

NOAP + NOAM > MANY

Class has many public

data

WMC < VERY HIGH

Complexity of class is not

very high

AND

Lanza, Marinescu 2006

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

forward engineering

actual development }

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

How do I understand code?

How do I improve code?

How do I improve myself?

How do I improve myself?

Follow a clear and repeatable process

Follow a clear and repeatable process

Follow a clear and repeatable process

Don’t reason about quality in terms of numbers!

Follow a clear and repeatable process

Can we understand the beauty of a paintingby measuring its frame or counting its colors?

1Software innumbers

2Software in

pictures

3Software in

time

4Software in

tools

2Software in pictures

Software is beautiful

1854,London,choleraepidemic

1812, Napoleon’s Campaign in Russia

Numbers..

Numbers..

Numbers..

0.31ANDC

NOM

20.21 19

0.12

35175

NOP

NOC

418

0.15

8590

LOC

3618

9.42

5579

NOM

CALLS15128

384

FANOUT

9.72

0.56

AHH

CYCLO

Visualization compresses the system into pictures.

A picture is worth

a thousand words...

...depends on the picture

anonymous

Lanza

Software visualiza

tion is more than UML

We are

visualbeings ...

... and we’regood atspottingpatterns

How many groups do you see?

How many groups do you see?

How many groups do you see?

How many groups do you see?

Gestalt principles

proximity

enclosure connectivity

similarity

More Gestalt principles

closure continuity

We do not see with our eyes, but with our brain.

Our brain works like a computer, with 3 types of memory

Iconic memory, the visual sensory register

Short-term memory, the working memory

Long-term memory

Sensation(Physical Process)

Perception(Cognitive Process)

Stimulus Sensory Organ Perceptual Organ

Brain

Iconic Memory - Short-term Memory - Long-term Memory

Iconicmemory

Short-termmemory

Iconicmemory

Short-termmemory

< 1 secondvery fastautomaticsubconscious

preattentive

Iconicmemory

Short-termmemory

< 1 secondvery fastautomaticsubconscious

preattentive

couple of seconds3-9 chunks

Categorizing Preattentive Attributes

Category Form Color Spatial Motion Motion

Attribute

Orientation Hue 2D position Flicker

Line length Intensity Direction

Line width

Size

Shape

Curvature

Added marks

Enclosure

Attributes of Form

Orientation Line Length Line Width Size

Shape Curvature Added Marks Enclosure

Attributes of Form

Line Length Line Width Size

Shape Curvature Added Marks Enclosure

Attributes of Form

Line Width Size

Shape Curvature Added Marks Enclosure

Attributes of Form

Size

Shape Curvature Added Marks Enclosure

Attributes of Form

Shape Curvature Added Marks Enclosure

Attributes of Form

Curvature Added Marks Enclosure

Attributes of Form

Added Marks Enclosure

Attributes of Form

Enclosure

Attributes of Form

Exemplifying Preattentive Processing

Exemplifying Preattentive Processing

8789364082376403128764532984732984732094873290845389274-0329874-32874-23198475098340983409832409832049823-0984903281453209481-0839393947896587436598

Exemplifying Preattentive Processing

8789364082376403128764532984732984732094873290845389274-0329874-32874-23198475098340983409832409832049823-0984903281453209481-0839393947896587436598

8789364082376403128764532984732984732094873290845389274-0329874-32874-23198475098340983409832409832049823-0984903281453209481-0839393947896587436598

70%of all externalinputs comethrough the eyes

Software visualization isthe use of the crafts of typography, graphic design, animation, and cinematography with modern human-computer interaction and computer graphics technology to facilitate both the human understanding and effective use of computer software.

Price, Becker, Small

Static Visualization

Dynamic Visualization

no silver bullet

Software is complex

Software is complex

A picture is worth

a thousand words.

UML took it literally

:)

Example: what is ?

Polymetric Views show up to 5 metrics.

Color metric

Width metric

Height metric

Position metrics

Lanza, 2003

A simple & powerful concept

LOC

NOS

lines

parameters

parameters

System Complexity shows class hierarchies.

lines

attributes

methods

Lanza, Ducasse, 2003

Class Blueprint shows class internals.

Initialize Interface Internal Accessor Attribute

invocation and access direction

Lanza, Ducasse, 2005

Class Blueprint has a rich vocabulary.

Regular

Overriding

Extending

Abstract

Constant

Delegating

Setter

Getter

Method

invocations

lines

Attribute

internal access

externalaccess

Access

Invocation

Class Blueprint reveals patterns.

schizophrenic classtwin classes

Distribution Map shows properties over structure. Ducasse etal, 2006

31 parts, 394 elements and 9 properties

Softwarenaut explores the package structure.Lungu etal, 2006

Code City shows where your code lives.Wettel, Lanza, 2007

classes are buildings grouped in quarters of packages

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Jmol - The Time Machine

Software is beautiful

1Software innumbers

2Software in

pictures

3Software in

time

4Software in

tools

3Software in time

reve

rse

engin

eerin

gforward engineering

}

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

actual development

reve

rse

engin

eerin

gforward engineering

}

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

actual development

reve

rse

engi

neer

ing

}

{

}

{

}

{}

{

}

{

A large system contains lots of details.

}

{

}

{

}

{}

{

}

{

}

{

}

{

}

{}

{

}

{

}

{

}

{

}

{}

{

}

{

The history of a large system contains even more details.

}

{

}

{

}

{}

{

}

{

}

{

}

{

}

{}

{

}

{

Lehman etal, 2001

Most often time is put on the horizontaland a property on the vertical axis.

Spectographs show change activity.Wu etal, 2004

commit

time

Evolution Matrix shows changes in classes.

Idle class

Pulsar class

Supernova class

White dwarf class

Lanza, Ducasse, 2002

Evolution Matrix shows changes in classes.Lanza, Ducasse, 2002

History can be measured.

2 4 3 5 7

2 2 3 4 9

2 2 1 2 3

2 2 2 2 2

1 5 3 4 4

What changed? When did it change? ...

1 5 3 4 4

4 2 1 0+++ = 7=

LENOM(C) = ∑ |NOMi(C)-NOMi-1(C)| 2i-nEvolution ofNumber of Methods

LENOM(C)

1 5 3 4 4

LENOM(C) = ∑ |NOMi(C)-NOMi-1(C)| 2i-n

LENOM(C) 4 2-3 2 2-2 1 2-1 0 20+++ = 1.5=

EENOM(C) = ∑ |NOMi(C)-NOMi-1(C)| 22-i

Latest Evolution ofNumber of Methods

Earliest Evolution ofNumber of Methods

EENOM(C) 4 20 2 2-1 1 2-2 0 2-3+++ = 5.25=

ENOM LENOM EENOM

7 3.5 3.25

7 5.75 1.37

3 1 2

0 0 0

7 1.25 5.25

2 4 3 5 7

2 2 3 4 9

2 2 1 2 3

2 2 2 2 2

1 5 3 4 4

ENOM LENOM EENOM

7 3.5 3.25

7 5.75 1.37

3 1 2

0 0 0

7 1.25 5.25

balanced changer

late changer

dead stable

early changer

ENOM LENOM EENOM

7 3.5 3.25

7 5.75 1.37

3 1 2

0 0 0

7 1.25 5.25

balanced changer

late changer

dead stable

early changer

History can be measured.

Evolution

Stability

Historical Max

Growth Trend

...

Number of Methods

Number of Lines of Code

Cyclomatic Complexity

Number of Modules

...

of

History can be measured in many ways.

The recently changed parts are likely to change in the near future.

Common wisdom

The recently changed parts are likely to change in the near future.

Common wisdom

Are they really?

30% 90%

present

present

past

present

past future

present

past future

present

past future

present

past future

YesterdayWeatherHit(present):

past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end)

past.intersectWith(future).notEmpty()

present

past future

YesterdayWeatherHit(present):

past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end)

past.intersectWith(future).notEmpty()

prediction hit

Yesterday’s Weather shows the localization of changed in time. Girba etal, 2004

hit hit hit

YW = 3 / 8 = 37%

hit hit hit hit hit hit hit

YW = 7 / 8 = 87%

A God Class centralizes too much intelligence in the system.

ATFD > FEW

Class uses directly more than a

few attributes of other classes

WMC ! VERY HIGH

Functional complexity of the

class is very high

TCC < ONE THIRD

Class cohesion is low

AND GodClass

Lanza, Marinescu, 2006

A God Class centralizes too much intelligence in the system.

ATFD > FEW

Class uses directly more than a

few attributes of other classes

WMC ! VERY HIGH

Functional complexity of the

class is very high

TCC < ONE THIRD

Class cohesion is low

AND GodClass

Lanza, Marinescu, 2006

But, what if it is

stable?

History-based Detection Strategies take evolution into account. Ratiu etal, 2004

AND

isGodClass(last)

God Class

in the last version

Stability > 90%

Stable throughout

the history

Harmless God Class

What happens with inheritance?

ver .1 ver. 2 ver. 3 ver. 4 ver. 5

A A A A A

B B B B BC C C

D D D E

A is persistent, B is stable, C was removed, E is newborn ...

Hierarchy Evolution encapsulates time.

A

B

D

C

E

age

changedmethods

changedlines

Removed

Removed

Girba etal, 2005

A is persistent, B is stable, C was removed, E is newborn ...

Hierarchy Evolution reveals patterns.Girba etal, 2005

Gall etal, 2003

Co-change analysis recovers hidden dependencies.Time is the lines.

Evolution Radar shows co-change relationships.D’Ambros, Lanza 2006

one package and its co-change relationships

Software is developed by people.

CVS shows activity.

Who is responsible for this?

Who is responsible for this?

Alphabetical order is no order.

Ownership Map reveals development patterns.Girba etal, 2006

JEdit

Ant

(john 23.06.03) public boolean stillValid (ToDoItem I, Designer dsgr) {(bill 09.01.05) if (!isActive()) {(bill 09.01.05) return false(bill 09.01.05) }(steve 16.02.05) List offs = i.getOffenders();(john 23.06.03) Object dm = offs.firstElement();(steve 16.02.05) ListSet newOffs = computeOffenders(dm);(john 23.06.03) boolean res = offs.equals(newOffs);(john 23.06.03) return res;

(george 13.02.05) public boolean stillValid (ToDoItem I, Designer dsgr) {(bill 11.13.05) if (!isActive()) {(bill 11.13.05) return false(bill 11.13.05) }(steve 16.02.05) List offs = i.getOffenders();(george 13.02.05) Object dm = offs.firstElement();(steve 16.02.05) ListSet newOffs = computeOffenders(dm);(george 13.02.05) boolean res = offs.equals(newOffs);(george 13.02.05) return res;

Who copied from whom?

(john 23.06.03) public boolean stillValid (ToDoItem I, Designer dsgr) {(bill 09.01.05) if (!isActive()) {(bill 09.01.05) return false(bill 09.01.05) }(steve 16.02.05) List offs = i.getOffenders();(john 23.06.03) Object dm = offs.firstElement();(steve 16.02.05) ListSet newOffs = computeOffenders(dm);(john 23.06.03) boolean res = offs.equals(newOffs);(john 23.06.03) return res;

(george 13.02.05) public boolean stillValid (ToDoItem I, Designer dsgr) {(bill 11.13.05) if (!isActive()) {(bill 11.13.05) return false(bill 11.13.05) }(steve 16.02.05) List offs = i.getOffenders();(george 13.02.05) Object dm = offs.firstElement();(steve 16.02.05) ListSet newOffs = computeOffenders(dm);(george 13.02.05) boolean res = offs.equals(newOffs);(george 13.02.05) return res;

What is useless?

13.02.05 public boolean stillValid (ToDoItem I, Designer dsgr) {11.13.05 if (!isActive()) {11.13.05 return false11.13.05 }16.02.05 List offs = i.getOffenders();13.02.05 Object dm = offs.firstElement();16.02.05 ListSet newOffs = computeOffenders(dm);13.02.05 boolean res = offs.equals(newOffs);13.02.05 return res;

23.06.03 public boolean stillValid (ToDoItem I, Designer dsgr) {09.01.05 if (!isActive()) {09.01.05 return false09.01.05 }16.02.05 List offs = i.getOffenders();23.06.03 Object dm = offs.firstElement();16.02.05 ListSet newOffs = computeOffenders(dm);23.06.03 boolean res = offs.equals(newOffs);23.06.03 return res;

When did changes happen?

Clone Evolution shows how developers copy.Balint etal, 2006

reve

rse

engin

eerin

gforward engineering

}

{

}

{

}

{

}

{}

{

}

{

}

{}

{

}

{

actual development

reve

rse

engi

neer

ing

1Software innumbers

2Software in

pictures

3Software in

time

4Software in

tools

4Software in tools

http://loose.upt.ro/incode

http://www.inf.unisi.ch/phd/wettel/codecity.html

http://moose.unibe.ch