Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Bruno C. da Silva, Cláudio Sant’Anna, Christina Chavez (Federal University of Bahia, UFBA, dcc.ufba.br) and Alessandro Garcia (inf.puc-rio.br). Software Engineering Lab – UFBA, Salvador-Bahia-Brazil - les.dcc.ufba.br. Software Design and Evolution Group - aside.dcc.ufba.br.


Page 1: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Bruno C. da Silva, Cláudio Sant’Anna, Christina Chavez

Federal University of Bahia (UFBA), dcc.ufba.br

Alessandro Garcia

inf.puc-rio.br

Software Engineering Lab – UFBA

Salvador-Bahia-Brazil - les.dcc.ufba.br

Software Design and Evolution Group

aside.dcc.ufba.br

Page 2: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Cohesion can be defined as:

The degree to which a module represents an abstraction of a single concern of the software.

Page 3: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Structural Cohesion Metrics

E.g. LCOM, LCOM2, etc.

Almost all methods share the same instance variable.

Is it a highly cohesive class?
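The structural view above can be made concrete with a minimal sketch. It assumes the pair-counting definition commonly labelled LCOM2 (P = method pairs sharing no instance variable, Q = pairs sharing at least one, LCOM2 = P - Q when positive, otherwise 0); the slides do not spell out the formula, so this reading is an assumption. The method and field names are illustrative only.

```python
from itertools import combinations


def lcom2(method_attrs):
    """Pair-counting lack of cohesion.

    method_attrs maps a method name to the set of instance variables it uses.
    P counts method pairs with no shared variable, Q counts pairs with at
    least one shared variable; the result is max(P - Q, 0).
    """
    p = q = 0
    for (_, attrs_a), (_, attrs_b) in combinations(method_attrs.items(), 2):
        if attrs_a & attrs_b:
            q += 1
        else:
            p += 1
    return max(p - q, 0)


# Hypothetical facade-like class: every method touches the same field.
facade_like = {
    "addCookie": {"response"},
    "sendError": {"response"},
    "encodeURL": {"response"},
    "sendRedirect": {"response"},
}

# Hypothetical class whose methods share nothing.
scattered = {"a": {"x"}, "b": {"y"}, "c": {"z"}}

print(lcom2(facade_like))  # 0: all six pairs share a variable
print(lcom2(scattered))    # 3: all three pairs are disjoint
```

Under this definition a class whose methods all delegate to one field scores 0 regardless of how many unrelated responsibilities those methods implement.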

Page 4: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Lack of Concern-Based Cohesion (LCbC)

How many concerns does this class address?

• http response header

• http response buffer

• URL encoding

• web cookies

• http redirecting

• Error sending and others…

LCbC = 6

Is it a highly cohesive class?
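A minimal sketch of the counting behind LCbC, assuming the metric is the number of distinct concerns with at least one code element in the class. The mapping format and the member names below are hypothetical; only the six concern names come from the slide.

```python
def lcbc(concern_map, class_name):
    """Count the concerns that map to at least one member of class_name.

    concern_map maps a concern name to a set of (class, member) pairs,
    i.e. a concern-to-code mapping.
    """
    return sum(
        1
        for elements in concern_map.values()
        if any(cls == class_name for cls, _ in elements)
    )


# Hypothetical mapping; the member names are illustrative only.
concern_map = {
    "http response header": {("ResponseFacade", "setHeader")},
    "http response buffer": {("ResponseFacade", "flushBuffer")},
    "URL encoding": {("ResponseFacade", "encodeURL")},
    "web cookies": {("ResponseFacade", "addCookie")},
    "http redirecting": {("ResponseFacade", "sendRedirect")},
    "error sending": {("ResponseFacade", "sendError")},
    "logging": {("OtherClass", "log")},
}

print(lcbc(concern_map, "ResponseFacade"))  # 6
```

The count depends entirely on the mapping: a class that no concern maps to gets LCbC = 0, whatever its code looks like.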

Page 5: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Cohesion: Structure-based vs. Concern-based

They capture different dimensions of cohesion:

• Different source of information and counting mechanism;

• Different interpretation of cohesion.

Example – ResponseFacade (Tomcat)

LCOM2 = 0: low lack of cohesion, or high cohesion

LCbC = 6: high lack of cohesion, or low cohesion

Page 6: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Empirical Study – First Goal

Provide empirical evidence about whether the concern-driven nature of a cohesion metric makes it significantly different from structural cohesion metrics.

Page 7: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Moreover…

Concerns (http response buffer, http response header, URL encoding, web cookies, http redirecting, error sending) → changes

… the number of concerns a module realizes may positively influence the number of changes it is subject to.

Page 8: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Empirical Study – Second Goal

Investigate whether and how concern-based cohesion is associated with change-proneness.

Page 9: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Research Questions

RQ1: Does LCbC capture a dimension of module cohesion that is not captured by structural cohesion metrics?

Page 10: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Research Questions

RQ2: How strong is the correlation between LCbC and module change-proneness?

Page 11: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Research Questions

RQ3: Does the LCbC metric applied together with structural cohesion metrics enhance the prediction of module changes?

Page 12: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Empirical Study Settings

System: Module 1, Module 2, Module 3, …, Module n

Measured per module on the system: LCOM2, LCOM3, LCOM4, LCOM5, TCC, LCbC

Measured per module from the change history: Change Count (CC)
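A minimal sketch of how a per-module Change Count could be derived from a change history. It assumes CC is the number of revisions in which a module was modified; the slides do not define CC further, so this reading and the data layout (one collection of changed modules per revision) are assumptions.

```python
from collections import Counter


def change_counts(revisions):
    """Return a Counter mapping each module to the number of revisions
    that modified it.

    revisions is an iterable in which every item lists the modules changed
    by one revision. A module listed twice in the same revision still
    counts once for that revision.
    """
    counts = Counter()
    for changed_modules in revisions:
        counts.update(set(changed_modules))
    return counts


# Hypothetical three-revision history.
history = [
    ["Module1", "Module2"],
    ["Module1"],
    ["Module3", "Module1", "Module3"],
]

cc = change_counts(history)
print(cc["Module1"], cc["Module2"], cc["Module3"])  # 3 1 1
```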

Page 13: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Change history

System        Revisions analyzed
JFreeChart    2,272
Freecol       3,426
jEdit         2,916
Tomcat        3,157
Findbugs      3,765
Rhino         777
Total         16,313

Page 14: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Empirical Study Settings

LCbC needs a concern-to-code mapping

concern A, concern B, concern C, …

Page 15: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Empirical Study Settings

Concern-to-code mapping procedure

System                                          Procedure
JFreeChart, Freecol, jEdit, Tomcat, Findbugs    Concerns automatically mapped using the XScan tool
Rhino                                           Manual concern mapping provided by Eaddy et al. (2008)

Page 16: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ1: Does LCbC capture a dimension of module cohesion that is not captured by structural cohesion metrics?

Principal Component Analysis (PCA)

JFreeChart
         PC1     PC2     PC3     PC4
LCOM2    0.94    0.14   -0.11    0.11
LCOM3    0.02    0.72   -0.43    0.37
LCOM4    0.87    0.03   -0.04    0.37
LCOM5    0.14    0.94   -0.12   -0.04
TCC     -0.12   -0.24    0.95   -0.09
LCbC     0.51    0.10   -0.13    0.81

Rhino
         PC1     PC2     PC3     PC4
LCOM2    0.96    0.04    0.11    0.07
LCOM3    0.23    0.72    0.43    0.24
LCOM4    0.94    0.19    0.07    0.09
LCOM5    0.11    0.21    0.94    0.19
TCC     -0.08   -0.95   -0.08   -0.02
LCbC     0.11    0.11    0.19    0.97

jEdit
         PC1     PC2     PC3     PC4
LCOM2    0.06    0.98    0.08    0.04
LCOM3    0.90    0.12    0.17    0.07
LCOM4    0.18    0.09    0.97    0.08
LCOM5    0.87    0.12   -0.03    0.06
TCC     -0.79    0.12   -0.21    0.06
LCbC     0.04    0.04    0.08    0.99

Tomcat
         PC1     PC2     PC3     PC4
LCOM2    0.09    0.08    0.25    0.96
LCOM3    0.89    0.15    0.12    0.08
LCOM4    0.11    0.09    0.95    0.25
LCOM5    0.88    0.04   -0.06    0.13
TCC     -0.85   -0.01   -0.16    0.03
LCbC     0.10    0.99    0.08    0.07

Findbugs
         PC1     PC2     PC3     PC4
LCOM2    0.12    0.04    0.14    0.98
LCOM3    0.90    0.12    0.10    0.12
LCOM4    0.12    0.00    0.98    0.14
LCOM5    0.88    0.06   -0.02    0.07
TCC     -0.80    0.06   -0.14   -0.04
LCbC     0.05    0.99    0.00    0.04

Freecol
         PC1     PC2     PC3     PC4
LCOM2   -0.12    0.72    0.41    0.34
LCOM3    0.64    0.20    0.53    0.15
LCOM4    0.16    0.09    0.03    0.96
LCOM5    0.27   -0.06    0.89    0.01
TCC     -0.89   -0.05   -0.16   -0.11
LCbC     0.22    0.89   -0.18   -0.03

LCbC was the major metric of at least one PC in all systems, and in most of the systems it was the only metric with a high loading on that PC.
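The reading of the loadings table ("major metric" of a PC) can be sketched as picking, for each component, the metric with the largest absolute loading. The values below are the first block of four loadings per metric from the table, read here as the JFreeChart columns; that column assignment is an assumption about the slide layout, and the function name is illustrative.

```python
def major_metric_per_pc(loadings):
    """For each principal component, return the metric whose loading has
    the largest absolute value.

    loadings maps a metric name to its list of loadings, one per component.
    """
    n_components = len(next(iter(loadings.values())))
    result = {}
    for i in range(n_components):
        result[f"PC{i + 1}"] = max(loadings, key=lambda m: abs(loadings[m][i]))
    return result


# First block of the PCA table (assumed to be JFreeChart).
jfreechart = {
    "LCOM2": [0.94, 0.14, -0.11, 0.11],
    "LCOM3": [0.02, 0.72, -0.43, 0.37],
    "LCOM4": [0.87, 0.03, -0.04, 0.37],
    "LCOM5": [0.14, 0.94, -0.12, -0.04],
    "TCC": [-0.12, -0.24, 0.95, -0.09],
    "LCbC": [0.51, 0.10, -0.13, 0.81],
}

print(major_metric_per_pc(jfreechart))
# {'PC1': 'LCOM2', 'PC2': 'LCOM5', 'PC3': 'TCC', 'PC4': 'LCbC'}
```

In this block LCbC dominates PC4, a component on which no structural metric loads above 0.37.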

Page 17: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ2: How strong is the correlation between LCbC and module change-proneness?

Spearman Correlation: each cohesion metric vs CC

         JFreeChart   Rhino   jEdit   Tomcat   Findbugs   Freecol
LCOM2    0.48         0.69    0.16    0.33     0.48       0.49
LCOM3    0.34         0.48    0.17    0.27     0.38       0.19
LCOM4    0.32         0.46    0.10    0.21     0.23       0.20
LCOM5    0.15         0.30    0.18    0.23     0.34       0.22
TCC      0.24         0.22    0.13    0.16     0.06*      0.30
LCbC     0.66         0.62    0.15    0.35     0.21       0.46

* not statistically significant

In jEdit and Findbugs, LCbC did not perform well.
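A minimal, standard-library sketch of the Spearman rank correlation used in the table: rank both variables (ties get the average rank), then take the Pearson correlation of the ranks. The per-module values are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-module values: LCbC and Change Count.
lcbc_values = [0, 1, 1, 3, 6, 10]
cc_values = [2, 1, 4, 5, 9, 11]
rho = spearman(lcbc_values, cc_values)
print(round(rho, 2))
```

Because only ranks matter, the coefficient is insensitive to the very different scales of the metrics (for example LCOM2 in the thousands versus LCbC in single digits).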

Page 18: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ2: How strong is the correlation between LCbC and module change-proneness?

Spearman Correlation: each cohesion metric vs CC

LCbC and LCOM2 were the most correlated with change count.

Page 19: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ2: How strong is the correlation between LCbC and module change-proneness?

Spearman Correlation: each cohesion metric vs CC

In Rhino and Freecol, LCbC was the second most correlated metric (strong and moderate correlation, respectively), preceded by LCOM2.

Page 20: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ2: How strong is the correlation between LCbC and module change-proneness?

Spearman Correlation: each cohesion metric vs CC

LCbC was the most correlated with change count in JFreeChart (strong correlation) and Tomcat (moderate correlation).

Page 21: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ3: Does the LCbC metric applied together with structural cohesion metrics enhance the prediction of module changes?

Linear Regression Analysis

System        R² (adj)
JFreeChart    0.63
Rhino         0.59
Findbugs      0.37
Freecol       0.35
Tomcat        0.32
jEdit         0.26

Metrics in the Final Model with Standardized Coefficients

(0.47)LCOM2 + (0.11)LCOM3 + (0.59)LCbC + (-0.27)LCOM4

(0.63)LCOM2 + (0.37)LCOM3 + (0.18*)TCC

(0.20)LCOM2 + (0.35)LCOM4 + (0.09*)LCOM5 + (0.17)LCbC

(0.45)LCOM2 + (0.20)LCOM3 + (0.17)LCOM4

(0.44)LCOM2 + (0.21)LCOM3 + (0.11)LCbC

(0.39)LCOM2 + (0.16)LCOM3 + (0.29)LCbC + (-0.07*)LCOM4

* not statistically significant

LCbC ended up in four regression models.
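A minimal sketch of what standardized coefficients and adjusted R² mean in such a model. It fits ordinary least squares on z-scored variables and applies the usual adjustment 1 - (1 - R²)(n - 1)/(n - p - 1). It does not reproduce the study's procedure: the variable selection that produced the "final model" is not shown on the slides and is not implemented here, and the data and function names are hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a·x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def standardized_regression(predictors, target):
    """Return (standardized coefficients, R², adjusted R²).

    predictors maps a metric name to its per-module values; target holds
    the per-module change counts. With z-scored variables the intercept
    is zero, so only the slopes are estimated.
    """
    names = list(predictors)
    zs = {name: zscores(predictors[name]) for name in names}
    zy = zscores(target)
    n, p = len(zy), len(names)
    # Normal equations (Z'Z) beta = Z'y on the standardized data.
    xtx = [[sum(a * b for a, b in zip(zs[i], zs[j])) for j in names] for i in names]
    xty = [sum(a * b for a, b in zip(zs[i], zy)) for i in names]
    beta = solve(xtx, xty)
    fitted = [sum(beta[k] * zs[name][row] for k, name in enumerate(names))
              for row in range(n)]
    ss_res = sum((y - f) ** 2 for y, f in zip(zy, fitted))
    ss_tot = sum(y ** 2 for y in zy)
    r2 = 1 - ss_res / ss_tot
    adjusted = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return dict(zip(names, beta)), r2, adjusted


# Hypothetical per-module data for six modules.
metrics = {"LCOM2": [1, 2, 3, 4, 5, 6], "LCbC": [2, 1, 4, 3, 6, 5]}
changes = [9, 6, 19, 16, 27, 28]
coefficients, r2, r2_adj = standardized_regression(metrics, changes)
print(coefficients, round(r2, 2), round(r2_adj, 2))
```

Standardized coefficients are comparable across metrics with different scales, which is what allows a statement such as "LCbC was the most important metric" for a given model.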

Page 22: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

RQ3: Does the LCbC metric applied together with structural cohesion metrics enhance the prediction of module changes?

Linear Regression Analysis

LCbC was the most important metric for the JFreeChart regression model.

Page 23: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Examples that illustrate the differences in the dimensions of cohesion captured by LCbC and structural cohesion metrics

Class (System)                      LCbC (Rank)    LCOM2 (Rank)    CC (Rank)
ResponseFacade (Tomcat)             10 (top 2%)    0               5 (top 20%)
CombinedRangeXYPlot (JFreeChart)    11 (top 5%)    33 (top 35%)    11 (top 10%)

Page 24: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Examples that illustrate the differences in the dimensions of cohesion captured by LCbC and structural cohesion metrics

ResponseFacade (Tomcat): a Facade class usually has methods related to different concerns because it serves as an entry point for different functionalities.

Page 25: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Examples that illustrate the differences in the dimensions of cohesion captured by LCbC and structural cohesion metrics

CombinedRangeXYPlot (JFreeChart): concerns related to drawing, zooming, axis space, click handling and plotting.

Page 26: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

When concern-based cohesion fails in the association with changes

When the concern-to-code mapping fails to identify concerns!

Class (System)                    LCbC (Rank)    LCOM2 (Rank)    CC (Rank)
jEdit (jEdit)                     0              9351 (3rd)      24 (2nd)
JEditBuffer (jEdit)               0              5913 (4th)      17 (5th)
SortedBugCollection (Findbugs)    0              1889 (5th)      76 (4th)

Page 27: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Threats to Validity

• Quality of concern-to-code mapping

• Underlying tool for concern mapping

• Change Count

Page 28: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Conclusions

LCbC defined a new and orthogonal dimension of module cohesion in the studied systems.

LCbC performed well in the association with change-proneness in most of the systems.

Concern-based cohesion has provided indications that it is worth investigating further.

Page 29: Concern-based Cohesion: Unveiling a Hidden Dimension of Cohesion Measurement

Future Work

• How LCbC performs in comparison with topic-based cohesion metrics such as C3 and MWE

• The association between LCbC and fault-proneness

• Whether or not the type of class would be an interesting factor to be considered

• The application of different regression analysis techniques

• Search for more complete concern mappings
