ICSME14 - On the Impact of Refactoring Operations on Code Quality Metrics

On the Impact of Refactoring Operations on Code Quality Metrics

ICSME 2014

Victoria, BC, Canada

Oscar Chaparro

Gabriele Bavota

Andrian Marcus

Massimiliano Di Penta

Software refactoring

Any change in the code that improves its internal structure without affecting its external behavior

Refactoring has side effects

Metrics improvement at the expense of other metrics

Developers do not know the effect of a refactoring on code metrics upfront!

RIPE: Refactoring Impact PrEdiction

RIPE includes a set of 89 simple, independent, and reusable prediction functions

It tells you the specific change in the metrics before applying a specific refactoring

It allows you to decide between refactoring alternatives

RIPE under the hood

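$$
\begin{array}{c|ccccc}
        & M_1     & M_2     & M_3      & \dots  & M_{11}    \\ \hline
RO_1    & f_{1,1} &         & f_{1,3}  & \dots  &           \\
RO_2    & f_{2,1} &         &          & \dots  & f_{2,11}  \\
RO_3    &         & f_{3,2} &          & \dots  & f_{3,11}  \\
\vdots  & \vdots  & \vdots  & \vdots   & \ddots & \vdots    \\
RO_{12} &         &         & f_{12,3} & \dots  & f_{12,11}
\end{array}
$$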

12 Refactoring operations (RO)

11 Code metrics (M)

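$f_{RO,M}(\text{Code}, R) = m_p$ (where $m_p$ is the predicted value of metric $M$ after applying refactoring $R$ to the code)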

‐ Heuristic-based
‐ Defined based on
  ⁻ Fowler’s definition
  ⁻ Our experience
  ⁻ Study of common cases
‐ Assume refactoring independence
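
One way to picture the table of prediction functions is as a lookup keyed by (refactoring operation, metric). The sketch below is illustrative only: the `REGISTRY`, the `register` and `predict` helpers, and the two sample heuristics are hypothetical and are not RIPE's actual functions. Only the idea of one independent function per (RO, M) pair, with some pairs left empty, comes from the slides.

```python
from typing import Callable, Dict, Optional, Tuple

# A prediction function maps the metric value measured before the refactoring,
# plus refactoring-specific details, to the predicted value after it.
PredictionFn = Callable[[float, dict], float]

# Hypothetical registry: (refactoring operation, metric) -> prediction function.
# Pairs without an entry correspond to empty cells in the RO x M table.
REGISTRY: Dict[Tuple[str, str], PredictionFn] = {}


def register(ro: str, metric: str):
    """Decorator that stores a function under its (RO, metric) key."""
    def decorator(fn: PredictionFn) -> PredictionFn:
        REGISTRY[(ro, metric)] = fn
        return fn
    return decorator


@register("MM", "NOM")
def move_method_nom_source(before: float, details: dict) -> float:
    # Illustrative heuristic (assumption): moving one method out of the
    # source class lowers its Number of Methods by one.
    return before - 1


@register("PUF", "DIT")
def pull_up_field_dit(before: float, details: dict) -> float:
    # Illustrative heuristic (assumption): pulling up a field does not
    # change the depth of the inheritance tree.
    return before


def predict(ro: str, metric: str, before: float,
            details: Optional[dict] = None) -> Optional[float]:
    """Return the predicted metric value, or None if no function is defined."""
    fn = REGISTRY.get((ro, metric))
    if fn is None:
        return None
    return fn(before, details or {})
```

Because each function is independent, a new heuristic can be added or replaced without touching the others, which matches the "simple, independent and reusable" description.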

Refactoring operations in RIPE

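Refactoring Operation                        Category
Extract Method (EM)                          Composing Methods
Inline Method (IM)                           Composing Methods
Replace Method w. Method Obj. (RMMO)         Composing Methods
Pull Up Field (PUF)                          Dealing with Generalization
Pull Up Method (PUM)                         Dealing with Generalization
Push Down Field (PDF)                        Dealing with Generalization
Push Down Method (PDM)                       Dealing with Generalization
Replace Delegation with Inheritance (RDI)    Dealing with Generalization
Replace Inheritance with Delegation (RID)    Dealing with Generalization
Extract Class (EC)                           Moving Features
Move Field (MF)                              Moving Features
Move Method (MM)                             Moving Features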

Code metrics in RIPE

Code Metric                                Code Property
Response for a Class (RFC)                 Coupling
Coupling Between Objects (CBO)             Coupling
Data Abstraction Coupling (DAC)            Coupling
Message Passing Coupling (MPC)             Coupling
Lines of Code (LOC)                        Size
Number of Methods (NOM)                    Size
McCabe’s Cyclomatic Number (CYCLO)         Complexity
Lack of Cohesion of Methods 2 (LCOM2)      Cohesion
Lack of Cohesion of Methods 5 (LCOM5)      Cohesion
Number of Children (NOC)                   Inheritance
Depth of Inheritance Tree (DIT)            Inheritance

Example of how RIPE works

Predicting how an Extract Class refactoring changes the Coupling Between Objects metric

Extract Class (EC) refactoring

[Figure: code before and after the refactoring]

Coupling Between Objects (CBO) metric

Before refactoring: CBO(SourceClass) = 5

CBO change after EC – Source Class

Before refactoring: CBO(SourceClass) = 5

After refactoring: CBO(SourceClass) = 5 - 2 + 1 = 4

CBO change after EC – Target Class

After refactoring: CBO(TargetClass) = 2 + 1 = 3
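
The arithmetic in this example can be replayed in a few lines. This is a sketch under one reading of the slides, not RIPE's implementation: it assumes that 2 of the source class's 5 couplings follow the extracted members into the target class, and that the extraction adds 1 new coupling between the source and target classes. The function and parameter names are hypothetical.

```python
def predict_cbo_after_extract_class(cbo_source_before, couplings_moved, new_link=1):
    """Return (predicted CBO of source class, predicted CBO of target class).

    Assumptions for this sketch:
    - couplings_moved: couplings that leave the source class together with
      the extracted fields and methods.
    - new_link: the coupling created between the source and target classes.
    """
    cbo_source_after = cbo_source_before - couplings_moved + new_link
    cbo_target_after = couplings_moved + new_link
    return cbo_source_after, cbo_target_after


# Values from the example: CBO(SourceClass) = 5 before the refactoring.
source_cbo, target_cbo = predict_cbo_after_extract_class(5, 2)
print(source_cbo, target_cbo)  # 4 3
```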

Evaluation goal and process

RQ: What is RIPE’s accuracy in estimating the impact of refactoring operations on code metrics?

Definition: what is accuracy? how is it computed?

Process: how to evaluate accuracy?

Data: code and refactorings

Evaluation metrics

‐ Accuracy: ratio of perfect predictions over all the predictions (level of perfection)

‐ Deviation: gap between the predicted and the actual metric value (level of imperfection)
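
The two evaluation metrics can be sketched as follows. The slides report deviation as a percentage but do not give its formula, so the relative form used here (absolute gap divided by the actual value) is an assumption, as is the handling of an actual value of zero. The sample pairs are made up for illustration and are not data from the study.

```python
from statistics import mean, median


def accuracy(pairs):
    """Share of predictions that exactly match the measured value."""
    return sum(1 for predicted, actual in pairs if predicted == actual) / len(pairs)


def relative_deviation(predicted, actual):
    # Assumption: deviation is the absolute gap relative to the measured value.
    if actual == 0:
        return 0.0 if predicted == 0 else float("inf")
    return abs(predicted - actual) / abs(actual)


# Made-up (predicted, actual) pairs, for illustration only.
pairs = [(4, 4), (3, 3), (10, 8), (6, 5)]
deviations = [relative_deviation(p, a) for p, a in pairs]

print(accuracy(pairs))     # fraction of perfect predictions
print(median(deviations))  # median deviation
print(mean(deviations))    # average deviation
```

Reporting both the median and the average is useful because a few badly mispredicted cases can raise the average a lot while leaving the median near zero.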

Evaluation process

Measure metrics of code

Predict metric changes with RIPE

Apply/Extract refactorings

Measure metrics again

Compare actual metrics vs predictions
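
The process above can be sketched as a small loop. Everything here is a hypothetical stand-in: `measure`, `predict`, and `apply_refactoring` are placeholder callables supplied by the caller, the toy "class" is just a list of method names, and each refactoring is applied to the original code independently, which is an assumption of this sketch.

```python
def evaluate(code, refactorings, measure, predict, apply_refactoring):
    """Return a list of (refactoring, predicted metrics, actual metrics).

    measure(code) -> dict of metric name -> value
    predict(metrics_before, refactoring) -> dict of predicted values
    apply_refactoring(code, refactoring) -> refactored code
    """
    results = []
    for refactoring in refactorings:
        before = measure(code)                              # measure metrics of code
        predicted = predict(before, refactoring)            # predict metric changes
        refactored = apply_refactoring(code, refactoring)   # apply the refactoring
        actual = measure(refactored)                        # measure metrics again
        results.append((refactoring, predicted, actual))    # keep both for comparison
    return results


# Toy stand-ins: a class is a list of method names, and the only
# "refactoring" removes one method from it.
toy_code = ["a", "b", "c"]
results = evaluate(
    toy_code,
    ["b"],
    measure=lambda code: {"NOM": len(code)},
    predict=lambda before, r: {"NOM": before["NOM"] - 1},
    apply_refactoring=lambda code, r: [m for m in code if m != r],
)
print(results)
```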

List of refactorings

Seeded refactorings

- Goal: Have a uniform distribution of refactorings

- Procedure: Two PhD students identified and applied refactorings

- Projects: ArgoUML and aTunes

Existing refactorings

- Goal: Validate the approach in everyday changes

- Procedure: Use of a tool that mined the versioning logs

- Projects: 13 open source systems

Software projects

Results summary

Dataset                             Accuracy    Deviation (Med)    Deviation (Avg)
Seeded refactorings                 68%         0%                 12%
Existing refactorings               22%         14%                41%
Seeded and Existing refactorings    38%         5%                 31%

Metric analysis - seeded refactorings

[Bar chart: accuracy (0% to 100%) per code metric]

DIT, NOC, NOM: 90% accuracy

RFC, CBO, LOC: 60% accuracy

- Coarse granularity

- No ambiguity on how refactorings impact such metrics

- EC and RMMO were difficult to predict

- Specific changes are assumed but not needed

Low deviation: 15% avg and 2% median

Refactoring analysis - seeded refactorings

[Bar chart: accuracy (0% to 90%) per refactoring operation]

PDM, MF, RID: 80% accuracy

RMMO, EC: 55% accuracy

- There are not many refactoring alternatives

- Why are they not 100% accurate?

- RIPE implementation is conservative

- There are many implementation alternatives

- Deviation is higher: 56% avg and 12% median

Low deviation: 5% avg and 0% median

Conclusions

Some metrics are coarse, linear and easy to interpret (e.g., NOM) and others are fine grained and less intuitive (e.g., LCOM5)

RMMO and EC refactorings are difficult to predict as these have many implementation alternatives in practice

RIPE evaluation showed good prediction performance: 38% perfect prediction, low deviation (31% avg & 5% median)

With RIPE, it is possible to predict the specific metric changes that result from a refactoring

Future work

Improve our prediction functions and include more metrics and refactorings

More studies for understanding change on code quality metrics and properties in practice

We will move towards predicting the metric changes of composite refactorings and recommending this kind of refactoring

RIPE in refactoring decision making

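[Figure: a Blob class split in two alternative ways]

Blob Class: Method 1, Method 2, Method 3, Method 4, …

Alternative 1: Cs = {Method 2, Method 3, …}, Ct = {Method 1, Method 4, …}

Alternative 2: Cs = {Method 3, …}, Ct = {Method 1, Method 4, …, Method 2}

Ct has low cohesion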
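
Once predicted metric values are available for each way of splitting the Blob class, choosing between them can be as simple as comparing the predictions. The sketch below is hypothetical: the selection rule (pick the split whose least cohesive class has the lowest predicted LCOM5, where a lower value means better cohesion), the names `split_a` and `split_b`, and all numbers are assumptions made for illustration, not results from the study.

```python
def pick_alternative(predictions, metric="LCOM5"):
    """Pick the alternative whose worst class has the lowest predicted value.

    predictions: alternative name -> {class name -> {metric name -> value}}
    Lower is better for lack-of-cohesion metrics such as LCOM5.
    """
    def worst_value(alternative):
        return max(metrics[metric] for metrics in predictions[alternative].values())

    return min(predictions, key=worst_value)


# Made-up predicted values for two hypothetical ways of splitting a class.
predictions = {
    "split_a": {"Cs": {"LCOM5": 0.4}, "Ct": {"LCOM5": 0.5}},
    "split_b": {"Cs": {"LCOM5": 0.3}, "Ct": {"LCOM5": 0.9}},  # Ct has low cohesion
}

best = pick_alternative(predictions)
print(best)  # split_a
```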

Deviation analysis

‐ There are some metric predictions with high accuracy and high deviation (e.g., NOM or DAC)

‐ What is the meaning of deviation in practice?
  ⁻ Some metrics are coarse, linear and easy to interpret (e.g., NOM)
  ⁻ Others are fine grained and less intuitive (e.g., LCOM5)

Metric    Avg deviation    Actual metric deviation    Note
NOM       63%              4                          Only 4 methods are being “mispredicted”
LCOM5     27%              0.104                      The number of code elements could be high (field accesses from 9 to 30)

ArgoUML results
