Devnology back toschool software reengineering

70
Devnology Lecture Software Reengineering Martin Pinzger Software Engineering Research Group Delft University of Technology

description

Lecture for Devnology's Back to School program about software re-engineering using visualization techniques

Transcript of Devnology back toschool software reengineering

Page 1: Devnology back toschool software reengineering

Devnology LectureSoftware Reengineering

Martin PinzgerSoftware Engineering Research GroupDelft University of Technology

Page 2: Devnology back toschool software reengineering

Greenfield software development

2

Page 3: Devnology back toschool software reengineering

Non-greenfield software development

3

?

Page 4: Devnology back toschool software reengineering

How often did you ...

4

... encounter greenfield and non-greenfield software engineering?

Page 5: Devnology back toschool software reengineering

Because existing software, often called legacy software, is valuable

Often business-critical

A huge amount of money has already been invested in it

Has been tested and runs

Does (mainly) what it should do

Would you replace such a system?

Why non-greenfield engineering?

5

Page 6: Devnology back toschool software reengineering

Why do we (often) start from a mess?

6

Page 7: Devnology back toschool software reengineering

Lehman’s Laws of software evolution

Continuing changeA program that is used in a real-world environment must change, or become progressively less useful in that environment.

Increasing complexityAs a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure.

For more information read Lehman and Belady, 1985

7

Page 8: Devnology back toschool software reengineering

Evolution of Mozilla source code

8

Page 9: Devnology back toschool software reengineering

Lehman’s Laws in practice

Existing software Is often modified in an ad-hoc manner (quick fixes)

Lack of time, resources, money, etc.

Initial good design is not maintainedSpaghetti code, copy/paste programming, dependencies are introduced, no tests, etc.

Documentation is not updated (if there is one)Architecture and design documents

Original developers leave and with them their knowledge

9

Page 10: Devnology back toschool software reengineering

Typical result of such practices

10

Page 11: Devnology back toschool software reengineering

Implications of the results

Software maintenance costs continuously increase

Between 50% and 75% of global software development costs are spent on maintenance!

Up to 60% of a maintenance effort is spent on understanding the existing software

11

Page 12: Devnology back toschool software reengineering

What is your decision?

12

* duplicated code* complex conditionals* abusive inheritance* large classes/methods

According to Lehman: “there will always be changes”

hack it?

* first reengineer* then implement changes

Take a loan on your softwarepay back via reengineering

Investment for the futurepaid back during maintenance

Page 13: Devnology back toschool software reengineering

Lecture Outline

13

Introduction

Reengineering Process

Module Capture/Reverse Engineering

Problem Detection

Summary and Conclusions

Page 14: Devnology back toschool software reengineering

Let’s reengineer

Definition:

“Reengineering is the examination and alteration of a subject system to reconstitute it in a new form and the subsequent implementation of the new form.”

[Demeyer, Ducasse, Nierstrasz]http://scg.unibe.ch/download/oorp/

14

Page 15: Devnology back toschool software reengineering

Reengineering Life-Cycle

15

(1) requirementanalysis

(2) modelcapture

(3) problemdetection (4) problem

resolution

New Requirements

Designs

Code

Page 16: Devnology back toschool software reengineering

Goals of reengineering

16

Testability

Understandability

Modifiability

Extensibility

Maintainability

Page 17: Devnology back toschool software reengineering

Goals of reengineering (2)

UnbundlingSplit a monolithic system into parts that can be separately marketed

Performance“First do it, then do it right, then do it fast”

Design extractionTo improve maintainability, portability, etc.

Exploitation of New TechnologyI.e., new language features, standards, libraries, etc.

17

Page 18: Devnology back toschool software reengineering

In this course, you will learn

Best practices to analyze and understand software systems (i.e., reverse engineering)

Heuristics and tools to detect shortcomings in the design and implementation of software systems

18

Page 19: Devnology back toschool software reengineering

Setting DirectionFirst ContactInitial UnderstandingDetailed Model Capture

Model CaptureReverse Engineering

Page 20: Devnology back toschool software reengineering

Setting direction patterns

20

Agree on Maxims

Set direction

Appoint aNavigator

Speak to theRound Table

Maintaindirection

Coordinatedirection

Most Valuable First

Where to start?

Fix Problems,Not Symptoms

If It Ain't BrokeDon't Fix It

What not to do?What to do?

Keep it Simple

How to do it?

Principles & guidelines forsoftware project management areespecially relevant for reengineering projects

Page 21: Devnology back toschool software reengineering

Pattern: Most Valuable First

Problem: Which problems should you address first?

21

Page 22: Devnology back toschool software reengineering

Most valuable first (2)

Solution: Work on aspects that are most valuable to your customer

Maximize commitment

Deliver results early

Build confidence

22

Page 23: Devnology back toschool software reengineering

Most valuable first (3)

23

Page 24: Devnology back toschool software reengineering

Most valuable first (4)

24

How do you tell what is valuable?Identify your customer

Understand the customer’s business model

Determine measurable goals

Consult change logs for high activity

Play the Planning Game

Fix Problems, not Symptoms

Page 25: Devnology back toschool software reengineering

Most valuable first (5)

Planning Game

25

Page 26: Devnology back toschool software reengineering

Setting DirectionFirst ContactInitial UnderstandingDetailed Model Capture

Model CaptureReverse Engineering

Page 27: Devnology back toschool software reengineering

27

What is Reverse Engineering and why?

Reverse Engineering is the process of analyzing a subject systemto identify the system’s components and their interrelationships and

create representations of the system in another form or at a higher level of abstraction [Chikofsky & Cross, ’90]

MotivationUnderstanding other people’s code, the design and architecture in order to maintain and evolve a software system

Page 28: Devnology back toschool software reengineering

First contact patterns

28

System experts

Chat with theMaintainers

Interviewduring Demo

Talk withdevelopers

Talk withend users

Talk about it

Verify whatyou hear

feasibility assessment(one week time)

Software System

Read All the Codein One Hour

Do a MockInstallation

Read it Compile it

Skim theDocumentation

Read about it

Page 29: Devnology back toschool software reengineering

Pattern: Read all the code in one hour

Problem: Yes, but… the system is so big! Where to start?

29

Page 30: Devnology back toschool software reengineering

Read all the code in one hour (2)

Solution: Read the code in one hour

Focus on:Functional tests and unit tests

Abstract classes and methods and classes high in the hierarchy

Surprisingly large structures

Comments

Check classes with high fan-out

Study the build process

30

Page 31: Devnology back toschool software reengineering

In Java programs focus on

31

public abstract class Example {...}

public interface IExample {...}

public class Test { ... @Test public void testExample() { ... }}

/** * Block comment */public class Example { public void foo() { int x = 1; for (int x=1; i<100; i++) { // do something comment } }}

Page 32: Devnology back toschool software reengineering

First project plan

Project scope (1/2 page)Description, context, goals, verification criteria

OpportunitiesIdentify factors to achieve project goals

Skilled maintainers, readable source-code, documentation, etc.

RisksIdentify risks that may cause problems

Absent test-suites, missing libraries, etc.

Record likelihood & impact for each risk

Go/no-go decision, activities (fish-eye view)

32

Page 33: Devnology back toschool software reengineering

Setting DirectionFirst ContactInitial UnderstandingDetailed Model Capture

Model CaptureReverse Engineering

Page 34: Devnology back toschool software reengineering

34

Initial understanding patterns

Top down

Speculate about Design

Analyze the Persistent Data

Study the Exceptional Entities

understand ⇒higher-level model

Bottom up

ITERATION

Recover design

Recover database

Identify problems

Page 35: Devnology back toschool software reengineering

35

Study the exceptional entities

Problem: How can you quickly identify design problems?

Solution: Measure software entities and study the anomalous onesVisualize metrics to get an overview

Use simple metricsLines of code

Number of methods

...

Page 36: Devnology back toschool software reengineering

Use simple metrics and layout algorithms.

(x,y) width

height colour

Visualize up to 5 metrics per node

Example: Exceptional entities

Use simple metrics and layout algorithms

36

Page 37: Devnology back toschool software reengineering

Setting DirectionFirst ContactInitial UnderstandingDetailed Model Capture

Model CaptureReverse Engineering

Page 38: Devnology back toschool software reengineering

38

Detailed model capture patterns

Expose the design & make sure it stays exposed

Tie Code and Questions

Refactor to Understand

Keep track ofyour understanding

Expose design

Step through the Execution

Expose collaborations

• Use Your Tools• Look for Key Methods• Look for Constructor Calls• Look for Template/Hook Methods• Look for Super Calls

Look for the Contracts

Expose contracts

Learn from the Past

Expose evolution

Write Teststo Understand

Page 39: Devnology back toschool software reengineering

39

Refactor to understand

Problem: How do you decipher cryptic code?

Solution: Refactor it till it makes senseGoal (for now) is to understand, not to reengineer

HintsWork with a copy of the code

Refactoring requires an adequate test baseIf this is missing, “Write Tests to Understand”

Page 40: Devnology back toschool software reengineering

40

Refactor to understand (cont.)

GuidelinesRename attributes to convey roles

Rename methods and classes to reveal intent

Remove duplicated code

Replace condition branches by methods

Page 41: Devnology back toschool software reengineering

41

Learn from the past

Problem: How did the system get the way it is? Which parts are stable and which aren’t?

Solution: Compare versions to discover where code was removedRemoved functionality is a sign of design evolution

Use or develop appropriate tools

Look for signs of:Unstable design — repeated growth and refactoring

Mature design — growth, refactoring, and stability

Page 42: Devnology back toschool software reengineering

42

Examples: Unstable design

Pulsar: Repeated Modifications make it grow and shrink. System Hotspot: Every System Version requires changes.

Page 43: Devnology back toschool software reengineering

Reverse engineering tools/prototypes

X-Rayhttp://xray.inf.usi.ch/xray.php

DA4Javahttp://swerl.tudelft.nl/bin/view/MartinPinzger/MartinPinzgerDA4Java

Code Cityhttp://www.inf.usi.ch/phd/wettel/codecity.html

43

Page 44: Devnology back toschool software reengineering

X-Ray

44

Page 45: Devnology back toschool software reengineering

DA4Java

45

Page 46: Devnology back toschool software reengineering

CodeCity

46

Page 47: Devnology back toschool software reengineering

Summary Model Capture

47

Setting direction patterns toSet the goals

Find the Go/No-Go decision

Increase commitment of clients and developers

First contact patterns toObtain an overview and grasp the main issues

Assess the feasibility of the project

Initial Understanding & Detailed Model CapturePlan the work … and work the plan

Frequent and short iterations

Page 48: Devnology back toschool software reengineering

In the Source CodeIn the Evolution

Problem Detection

Page 49: Devnology back toschool software reengineering

49

Design problems

The most common design problems result from code that is

Unclear & complicated Duplicated (code clones)

Page 50: Devnology back toschool software reengineering

50

Code Smells (if it stinks, change it)

Duplicated CodeLong MethodLarge ClassLong Parameter ListDivergent ChangeShotgun SurgeryFeature Envy...

A code smell is a hint that something has gone wrong somewhere in your code.

Page 51: Devnology back toschool software reengineering

51

How to detect?

Measure and visualize quality aspects of the current implementation of a system

Source code metrics and structures

Measure and visualize quality aspects of the evolution of a system

Evolution metrics and structures

Use Polymetric Views

Page 52: Devnology back toschool software reengineering

52

Polymetric Views

A combination of metrics and software visualization

Visualize software using colored rectangles for the entities and edges for the relationships

Render up to five metrics on one node:

Size (1+2)

Color (3)

Position (4+5)

7

Relationship

Entity

Y Coordinate

Height Color tone

Width

X Coordinate

Page 53: Devnology back toschool software reengineering

53

Smell 1: Long Method

The longer a method is, the more difficult it is to understand it.

When is a method too long?Heuristic: > 10 LOCs (?)

How to detect?Visualize LOC metric values of methods

“Method Length Distribution View”

Page 54: Devnology back toschool software reengineering

54

Method Length Distribution

Metrics:Boxes: MethodsWidth: LOCPosition-Y: LOCSort: LOC

Page 55: Devnology back toschool software reengineering

55

Smell 2: Switch Statement

Problem is similar to code duplicationSwitch statement is scattered in different places

How to detect?Visualize McCabe Cyclomatic Complexity metric to detect complex methods

“Method Complexity Distribution View”

Page 56: Devnology back toschool software reengineering

56

Method Complexity

Metrics:Boxes: MethodsPosition-X: LOCPosition-Y: MCCSort: -

Page 57: Devnology back toschool software reengineering

More info on Detection Strategies

Object-Oriented Metrics in PracticeMichele Lanza and Radu Marinescu, Springer 2006http://www.springer.com/computer/swe/book/978-3-540-24429-5

57

Page 59: Devnology back toschool software reengineering

In the Source CodeIn the Evolution

Problem Detection

Page 60: Devnology back toschool software reengineering

60

Understanding Evolution

Changes can point to design problems“Evolutionary Smells”

ButOverwhelming complexity

How can we detect and understand changes?

SolutionsThe Evolution Matrix

The Kiviat Graphs

Page 61: Devnology back toschool software reengineering

61

Visualizing Class Evolution

Visualize classes as rectangles using for width and height the following metrics:

NOM (number of methods)

NOA (number of attributes)

The Classes can be categorized according to their “personal evolution” and to their “system evolution”

-> Evolution Patterns

Foo

Bar

Page 62: Devnology back toschool software reengineering

First Version

Major Leap

TIME (Versions)Growth Stabilisation

Added Classes

62

The Evolution Matrix

Last VersionRemoved Classes

Page 63: Devnology back toschool software reengineering

63

Evolution Patterns & Smells

Day-fly (Dead Code)

Persistent

Pulsar (Change Prone Entity)

SupernovaWhite Dwarf (Dead Code)

Red Giant (Large/God Class)

Idle (Dead Code)

Page 64: Devnology back toschool software reengineering

64

Persistent / Dayfly

Persistent: Has the same lifespan as the whole system. Part of the original design. Perhaps holy dead code which no one dares to remove.

Dayflies: Exists during only one or two versions. Perhaps an idea which was tried out and then dropped.

Page 65: Devnology back toschool software reengineering

65

Pulsar / Supernova

Pulsar: Repeated Modifications make it grow and shrink. System Hotspot: Every System Version requires changes.

Supernova: Sudden increase in size. Possible Reasons:• Massive shift of functionality towards a class.• Data holder class for which it is easy to grow.• Sleeper: Developers knew exactly what to fill in.

Page 66: Devnology back toschool software reengineering

66

White Dwarf / Red Giant / Idle

White Dwarf: Lost the functionality it had and now trundles along without real meaning. Possibly dead code -> Lazy Class.

Red Giant: A permanent god (large) class which is always very large.

Idle: Keeps size over several versions. Possibly dead code,possibly good code.

Page 67: Devnology back toschool software reengineering

Summary Problem Detection

Design ProblemsResult from duplicated, unclear, complicated source code -> Code Smells

Detection heuristics and Polymetric Views to detect code and evolution smells

67

Page 68: Devnology back toschool software reengineering

Conclusions

Object-Oriented Re-engineering PatternsSet of best practices to re-engineering software systems

Module Capture and Reverse EngineeringUnderstand the design and implementation of software systems

Problem DetectionHeuristics to detect Bad Smells in the source code and evolution of software systems

Next StepAdd tests and refactor detected problems

68

Page 69: Devnology back toschool software reengineering

Reading material

69

Object-Oriented Reengineering PatternsSerge Demeyer, Stephane Ducasse, and Oscar Nierstraszfree copy from: http://scg.unibe.ch/download/oorp/

Working Effectively with Legacy Code Michael Feathers, Prentice Hall, 1 edition, 2004

Refactoring to PatternsJoshua Kerievsky, Addison-Wesley Professional, 2004

Page 70: Devnology back toschool software reengineering

Additional reading

70

Agile Software Development: Principles Patterns, and PracticesRobert C. Martin, Prentice Hall

Object-Oriented Design Heuristics Arthur J. Riel, Prentice Hall, 1 edition, 1996

Refactoring: Improving the Design of Existing CodeMartin Fowler, Addison-Wesley Professional, 1999