A Systematic Mapping Study on Software Engineering Testbeds

Emanoel Barreiros

Advisor: Sérgio Soares

System Solution

Software + Hardware + Documentation (run, transform...)

Software

Developed ≠ Produced

Human creativity, not manufacturing

Software Engineering

Software runs in the real world

Empirical studies help us better evaluate, predict, understand, control, and improve

What do we need?

• Guidelines
• Experience reports
• Replications
• Comparisons
• Systematic evaluation

Testbeds

Systematic evaluation of technologies

Holibaugh et al. (1988) “Phase I Testbed Description: Requirements and Selection Guidelines”

Research Problem

How can testbeds help researchers evaluate technologies in SE? What should be considered in order to build a new testbed in SE?

Objectives

Systematic mapping study
Facilitate the creation of testbeds in SE
Foster empirical studies

BACKGROUND
Experimentation in SE, Testbeds in SE, and Evidence-Based Software Engineering

Empirical Studies in SE

How?

Near-real scenarios

Risky!

Empirical Studies

Systematic
Disciplined
Quantifiable
Controlled

Difficulties

Definition and encapsulation
Control of variables
Human involvement
High cost

One Experimental Framework

1. Definition: motivation, object, purpose, perspective, domain, scope;
2. Planning: design, criteria, measurement;
3. Operation: preparation, execution, analysis;
4. Interpretation: context, extrapolation, impact;
5. Packaging.

Basili et al. (1986) “Experimentation in software engineering”

Why Testbeds in SE?

Technology maturation may take 15+ years
Systematic, replicable studies that can serve as examples
Lower risks
Easier knowledge transfer

SE Testbeds Usually Have

• Assets repository
• Guidelines
• Metrics suite
• Knowledge base
• Applications

SE Testbed Example

• Mikael Lindvall et al. (2007) “Experimenting with software testbeds for evaluating new technologies”
  – TSAFE tool
  – Software architecture evaluation
  – Seeded defects
  – Requirements specs., architecture docs., source code, installation guide, flight data

Evidence-Based Software Engineering (EBSE)

Inspired by the medical sciences
Stand on the shoulders of giants
SLRs (systematic literature reviews) and SMSs (systematic mapping studies)

EBSE Goal

“To provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision making process regarding the development and maintenance of software.”

Kitchenham et al. (2004) “Evidence-based software engineering”

EBSE Advantages

Reduced bias
Conclusions across a wider range of settings
Quantitative studies can detect real effects

SMSs ≠ SLRs

• SMSs have broader questions
• Search terms are less focused
• Data extraction is broader
• Summaries rather than deep analyses
• More limited than SLRs

METHODOLOGY
Classification, Research Steps and Protocol

Classification According to Cooper

Harris M. Cooper (1988) “Organizing Knowledge Syntheses: A Taxonomy of Literature Reviews”

Research Steps

1. Research problem
2. Identify the need for an SMS
3. Define research questions
4. Develop protocol
5. Review protocol
6. Primary studies selection
7. Data extraction
8. Report writing
9. Dissemination

Selection/Evaluation Process

Each study in the not-yet-evaluated set is assessed independently by two researchers (T1 ... Tn). When the two results agree, the study goes directly into the evaluated set; disagreements are recorded in a disagreement table and settled in a conflict resolution meeting.
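A minimal Python sketch of the agreement/disagreement split described above (assumed for illustration, not tooling from the study; the function name and data layout are hypothetical):

```python
# Partition dual-reviewer verdicts into agreements and a disagreement
# table for the conflict resolution meeting.
def partition_reviews(verdicts):
    """verdicts: dict mapping study ID -> (vote_t1, vote_t2),
    where each vote is 'include' or 'exclude'."""
    evaluated = {}            # studies whose two verdicts match
    disagreement_table = []   # studies for the conflict resolution meeting
    for study_id, (v1, v2) in verdicts.items():
        if v1 == v2:
            evaluated[study_id] = v1
        else:
            disagreement_table.append(study_id)
    return evaluated, disagreement_table

evaluated, to_discuss = partition_reviews({
    "S01": ("include", "include"),
    "S02": ("include", "exclude"),  # goes to the disagreement table
})
print(evaluated)   # {'S01': 'include'}
print(to_discuss)  # ['S02']
```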

Scope

• Population: published literature that reports the definition of software engineering testbeds;
• Intervention: empirical studies that define an SE testbed;
• Comparison: N/A;
• Outcomes: quantity and type of evidence regarding the empirical evaluation of software engineering technologies;
• Context: empirical studies in SE are run mainly in academic environments, which characterizes the current mapping study as being of academic context.

Research Questions

• RQ1: Which specific software engineering fields are investigated using testbeds?
• RQ2: Which elements do the studies define for a testbed in software engineering?
• RQ3: What are the benefits of having a software engineering testbed?
• RQ4: Which methods are used to evaluate the proposed testbeds?

Automated Search

• IEEE Computer Society Digital Library
• ACM Digital Library
• Scopus
• Citeseer
• Springer Link

Manual Search (past 12 months)

• International Conference on Software Engineering (ICSE), 2010
• Empirical Software Engineering and Measurement (ESEM), 2009
• Evaluation and Assessment in Software Engineering (EASE), 2010
• Empirical Software Engineering Journal (June 2009 to June 2010)

Search String

“software engineering” AND (testbed OR “family of experiments” OR “technology evaluation” OR “technology comparison” OR “technology transition” OR “software adoption” OR “software maturity” OR “replicability” OR “replicate” OR “replication” OR “experiment protocol” OR “study protocol” OR “experiment process” OR “experiment guidelines” OR “study process” OR comparability OR comparable OR compare)
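Since each digital library accepts slightly different query syntax, it can help to generate the string programmatically before adapting it per engine. An illustrative sketch only (not tooling from the study; the function name is hypothetical), with the terms copied verbatim from the protocol:

```python
# Compose the protocol's boolean search string.
QUOTED_TERMS = [
    "family of experiments", "technology evaluation",
    "technology comparison", "technology transition",
    "software adoption", "software maturity", "replicability",
    "replicate", "replication", "experiment protocol",
    "study protocol", "experiment process", "experiment guidelines",
    "study process",
]
UNQUOTED_TERMS = ["comparability", "comparable", "compare"]

def build_query() -> str:
    # "testbed" leads the OR group, then the quoted phrases,
    # then the unquoted terms, matching the protocol's ordering.
    terms = ["testbed"] + [f'"{t}"' for t in QUOTED_TERMS] + UNQUOTED_TERMS
    return f'"software engineering" AND ({" OR ".join(terms)})'

print(build_query())
```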

Selection Criteria – Phase 1

Title, keywords and abstracts
2 researchers for each paper
Remove out-of-scope papers

Selection Criteria – Phase 2

Full paper
Real
Not a duplicate
Defines a testbed
Related to SE

Data Extraction (Primary Studies)

Form A:
• ID
• Source
• Year
• Title
• Author
• Institution
• Country

Form C:
• Evaluation Date
• Researcher
• Research Question 1
• Research Question 2
• Research Question 3
• Research Question 4
• Notes

Data Extraction (Excluded Studies)

Form B:
• ID
• Source
• Year
• Title
• Author
• Criteria used for exclusion
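As a reading aid, the three forms can be pictured as record types. This is an assumed rendering (the field types, and all names beyond the form labels, are hypothetical), not an artifact of the study:

```python
from dataclasses import dataclass

@dataclass
class FormA:  # bibliographic data for a primary study
    id: str
    source: str
    year: int
    title: str
    author: str
    institution: str
    country: str

@dataclass
class FormB:  # record of an excluded study
    id: str
    source: str
    year: int
    title: str
    author: str
    exclusion_criteria: str

@dataclass
class FormC:  # per-study answers extracted for RQ1-RQ4
    evaluation_date: str
    researcher: str
    rq1: str
    rq2: str
    rq3: str
    rq4: str
    notes: str = ""
```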

Quality AssessmentQuality Assessment

• 1. Does the paper define the concept of testbed?• 2. Has the paper evaluated the testbed it

proposes?• 3. Does the testbed evaluation allows

replication?• 4. Does the study define a process for using the

testbed?• 5. Does the study point the benefits of having

and/or using a testbed?

• 1. Does the paper define the concept of testbed?• 2. Has the paper evaluated the testbed it

proposes?• 3. Does the testbed evaluation allows

replication?• 4. Does the study define a process for using the

testbed?• 5. Does the study point the benefits of having

and/or using a testbed?

36

Grading Quality

• Yes: question fully answered = 1.0 point
• Partly: question not fully answered = 0.5 point
• No: question not answered = 0.0 point

Score bands: [0.0; 1.0], [1.5; 2.5], [3.0; 4.0], [4.5; 5.0]
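A minimal sketch of the scoring arithmetic (the helper itself is assumed for illustration; only the point values come from the protocol):

```python
# Total quality score over the five assessment questions,
# using Yes = 1.0, Partly = 0.5, No = 0.0.
POINTS = {"yes": 1.0, "partly": 0.5, "no": 0.0}

def quality_score(answers):
    """answers: one of 'yes'/'partly'/'no' per quality question."""
    assert len(answers) == 5, "one answer per quality question"
    return sum(POINTS[a.lower()] for a in answers)

# Example: three full answers, one partial, one missing -> 3.5 points,
# which falls in the [3.0; 4.0] band.
print(quality_score(["yes", "yes", "yes", "partly", "no"]))  # 3.5
```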

EXECUTION AND RESULTS
General Data, Answers to Research Questions, Analysis and Discussion

Accounting

Primary Studies Selection

Source     Returned   Potentially   Excluded in #2 Selection                   Primary   Percentage
           Studies    Relevant      Irrelevant   Repl./Duplicate   Incomplete  Studies
IEEE       2141       22            16           1                 0           5         38.46%
ACM        100        13            8            4                 0           1         7.69%
Citeseer   393        17            12           2                 0           3         23.08%
Scopus     1507       38            28           10                0           0         0.00%
Springer   97         15            11           0                 0           4         30.77%
ESEJ       1          1             0            1                 0           0         0.00%
TOTAL      4239       106           75           18                0           13

(“Potentially Relevant” is the outcome of the #1 selection; “Percentage” is each source’s share of the 13 primary studies.)
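The Percentage column can be recomputed as a sanity check. This is an assumed recomputation, not the authors' script:

```python
# Each source's share of the 13 primary studies.
primary_by_source = {"IEEE": 5, "ACM": 1, "Citeseer": 3,
                     "Scopus": 0, "Springer": 4, "ESEJ": 0}
total = sum(primary_by_source.values())  # 13

for source, count in primary_by_source.items():
    print(f"{source}: {100 * count / total:.2f}%")
# IEEE: 38.46%, ACM: 7.69%, Citeseer: 23.08%,
# Scopus: 0.00%, Springer: 30.77%, ESEJ: 0.00%
```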

Results charts: Accounting; Authors; Studies Distribution; Quality Assessment

Research Question 1

Which specific software engineering fields are investigated using testbeds?


Research Question 2

Which elements do the studies define for a testbed in software engineering?

Research Question 2

A new classification: Step-Based vs. Block-Based testbeds

Step-Based Testbeds:
• Clear, ordered steps
• Easier to understand
• Guidelines

Block-Based Testbeds:
• Blocks of activities
• May present relations between blocks
• Harder to understand

Example of Step-Based Testbed

• Mikael Lindvall et al. (2005) “An evolutionary testbed for software technology evaluation”:

1. Select application in relevant domain;
2. Select family of technologies;
3. Select one technology within the family;
4. Prepare artifacts necessary for experimentation;
5. Conduct study to create a testbed baseline;
6. Define faults to be seeded;
7. Design and conduct experiment;
8. Analyze and document results from experiment;
9. Improve testbed based on analysis;
10. Improve technology based on analysis;
11. Verify testbed usefulness.

Example of Block-Based Testbed

• Marcus Ciolkowski et al. (2002) “A family of experiments to investigate the influence of context on the effect of inspection techniques”

Research Question 3

What are the benefits of having a software engineering testbed?

Research Question 3

20 benefits cited
Most papers claim 2+ benefits

Most Cited Benefits (cited by 2+ papers)

Research Question 4

Which methods are used to evaluate the proposed testbeds?


Analysis and Discussion

61.54% of papers were rated Good or better
Testbeds are versatile (16 research topics)

Analysis and Discussion

Controlled experiments dominate

Analysis and Discussion

Testbeds are very different from each other
Only half (10 of 20) of the benefits were cited by 2+ papers

FINAL CONSIDERATIONS
Threats to Validity, Future Work and Conclusions

Threats to Validity

Search strategy
Small number of papers
Quality assessment
Data extraction

Future Work

Aspect-oriented (AO) software maintenance testbed
Integration with the BF*
Extension to corporate environments

* Marcelo Moura (2008) “Um benchmarking framework para avaliação da manutenibilidade de software orientado a aspectos” (a benchmarking framework for evaluating the maintainability of aspect-oriented software)

Future Work (improve the mapping)

Backward search
More thorough quality assessment

Conclusion

4239 papers analyzed
13 primary studies
Protocol reported
Scopus performed well

Conclusion

Similar benefits across studies
Testbeds are not well explored in SE
A testbed classification (step-based vs. block-based)

Acknowledgements

Friends, Family, CIn, CNPq

THANK YOU!