A Systematic Mapping Study on Software Engineering Testbeds
Emanoel Barreiros
Advisor: Sérgio Soares
System Solution
Software +
Hardware +
Documentation (run, transform...)
Software
Developed ≠ Produced
Manufacturing
Human Creativity
Software Engineering
Software runs in the real world
Empirical Studies
better evaluate
predict
understand
control
improve
What do we need?
Guidelines
Experience reports
Replications
Comparisons
Systematic evaluation
Testbeds
Systematic evaluation of technologies
Holibaugh et al. (1988) “Phase I Testbed Description: Requirements and Selection Guidelines”
Research Problem
How can testbeds help researchers in evaluating technologies in SE? What should be considered in order to build a new testbed in SE?
Objectives
Systematic mapping study
Facilitate creation of testbeds in SE
Foster empirical studies
Empirical Studies
Systematic
Disciplined
Quantifiable
Controlled
Difficulties
Definition and encapsulation
Control of variables
Human involvement
High cost
One Experimental Framework
1. Definition: motivation, object, purpose, perspective, domain, scope;
2. Planning: design, criteria, measurement;
3. Operation: preparation, execution, analysis;
4. Interpretation: context, extrapolation, impact;
5. Packaging
Basili et al. (1986) “Experimentation in software engineering”
Why Testbeds in SE?
Maturation may take 15+ years
Systematic studies Replicable, example
Lower risks
Easier knowledge transfer
SE Testbeds Usually Have
• Assets repository
• Guidelines
• Metrics suite
• Knowledge base
• Applications
SE Testbed Example
• Mikael Lindvall et al. (2007) “Experimenting with software testbeds for evaluating new technologies”
– TSAFE tool
– Software Architecture Evaluation
– Seeded defects
– Requirements specs., architecture docs., source code, installation guide, flight data
Evidence-Based Software Engineering (EBSE)
Medical sciences
Stand on the shoulders of giants
SLRs and SMSs
EBSE Goal
“To provide the means by which current best evidence from research can be integrated with practical experience and human values in the decision making process regarding the development and maintenance of software.”
Kitchenham et al. (2004). “Evidence-based software engineering”
EBSE Advantages
Reduced bias
Wide range conclusions
Quantitative studies can detect real effects
SMSs ≠ SLRs
• SMSs have broader questions
• Search terms are less focused
• Data extraction is broader
• Summaries rather than deep analyses
• More limited than SLRs
Classification According to Cooper
Harris M. Cooper (1988). “Organizing Knowledge Syntheses: A Taxonomy of Literature Reviews”
Research Steps
Research problem
Identify need for an SMS
Define research questions
Develop protocol
Review protocol
Primary studies selection
Data extraction
Report writing
Dissemination
Selection/Evaluation Process
[Figure: Evaluation process. Set of Studies (not evaluated); T1 ... Tn; T1 - results ... Tn - results; Agreement/Disagreement Table; Conflict Resolution meeting; Set of Studies (evaluated)]
Scope
• Population: Published literature that reports the definition of software engineering testbeds;
• Intervention: Empirical studies that define an SE testbed;
• Comparison: N/A;
• Outcomes: Quantity and type of evidence regarding the empirical evaluation of software engineering technologies;
• Context: Empirical studies on SE are run mainly in academic environments, which characterizes the current mapping study as being of academic context.
Research Questions
• RQ1: Which specific software engineering fields are investigated using testbeds?
• RQ2: Which elements do the studies define for a testbed in software engineering?
• RQ3: Which are the benefits of having a software engineering testbed?
• RQ4: Which methods are used to evaluate the proposed testbed?
Automated Search
• IEEE Computer Society Digital Library
• ACM Digital Library
• Scopus
• Citeseer
• Springer Link
Manual Search (past 12 months)
• International Conference on Software Engineering (ICSE), 2010
• Empirical Software Engineering and Measurement (ESEM), 2009
• Evaluation and Assessment in Software Engineering (EASE), 2010
• Empirical Software Engineering Journal (June 2009 to June 2010)
Search String
“software engineering” AND (testbed OR “family of experiments” OR “technology evaluation” OR “technology comparison” OR “technology transition” OR “software adoption” OR “software maturity” OR “replicability” OR “replicate” OR “replication” OR “experiment protocol” OR “study protocol” OR “experiment process” OR “experiment guidelines” OR “study process” OR comparability OR comparable OR compare)
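The search string has a simple shape: one mandatory phrase ANDed with a group of ORed alternatives. A minimal Python sketch of assembling it from a term list is shown here. The names `MANDATORY`, `ALTERNATIVES`, and `build_search_string` are illustrative assumptions, straight quotes replace the slide's typographic quotes, and each digital library would likely need its own syntax adaptation.

```python
# Mandatory phrase, quoted so engines treat it as an exact phrase.
MANDATORY = '"software engineering"'

# Alternatives as listed on the slide; quoting mirrors the slide.
ALTERNATIVES = [
    'testbed',
    '"family of experiments"',
    '"technology evaluation"',
    '"technology comparison"',
    '"technology transition"',
    '"software adoption"',
    '"software maturity"',
    '"replicability"',
    '"replicate"',
    '"replication"',
    '"experiment protocol"',
    '"study protocol"',
    '"experiment process"',
    '"experiment guidelines"',
    '"study process"',
    'comparability',
    'comparable',
    'compare',
]


def build_search_string(mandatory, alternatives):
    """AND the mandatory phrase with the OR-group of alternatives."""
    return f"{mandatory} AND ({' OR '.join(alternatives)})"


query = build_search_string(MANDATORY, ALTERNATIVES)
print(query)
```

Keeping the terms in a list makes it easier to report the exact string in the protocol and to adjust it per search engine.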
Selection Criteria – Phase 1
Title, keywords and abstracts
2 researchers for each paper
Remove out of scope papers
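Because two researchers screen each paper, their decisions have to be compared before a paper is removed, and disagreements go to a conflict resolution meeting. The sketch below shows one way such an agreement/disagreement table could be built in Python. The function name, the boolean include/exclude encoding, and the study IDs are hypothetical; the slides do not say how decisions were actually recorded.

```python
from typing import Dict, List, Tuple


def compare_decisions(
    first: Dict[str, bool], second: Dict[str, bool]
) -> Tuple[List[str], List[str], List[str]]:
    """Split study IDs into (both include, both exclude, conflicts).

    True means "keep the paper", False means "out of scope".
    Only studies judged by both researchers are compared.
    """
    included, excluded, conflicts = [], [], []
    for study_id in sorted(first.keys() & second.keys()):
        a, b = first[study_id], second[study_id]
        if a and b:
            included.append(study_id)
        elif not a and not b:
            excluded.append(study_id)
        else:
            conflicts.append(study_id)  # goes to the resolution meeting
    return included, excluded, conflicts


# Hypothetical decisions for three studies.
researcher_1 = {"S1": True, "S2": False, "S3": True}
researcher_2 = {"S1": True, "S2": False, "S3": False}

included, excluded, conflicts = compare_decisions(researcher_1, researcher_2)
print("included:", included)
print("excluded:", excluded)
print("conflicts:", conflicts)
```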
Selection Criteria – Phase 2
Full paper
Real
Not a duplicate
Defines a testbed
Related to SE
Data Extraction (Primary Studies)
Form A
• ID
• Source
• Year
• Title
• Author
• Institution
• Country
Form C
• Evaluation Date
• Researcher
• Research Question 1
• Research Question 2
• Research Question 3
• Research Question 4
• Notes
Data Extraction (Excluded Studies)
• Form B
– ID
– Source
– Year
– Title
– Author
– Criteria used for exclusion
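The three extraction forms map naturally onto simple record types. The following Python sketch models Forms A, B, and C as dataclasses. The field names follow the slides, but the types, the class names, and the sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class FormA:
    """Identification data for a primary study."""
    id: str
    source: str
    year: int
    title: str
    author: str
    institution: str
    country: str


@dataclass
class FormB:
    """Record kept for an excluded study."""
    id: str
    source: str
    year: int
    title: str
    author: str
    exclusion_criteria: str


@dataclass
class FormC:
    """One researcher's answers to the four research questions."""
    evaluation_date: str
    researcher: str
    rq1: str
    rq2: str
    rq3: str
    rq4: str
    notes: str = ""


# Hypothetical placeholder values, not data from the study.
excluded = FormB(
    id="X1",
    source="example source",
    year=2000,
    title="example title",
    author="example author",
    exclusion_criteria="Does not define a testbed",
)
print(excluded)
```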
Quality Assessment
• 1. Does the paper define the concept of testbed?
• 2. Has the paper evaluated the testbed it proposes?
• 3. Does the testbed evaluation allow replication?
• 4. Does the study define a process for using the testbed?
• 5. Does the study point out the benefits of having and/or using a testbed?
Grading Quality
• Yes: question fully answered = 1.0 point
• Partly: question is not fully answered = 0.5 point
• No: question is not answered = 0.0 point
[0.0; 1.0]
[1.5; 2.5]
[3.0; 4.0]
[4.5; 5.0]
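The grading scheme can be sketched as a small scoring function: each of the five quality questions contributes 1.0, 0.5, or 0.0 points, and the total falls into one of the four intervals listed. In this Python sketch the function names are hypothetical, and the intervals are numbered 1 to 4 because their labels are not recoverable from the slide.

```python
# Points per answer, as defined on the slide.
POINTS = {"yes": 1.0, "partly": 0.5, "no": 0.0}

# Score intervals from the slide, lowest to highest (labels unknown).
BANDS = [(0.0, 1.0), (1.5, 2.5), (3.0, 4.0), (4.5, 5.0)]


def quality_score(answers):
    """Sum the points for the five quality questions."""
    if len(answers) != 5:
        raise ValueError("expected one answer per quality question (5)")
    return sum(POINTS[answer] for answer in answers)


def quality_band(score):
    """Return the 1-based index of the interval containing the score."""
    for index, (low, high) in enumerate(BANDS, start=1):
        if low <= score <= high:
            return index
    raise ValueError(f"score {score} is outside every interval")


# Hypothetical assessment of one paper.
answers = ["yes", "yes", "partly", "no", "yes"]
score = quality_score(answers)
print(score, quality_band(score))
```

Since every score is a multiple of 0.5 between 0.0 and 5.0, the four intervals cover all possible totals without gaps.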
EXECUTION AND RESULTS
General Data, Answers to Research Questions, Analysis and Discussion
Accounting
Primary Studies Selection
Sources | # Returned Studies | #1 Selection: Potentially Relevant Studies | #2 Selection, Excluded: Irrelevant | #2 Selection, Excluded: Replicated/Duplicate | #2 Selection, Excluded: Incomplete | #2 Selection, Included: Primary Studies | Percentage
IEEE | 2141 | 22 | 16 | 1 | 0 | 5 | 38.46%
ACM | 100 | 13 | 8 | 4 | 0 | 1 | 7.69%
Citeseer | 393 | 17 | 12 | 2 | 0 | 3 | 23.08%
Scopus | 1507 | 38 | 28 | 10 | 0 | 0 | 0.00%
Springer | 97 | 15 | 11 | 0 | 0 | 4 | 30.77%
ESEJ | 1 | 1 | 0 | 1 | 0 | 0 | 0.00%
TOTAL | 4239 | 106 | 75 | 18 | 0 | 13 |
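The percentage column appears to be each source's share of the 13 primary studies. The short Python sketch below recomputes those shares and the totals from the per-source counts in the table; the dictionary and variable names are illustrative.

```python
# Per-source counts taken from the accounting table.
RETURNED = {"IEEE": 2141, "ACM": 100, "Citeseer": 393,
            "Scopus": 1507, "Springer": 97, "ESEJ": 1}
PRIMARY = {"IEEE": 5, "ACM": 1, "Citeseer": 3,
           "Scopus": 0, "Springer": 4, "ESEJ": 0}

total_returned = sum(RETURNED.values())
total_primary = sum(PRIMARY.values())

# Share of all primary studies contributed by each source, in percent.
shares = {
    source: round(100 * count / total_primary, 2)
    for source, count in PRIMARY.items()
}

print(total_returned, total_primary)
for source, share in shares.items():
    print(f"{source}: {share:.2f}%")
```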
Research Question 1
Which specific software engineering fields are investigated using testbeds?
Research Question 2
Which elements do the studies define for a testbed in software engineering?
Research Question 2
Step-Based Testbeds
• Clear ordered steps
• Easier to understand
• Guidelines
Block-Based Testbeds
• Blocks of activities
• May present block relations
• Harder to understand
Step-Based Testbeds vs Block-Based Testbeds (New!)
Example of Step-Based Testbed
• Mikael Lindvall et al. (2005) “An evolutionary testbed for software technology evaluation”:
1. Select application in relevant domain;
2. Select family of technologies;
3. Select one technology within the family;
4. Prepare artifacts necessary for experimentation;
5. Conduct study to create a testbed baseline;
6. Define faults to be seeded;
7. Design and conduct experiment;
8. Analyze and document results from experiment;
9. Improve testbed based on analysis;
10. Improve technology based on analysis;
11. Verify testbed usefulness;
Example of Block-Based Testbed
• Marcus Ciolkowski et al. (2002) “A family of experiments to investigate the influence of context on the effect of inspection techniques”
Research Question 3
Which are the benefits of having a software engineering testbed?
Research Question 3
20 benefits cited
Most papers claim 2+ benefits
Research Question 4
Which methods are used to evaluate the proposed testbed?
Analysis and Discussion
61.54% of papers ≥ Good
Testbeds are versatile (16 research topics)
Analysis and Discussion
Controlled Experiment dominates
Analysis and Discussion
Testbeds are very different from each other
Half (10 of 20) of the benefits are cited by 2+ papers
Threats to Validity
Search strategy
Small number of papers
Quality assessment
Data extraction
Future Work
AO software maintenance testbed
Integration with the BF*
Extension to corporate environment
* Marcelo Moura (2008) “Um benchmarking framework para avaliação da manutenibilidade de software orientado a aspectos”
Future Work (improve the mapping)
Backward search
More thorough quality assessment
Conclusion
4239 papers analyzed
13 primary studies
Protocol reported
Scopus performed well
Conclusion
Similar benefits
Testbeds are not well explored in SE
Testbed classification