382C Empirical Studies in Software Engineering Lecture...
Transcript of 382C Empirical Studies in Software Engineering Lecture...
1
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Introduction
Dewayne E PerryENS 623
2
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Course GoalsCreate/evaluate empirical studies/data
Understand underlying scientific and theoretical basesUnderstand fundamentals of experimental design
Independent and dependent variablesVariable manipulation/data gatheringVarious issues of validityEmpirical logic and reasoningAnalytic tools for reasoning about the experiment
Understand and control confounding variablesRandom and systematic biasesAlternative explanations
Understand and apply appropriate data analysis techniquesUnderstand and create the underlying logics of empirical studies in reasoning out conclusions
3
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Experimenter/Evaluator GoalsResponsible skepticism
Look forFailures in experimental designsFailures of observationsGaps in reasoningAlternative explanations
Compare new evidence against oldRaise counter objections/hypothesesQuestion grounds for doubt as wellAccumulate weight of evidence
4
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Good Research PracticesEnthusiasm – an enjoyable endeavorOpen-mindedness – keen, attentive, inquisitiveCommon sense – avoid looking under the lamppostInventiveness – creative
Not only in experimental workIn resource management as well
Confidence in ones own judgmentDespite detractors when rightKnow when you are wrong
Consistency and care about detailNo substitute for accuracy - keep complete records, organize and analyses accurately/carefully
Ability to communicateWriting is a superb, essential research techniqueMake your discoveries known to others
Honesty – integrity and scholarship
5
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Review of SWEFactors in software engineered products
Theory (basis for product)CS CoreDomain specific theory
Experience (basis for judgment)FeedbackEngineering experiments (prototypes)Empirical studies
ObservationsCorrelationsCausal connections
Process (basis for production)Methods and techniquesTechnologyOrganizational structures
TeamsProjectsCultures
6
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Review of SWESW development processes
PhasesRequirementsArchitectureDesignConstructionDeployment and maintenance
Integral to all phasesDocumentationMeasurement & AnalysisEvolutionTeamworkManagement of artifacts
7
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Daily Life of SWE: DecisionsDetermine what users want/needMake architecture, design and implementation decisionsEvaluate/compare architecture and design choicesEvaluate functional characteristicsEvaluate non-functional propertiesEvaluate/compare technologies
For supporting toolsFor product supportFor process support
Determine what went wrongDetermine good resource allocationsetc
8
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Some Illustrative ExamplesMost frequently stolen carLanguage/Solution comparisonsInspection reading techniquesExit Interviews
9
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Most Frequently Stolen CarHonda Accord
Study presented in mediaHonda most frequently stolen automobile
What does this study tell us?Actually very little
What can we infer from this?Shouldn’t buy a Honda?Buy a Mercedez instead?
Very misleadingMore recently
Frequency relative to the total number of carsClaims per 1000
Different storyLincoln Navigator – 12.2; Cadillac Escalade – 10.3; etc
All Escalades recovered – GPS systemHonda and Camry didn’t make it anywhere near the top 10
Percentage of thefts claims is low compared to the number on road
10
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Language/Solution ComparisonFrom a summer projectComparing Java/C2 different computers used
Pentium III 600MHz 128MC 1.4 times faster than Java
AMD 1GHz 256MJava 1.09 times faster than C
Conflicting evidenceHow do we account for it?What are the differences?How do we resolve these differences?What can we conclude, if anything?
11
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Preview of Confounding Variables
12
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Inspection Reading TechniquesExperiment: evaluate reading techniques for object oriented code inspectionsInconclusive resultsPossibilities:
There are no differencesPoor experimental designInsufficient dataPoor analyses and reasoning
13
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
SurveysExit Polls
Used to predict outcome2004 election
Significant disagreement with final voteEarly calls retracted
Pollster’s responseResults with in margin of errorData selectively reported
Possible biasesQuestions asked
slanted, non-uniform (ie general vs specific), etcTime of polls
7am - men on the way to work10am – soccer mom’s after kids are in school
Place of pollsEast Austin versus Westlake
Nature of volunteers
14
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Current State of SE Empirical WorkImplementation oriented
Fenton: poor statistical designs, don’t scaleBasili: differences in projects make comparisons difficultJohnson: practitioners resist measurement
Need to be more requirements orientedThink hard about what experiments really areHow they can be most effectively used
Core problem: conceptualizing and organizing a body of work as a scientific basis
15
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Software DevelopmentLittle hard evidence to inform decisions
Correlations suggestive but not sufficient in all casesMany times don’t know exception cases
Do not know fundamental mechanismsSoftware toolsMethods and techniquesProcesses
Empirical studies are the keyShow mechanismsEliminate alternative explanationsEmpirical validation is standard in some fieldsQuality of empirical studies in SE is risingFunding agencies recognizing value of empirical studiesIncreasing number of tutorials, panels, SOTAs, papers, etcKey consciousness raising papers (Tichy etal, Zelkowitz)Several key organizations: SEL, ISERN
16
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Systematic ProblemsResearch ideas are not empirically validated
Should retroactively validateShould proactively directed
Search for perfect study Instead of focus on credibility
Study the obviousOK, but we need deeper insights
Lots of dataNot enough – should answer important questions
Lack useful hypothesesLack conclusions from data
17
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Empirical Software Engineering Studies
Individual programmer studies have credibility due to well understood techniques from psychology and statistics.Large software development studies with the addition of large population social factors are not well established or credible.General goal:
Establish a spectrum of empirical techniques that are robust to large variances from social factors present.
18
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
ChallengesCreate better empirical studies
Establish principles that areCausal: correlated, temporally ordered, testable theoryActionable: causal agent effectively controllableGeneral: widely applicable
Answer important questionsFamily of focused studies – illuminate related aspectsCost effective and reproducible
Credible interpretationsDegree of confidence we have in conclusions
Eliminate alternative explanationsProvide a compelling logic in the discussion
Validity is critical: construct, internal, externalHypothesis is critical: ask important questionsResolutions appropriate to the intent of the studyMake the data public
19
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Some Concrete StepsDesigning studies
Ask significant questionsKnight-Leveson, N-version programming
Families of studiesSchneiderman et al, on the value of flowcharts
Build partnershipsTakes time; multi-person effort; interdisciplinary, industry
Long running in vivo/situ experimentsSubparts; subject rights; know when to stop
Collecting dataRetrospective artifact analysis
Eg, version management systemsSimulation and modeling
Eg, integration studies of Solheim and RowlandInvolving others
Meta-analysis – eg Porter and JohnsonEducational laboratories
Teach empirical studies basicsPopulate lab with appropriate data/designs/equipment
20
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Goals for SWE Empirical StudiesSome help
Look to other empirical disciplinesAdapt what is useful here
GoalsPerform better empirical studiesFocus on causal mechanismsGenerate theoriesIterate and improve
Good empirical studies enable us toEncode knowledge more rapidlyPrune low payoff ideas rapidlyRecognize and value high payoff ideasExploit important practical ideas
21
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
What is CriticalFundamentals
Credible interpretationsRepeatabilityUnderstanding validity limitsIdentifying underlying mechanismPractical significance
Non-fundamentalsWhether qualitative or quantitative
Both have their place and usefulnessIdentical results
Want congruent resultsCorrelation studies
Important precursor, but not the goalOpportunistic studies
22
382C Empirical Studies in Software Engineering
© 2000-present, Dewayne E Perry
Lecture 1
Structure of an Empirical StudyResearch context
Problem definitionResearch review
HypothesisAbstract – about the worldConcrete – about the design
Experimental designVariables – independent and dependentPlan to systematically manipulate variablesControl operational context
Threats to validity: construct, internal, externalData analysis and presentation
Quantitative: hypothesis testing, power analysisQualitative
Results and conclusionsLimits, influencesExplain how answered question and its practical significanceSufficient information for repeatability