1
Experimentation in Computer Science and Software Engineering
Kavi Khedo
Senior Lecturer
Department of Computer Science and Engineering
Faculty of Engineering
University of Mauritius
[email protected]
khedo.wordpress.com
2
References
Tichy, W.F., “Should Computer Scientists Experiment More?”, IEEE Computer, May 1998.
Zelkowitz, M.V., and Wallace, D.R., “Experimental Models for Validating Technology”, IEEE Computer, May 1998.
3
Outline
Nature of computing
Why experiment?
Methods of experimentation
Issues and possible approaches
Looking ahead
Conclusion
4
Nature of Computing
Science or engineering?
• Computers and programs are human creations.
• CS is not a natural science in the traditional sense.
Computers and software
• The subject of enquiry is not just technical issues but also models of information and information processes.
5
Computer Science
“A science is any discipline in which the fool of this generation can go beyond the point reached by the genius of the last generation.”
Max Gluckman
Computer science is a young and constantly evolving discipline. It is therefore viewed in different ways by different people, leading to different perceptions of whether it is a “science” at all.
6
Modeling information processes
Are information processes artificial?
• Where and how do they occur?
Computer models compare poorly with information processes found in nature.
• e.g., nervous systems, immune systems, genetic processes, brains of programmers and users, etc.
7
Why experiment?
Experiments don’t prove a thing!
View of mathematicians
• No amount of experimentation provides proof with absolute certainty
• Show presence of errors but not their absence
• A theory can be shot down by contrary evidence
Test theoretical predictions against reality
• A theory gets accepted if all known facts in its domain can be deduced from it and are verified by experiments
• e.g., astrophysics
8
Why experiment?
Example of a failed theory:
• Failure probability of multi-version programs is the product of the failure probabilities of individual versions.
Experiments by Knight and Leveson showed significantly higher failure rates than predicted.
False assumption detected by experiment: faults in program versions are statistically independent.
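A toy calculation may help show why the independence assumption matters. The sketch below is not the Knight and Leveson experiment; it assumes a hypothetical model in which a fraction `h` of inputs is "hard" and every version fails more often on those inputs. All numbers and function names are illustrative assumptions.

```python
def marginal_failure(h, p_hard, p_easy):
    """Failure probability of a single version, averaged over all inputs."""
    return h * p_hard + (1 - h) * p_easy


def joint_failure(h, p_hard, p_easy, versions=2):
    """Probability that all versions fail on the same input.

    Versions fail independently only once the input difficulty is fixed,
    so the shared difficulty makes their failures correlated overall.
    """
    return h * p_hard ** versions + (1 - h) * p_easy ** versions


# Hypothetical numbers: 10% hard inputs, 50% failure on hard, 1% on easy.
h, p_hard, p_easy = 0.1, 0.5, 0.01

p_single = marginal_failure(h, p_hard, p_easy)
predicted = p_single ** 2          # what the independence theory predicts
actual = joint_failure(h, p_hard, p_easy)

print(f"single version failure: {p_single:.4f}")
print(f"predicted joint failure: {predicted:.6f}")
print(f"joint failure in model:  {actual:.6f}")
```

With these assumed numbers the joint failure probability is roughly seven times the value predicted by multiplying the individual failure probabilities, which is the kind of gap only a measurement would reveal.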
9
Why experiment?
Another example:
• Artificial neural networks were originally discarded on theoretical grounds.
• Experiments showed properties better than predicted.
• Researchers have since developed better theories to explain what is observed.
10
Benefits of experimentation
Help build a reliable base of knowledge.
• Reduce uncertainty about the adequacy of theories, methods and tools.
Lead to new, useful and unexpected insights.
• Open new areas of investigation.
Accelerate progress by eliminating fruitless approaches, erroneous assumptions and fads.
11
How to experiment
General categories of experiments:
• Scientific method.
• Engineering method.
• Empirical method.
12
Scientific method
Develop a theory to explain a phenomenon.
Propose a hypothesis and test alternative variations of it.
Collect data to verify or refute claims of the hypothesis.
13
Engineering method
Develop and test a solution to a hypothesis.
Based on results of the test, improve the solution.
Iterate until no further improvement is needed.
14
Empirical method
Statistical method proposed as a means to validate a hypothesis.
There may not be a formal model or theory describing the hypothesis.
Data collected to verify the hypothesis.
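One way to picture the empirical method is a permutation test, which needs no formal model of the phenomenon. The sketch below is only an illustration: the measurements (task completion times with two tools) are invented, and the function name `permutation_test` is an assumption, not something from the slides.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate a two-sided p-value for a difference in group means.

    The group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Hypothetical task completion times in minutes (invented data).
tool_a = [31, 28, 35, 30, 33, 29]
tool_b = [25, 27, 24, 29, 26, 23]

p_value = permutation_test(tool_a, tool_b)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value would count as evidence against the hypothesis that the tool makes no difference; it would not prove the opposite.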
15
Figure: A comparison of the scientific method (left) with the role of experimentation in system design (right).
16
Other important aspects
Replication
• Other researchers must be able to reproduce the experiments.
Influence
• Impact of experimental design on the result.
Temporal properties
• Historical or current data?
• Is any required information missing?
17
Lack of validation in CS and SE
40% of papers requiring empirical evaluation had none.
• in a sample of 400 papers published by the ACM in 1993
• 50% in software-related journals.
40-50% of SE papers found to be unvalidated.
• study by Zelkowitz and Wallace (Computer, May 1998)
Much smaller percentage in disciplines such as physics, psychology and anthropology.
18
Argument: Experiments do not prove anything.
Response:
True, experiments show only evidence for or against a theory, but cannot prove or disprove it.
However:
• Experiments are used for theory testing, and for exploration leading to theory development.
• Theory acceptance follows gradual community acceptance as evidence accumulates. (Note importance of repeatability.)
19
Argument: Traditional scientific method is not applicable
Response:
Applicability is identical; only the target object/subject changes.
We’re dealing partly with human processes and activities; these have clearly been amenable to experimentation in other disciplines.
Likewise, encodings of processes (e.g. programs) can be investigated.
20
Argument: The current level of experimentation is sufficient
Response:
Not when compared with other sciences
• Tichy: 50% vs. 15% of unsupported claims
• Zelkowitz/Wallace: 40% - 50% unvalidated papers
Note: Tichy is not advocating replacing theory and engineering by experiment, but advocating balance.
21
Argument: Experiments are expensive
Response:
So what!?
• Depends on the importance of the research questions; some are clearly important enough.
There’s a spectrum of experimental approaches, differing in cost, from which to choose.
Benchmarks could amortize costs.
Other scientific disciplines accept this.
22
Cost of experiments
Require more resources than theory.
So what?
Example:
• A significant segment of the software industry switched from C to C++ at a substantial cost.
• No solid evidence to show that C++ is superior to C for programmer productivity and software quality.
23
Benchmarks
A sample of the task domain
Effective and affordable way to experiment
Well-defined performance measurements
Used in several areas:
• Speech understanding, information retrieval, pattern recognition, data warehousing and OLAP, etc.
Help to eliminate unpromising approaches and exaggerated claims
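The three ingredients listed above (a fixed sample of the task domain, a cheap procedure, a well-defined measurement) can be sketched in a few lines. This is a minimal sketch with assumed names (`make_workload`, `run_benchmark`) and an assumed task (sorting lists of random numbers), not a recognised benchmark.

```python
import random
import statistics
import time


def make_workload(n_inputs=20, size=200, seed=42):
    """Fixed sample of the task domain: the same inputs for every candidate."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(n_inputs)]


def insertion_sort(items):
    """Simple quadratic sort, used here only as a weak candidate."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


def run_benchmark(candidates, workload, repeats=5):
    """Return the median wall-clock time (seconds) per candidate."""
    results = {}
    for name, func in candidates.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for data in workload:
                func(list(data))  # copy, so each run sees unsorted input
            timings.append(time.perf_counter() - start)
        results[name] = statistics.median(timings)
    return results


workload = make_workload()
results = run_benchmark({"insertion_sort": insertion_sort, "built_in": sorted}, workload)
for name, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.5f} s (median of 5 runs)")
```

Reporting the median over repeated runs is one design choice among several; the important point is that the workload and the measurement are fixed before any candidate is run.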
24
Argument: Demos are sufficient
Demos provide proofs of concept in the engineering sense.
• Illustrate a potential, but depend on observers’ imagination and extrapolation.
• Do not produce solid evidence.
• Not a substitute for the scientific process.
Satisfactory when presenting a radically new idea or a significant breakthrough.
• e.g., first compiler, time-sharing system, OO language, web browser, etc.
Demos don’t investigate cause/effect and don’t provide (statistically) quantifiable results.
25
Examples of questions for experimentation
Introduce theories of how requirements are refined into programs and test them.
Deeper understanding of what is intelligence.
Quality of human computer interactions.
Relative merits of parallel machine models and algorithms.
Behavior of algorithms on typical problems.
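As a small illustration of the last question, the sketch below counts comparisons made by a naive quicksort on random inputs and on already-sorted input. The choice of algorithm, the input size and the function name are assumptions made for this example only.

```python
import random


def quicksort_comparisons(items):
    """Count element-to-pivot comparisons of a naive quicksort.

    The first element is the pivot. Each remaining element is counted as
    one comparison per partition step, even though this simple version
    scans the list twice to build the two sublists.
    """
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return len(rest) + quicksort_comparisons(smaller) + quicksort_comparisons(larger)


n = 200
rng = random.Random(7)  # fixed seed so the experiment can be repeated

random_counts = []
for _ in range(50):
    data = list(range(n))
    rng.shuffle(data)
    random_counts.append(quicksort_comparisons(data))

average_random = sum(random_counts) / len(random_counts)
sorted_count = quicksort_comparisons(list(range(n)))

print(f"average comparisons on random input: {average_random:.0f}")
print(f"comparisons on already-sorted input: {sorted_count}")
```

On already-sorted input this variant makes n(n-1)/2 comparisons, far more than on shuffled input of the same size, so which inputs count as "typical" decides which result matters in practice.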
26
Argument: Too much noise (too many variables to control)
Too many variables make experimentation hard.
No more than in other fields; this is just laziness.
Human-subject experiments are particularly difficult, but other fields have developed many techniques for addressing these difficulties.
Benchmarking can simplify many questions in CS.
Benchmark development can help
• Composition of the benchmark is subjective, and so the weakest link.
• Is the benchmark representative enough?
• Evolve over time to be close to what needs to be tested.
27
Argument: Progress will slow
(e.g. requiring experimentation with every paper will prevent ideas from emerging.)
We are wasting time by targeting unproductive research and development; productivity might actually improve given more experimentation.
There’s no reason for prohibiting conceptual papers and papers formulating new theories or hypotheses. (It’s a question of balance.)
28
Argument: Technology changes too fast
Technology changes too fast; experiments are irrelevant by the time they’ve been completed.
Response:
Experiment focus is then too narrow
• Consider instead the bigger picture (e.g. fundamental underlying questions, not ephemeral concerns).
29
Argument: You’ll never get it published.
Response:
Can be true, especially when you run into reviewers who don’t understand empirical science!
But this has been changing. Still, a painful process of education in empirical research methods continues to be needed.
30
Potential Substitutes for Experimentation
Feature comparison
• Okay sometimes, but it isn’t science.
Intuition
• There are plenty of examples of times when intuition has been wrong.
Expert judgment
• Get real. Science is built on skepticism.
31
Concepts Vs Experiments
Rapid publication of novel concepts and new hypotheses is important.
But questionable ideas need to be weeded out by meaningful validation.
• Then scientists can concentrate on promising approaches.
Need for balance.
32
Problems with experiments
Unrealistic assumptions, manipulated data
Failure to provide details for repeating experiments
Results over-interpreted, or do not generalise
Scientific process can self-correct errors, hoaxes and even fraud.
33
CS as a harder science
Most papers take small steps forward.
Scientists should create models, formulate hypotheses and test them using experiments.
Competing theories: a new theory replacing an old one leads to a paradigm shift
• In physics, but not so evident in CS
• Physical symbol system theory Vs knowledge processing theory in AI.
• A theory is needed for the behavior of algorithms on typical problems.
34
Conclusion
CS research used to rely far less on experiments than most other disciplines.
A good case exists for more experimentation.
Conventional scientific methods have made CS a ‘hard’ science.
Balance between theory, engineering and experimentation needed.