Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Scientific Communication CITS7200

Lecture 9Designing and Analysing

Experiments

• Scientific method concerns hypothesis, experiment, and validation

• A hypothesis is a supposition or conjecture put forth to account for known facts

• Until it is tested, it is not a theory, nor a law

• Hypothesis: A logical but unproven explanation for a given set of facts

• It is an “educated guess”

• Observation: I observe that I always have exactly one unmatched sock in my laundry basket coming out of my drier

Example from Greg Anderson

Hypotheses• The CIA takes them from my basket• Someone on board the SS Enterprise

beams them from my basket• The drier eats them• The drier is a gateway that sucks socks

into another dimension• Someone in the house wears only one

sock at a time• Someone in the house can’t throw both

socks into the basket

• Theory: A hypothesis which has been tested numerous times and found to explain previous observations and make accurate predictions about future observations

• To a scientist, you have a theory when there is no more reasonable doubt

• Lay person: theory = guess

• Law: A theory that has been tested extensively and has never been disproven in any test

• In computing, experiments are used to verify hypotheses about algorithms

• The hypotheses might relate to the performance of a specific task, or to the computational efficiency or scalability

• Experiments must be repeatable

Reader can expect some of

• The steps that make up the algorithm

• The input and output, and the internal data structures used by the algorithm

• The scope of application of the algorithm and its limitations

• The properties that will allow demonstration of correctness, such as preconditions, postconditions, and loop invariants

• A demonstration of correctness• A complexity analysis, for both space

and time requirements• Experiments confirming the theoretical

results

Stating the hypothesis

• Must be clear, precise and unambiguous

• Clarify what is not being tested– i.e. limit the hypothesis to what is

interesting and provable

Example• Suppose P-lists are a well-known data

structure used in a range of applications, but particularly as an in-memory search structure that is fast and compact. Suppose you develop a new data structure called Q-lists. You do a formal complexity analysis and discover that both structures exhibit the same asymptotic complexity in space and time, but on all the tests you have run, the Q-lists perform better than the P-lists.

• Q-lists are superior to P-lists• Such a hypothesis must apply in all

cases, in all applications, under all conditions, and for all time

• Can never be verified experimentally

• A single counterexample destroys it

• As an in-memory search structure for large data sets, Q-lists are faster and more compact than P-lists

Occam’s Razor

• One should not increase, beyond what is necessary, the number of entities required to explain anything

• "Pluralitas non est ponenda sine neccesitate“

• “Entities should not be multiplied unnecessarily"

• If you have two theories that both explain the observed facts, then you should use the simpler theory until more evidence comes along– Conjurors– Crop circles– UFOs

• The Razor doesn't tell us anything about the truth or otherwise of a hypothesis, but rather it tells us which one to test first

• The simpler the hypothesis, the easier it is to shoot down

Developing hypotheses

• Original hypothesis might not be correct

• Don’t alter incrementally• Q-lists are superior to P-lists, except for

data sets of size greater than 2,000,000

• Hypothesis is confirmed once it can make successful predictions

Distinguish between

• The algorithm worked on our data• The algorithm was predicted to

work on any data in this class, and this prediction has been confirmed on our data

• Tests should be blind• Hypothesis should not be tested on

the data from which it arose

Argumentation

• Need an argument relating your hypothesis to the evidence

• First identify the conclusion• Next identify the premises• Is the argument valid?

• An argument is valid iff its premises logically imply its conclusions - that is, if it is impossible for its conclusions to be false if its premises are true

• An argument is also sound if additionally its premises are true

• Today is Friday and thus tomorrow is a holiday.

• Valid but not sound

• Outside of mathematics it is very difficult to develop arguments that stand up to the rigour of logical analysis

• Often difficult or impossible to list all the premises

• Principle of Charity: if the step from an invalid argument to a valid argument requires a suppressed premise that is known to be true, or likely to be true, or non-controversial, then assume that the argument contains that premise

• Principle of Minimality: if the argument needs only some particular assumption to become valid, then do not assume anything beyond that

• A valid argument can fail to be sound in only one way: one of its premises is false

• Play the adversary with your own arguments

Example

• The new string hashing argument is fast because it doesn't use multiplication or division

• Why are multiplication and division an issue?

• On most machines they use several cycles, or they may not be implemented in hardware at all. The new algorithm instead uses two exclusive-or operations per character and a modulo in the final step. I agree that for pipelined machines with floating-point accelerators the difference might not be great.

• Modulo isn't always in hardware either.

• True, but it is required only once.

• So there is also an array lookup? That can be slow.

• Not if the array is cache resident.

• What happens if the hash table size is not 28?

• Good point. This function is most effective for hash tables of size 28, 216 etc.

• Are there any improbable consequences if your hypothesis is true?

• Does your hypothesis displace or contradict some currently-held belief?

• Does your hypothesis cover all the observations explained by some currently-held belief?

Scriven’s steps of analysis

• Clarification of meanings, including the use of dictionaries and encyclopaedias where appropriate. Removing ambiguities of terms. Identifying and labelling propositions.

• Identification of conclusions, stated and unstated.

• Portrayal of the argument in graphical form.

• Formulation of unstated premises. This is the most difficult step: it requires creativity and relevant background knowledge to find the premises most appropriate to produce a valid argument. This is also the step most prone to bias and emotion.

• Criticism of the inferences and premises. Try to find counterexamples or, in the case of inductive arguments, evidence of bias.

• Introduction of other relevant arguments.

• Overall evaluation. How good is your argument? How much would you bet on it? At what odds?

Inductive arguments

• Deductive arguments require the extremely strong condition that it is strictly impossible for the conclusion to be false.

• Inductive arguments are more common in science, and still require analysis.

Sherlock Holmes

• Holmes deduces that the passenger sitting opposite him in the train is a teacher, because he has chalk on his hands.

The argument

1. This man has white powder on his hands.

2. White powder that looks like chalk is chalk.

3. This man has chalk on his hands. 4. Most people with chalk on their

hands are teachers. 5. This man is a teacher.

• As a deductive argument, it is flawed.

• As an inductive argument at Holmes’ time, it is not bad.

• To criticise the inductive strength of an argument you must show that the truth of the premises would fail to make the conclusion highly probable.

Types of inductive argument

• Direct inference This involves inferring from the proportion of individuals in a class having some property the probability of a particular individual having that property. Such an inference is defeated by an additional biasing factor.

• Inverse inference This inference proceeds in the reverse direction from a sample of individuals to characteristics of the whole population. It is common in political surveys. Such inferences can be defeated if the sampling procedure can be shown to be biased.

• Analogies Analogies provide the basis for many inductive arguments. By drawing attention to similarities between structural or law-like features of two systems, some support may be found for claiming that a further, unobserved feature of one is likely to be similar to a corresponding feature of the other. Such arguments are defeated by the use of disanalogies, that is by pointing out how the systems differ.

• Correlations Some kinds of inductive reasoning which may support or undermine causal theories invoke the existence of (or failure of) correlations between observed variables. Such arguments can get very complicated and need to be treated with care.

Bayesian Evaluation Theory

• Bayes' theorem gives the relation between the probability of a hypothesis (conclusion) given its evidence, P(h|e), and its probability prior to any evidence, P(h), and its likelihood, P(e|h).

• Make sure you don’t confuse P(e|h) and P(h|e)

Assume that x is a person

P(x speaks English) 15%P(x is Australian) 0.3%P(x speaks English | x is Australian) ?P(x is Australian | x speaks English) ?

P(h|e) × P(e) = P(e|h) × P(h)

P(h|e) = [P(e|h) × P(h)] / P(e)

• When several alternative hypotheses are competing for our belief, we test them by deducing consequences of each one, then conducting experimental tests to observe whether or not those consequences actually occur

• If a hypothesis predicts that something should occur, and that thing does occur, it strengthens our belief in the truthfulness of the hypothesis. Conversely, an observation that contradicts the prediction would weaken (or destroy) our confidence in the hypothesis

• Often predictions involve probabilities: one hypothesis might predict a certain outcome has a 30% chance of occurring, while a competing hypothesis might predict a 50% chance of the same outcome.

• The occurrence or non-occurrence of the outcome would shift our relative degree of belief from one hypothesis toward another. Bayes theorem provides a way to calculate these "degree of belief" adjustments.

Example

• Suppose a woman is the daughter of a carrier of haemophilia, and therefore is known to have a 50/50 chance of being a carrier herself. If she subsequently has a normal child, how does this affect the likelihood that she is a carrier?

• Originally Pr(C) = Pr (NC) = 50%• Pr(C|N) = Pr(N|C) x Pr(C) / Pr(N)

= 0.5 x 0.5 / 0.75 = 33%• Pr(NC|N) = Pr(N|NC) x Pr(NC) / Pr(N)

= 1.0 x 0.5 / 0.75 = 66%• Thus, after the evidence her

chances are now 66% of being NC

Evidence will support a hypothesis if and only if the likelihood ratio

(e|h) = P(e|h) / P(e|~h)

is greater than one

• Imagine a disease which occurs in 0.1% of the population

• A test for this disease is 99% accurate– P(positive test | ill) = 0.99– P(negative test | healthy) = 0.99

• What is P(ill | positive test)?

• Consider a sample of 100,000 people

• 100 ill people have 99 positive tests

• 99,900 healthy people have 999 positive tests

• So P(ill | positive test) = 99/(99+999) 9%

• Remember P(positive test | ill) = 99%!

Logical fallacies

• Equivocation Equivocation occurs when a term in an argument can have more than one meaning, and the argument relies on the appearance of validity of a term that is actually being used in another way

• Killing is immoral; every religion and system of law acknowledges this

• Every system of law and religion treats complicity in a killing as immoral also

• Eating meat is complicity in a killing• Therefore, eating meat is immoral

• Affirming the consequent A valid deductive inference is modus ponens:

1.If A, then B2.A3.Therefore, B

• It is possible to run the conditional in reverse, by denying the consequent; this results in modus tollens:

1. If A, then B2. Not B3. Therefore, not A

1. If you wish to compete, you must have completed a negative drug test

2. You did not complete the test3. Therefore, you cannot compete

• Claim: Every card with a vowel on one side has an even number on the other

• Given four cards displaying C, E, 8, 3

• Which card or cards needs to be turned over to verify the claim?

• Affirming the consequent attempts to use modus ponens in reverse, but without denying the consequent. It is blatantly illogical, and yet there are times when it makes sense

1. If A, then B2. B3. Therefore, A

Affirming the consequent

• “I do not tell you that Schlesinger, Stevensen’s number one man, number one brain trust, I don’t tell you he’s a Communist. I have no information on that point. But I do know that if he were a Communist he would also ridicule religion as Schlesinger has done.”

• Red Herrings and Straw Men

• Red herrings are arguments or facts that are irrelevant to the main issue under discussion but are brought up in order to distract people from that issue

• In arguing that Schlesinger is a Communist, when challenged McCarthy goes on to say “Communists are taking over Washington and we must do everything we can to stop them.”

• Straw men are what you invent when you violate the principle of charity: the attribution to your opponents of arguments which they in fact do not endorse

• Appeals to Authority Improper appeals to authority occur when • the authority invoked pertains to

some other domain;

(Sports stars endorse headache tablets)

• the expert opinion is widely disputed by other experts in the field;

(Smoking doesn’t cause cancer)

• the expert has a history of offering unreliable opinions;

(Cancer researcher working for a tobacco company)

• Ad Hominem Ad hominem attacks are those against the person rather than the argument

• Stereotyped Reasoning and Prejudice

Everybody argues using stereotypes, and mostly this is not a bad thing. Inappropriate uses of stereotypes involve the failure to adjust them in the light of new information. Such behaviour is called prejudice.

• For more examples (there are many!):

www.philosophicalsociety.com/Logical Fallacies.htm

www.skepticreport.com/tools/logicfallacies.htm

www.fallacyfiles.org

www.adamsmith.org/logicalfallacies

ProofsforP.pdf

Evidence

• analysis or proof• modelling• simulation • experiment

• An analysis or proof is a formal argument that the hypothesis is correct

• A model is a mathematical description of the hypothesis

• Usually there is a demonstration that the hypothesis and the model correspond

• A simulation is usually an implementation or partial implementation of a simplified form of the hypothesis, in which the difficulties of a full implementation are side stepped by omission or approximation

• An experiment is a full test of the hypothesis, based on an implementation of the proposal and on real (or at least realistic) data

Fair experiments

• Tests should be fair – not designed to support your hypothesis

Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Documents

Transcript of Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.