Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Page 1: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Scientific Communication CITS7200

Lecture 9: Designing and Analysing Experiments

Page 2: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Scientific method concerns hypothesis, experiment, and validation

• A hypothesis is a supposition or conjecture put forth to account for known facts

• Until it is tested, it is not a theory, nor a law

Page 3: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Hypothesis: A logical but unproven explanation for a given set of facts

• It is an “educated guess”

Page 4: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Observation: I observe that I always have exactly one unmatched sock in my laundry basket coming out of my drier

Example from Greg Anderson

Page 5: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Hypotheses

• The CIA takes them from my basket

• Someone on board the SS Enterprise beams them from my basket

• The drier eats them

• The drier is a gateway that sucks socks into another dimension

• Someone in the house wears only one sock at a time

• Someone in the house can’t throw both socks into the basket

Page 6: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Theory: A hypothesis which has been tested numerous times and found to explain previous observations and make accurate predictions about future observations

• To a scientist, you have a theory when there is no more reasonable doubt

• Lay person: theory = guess

Page 7: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Law: A theory that has been tested extensively and has never been disproven in any test

Page 8: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• In computing, experiments are used to verify hypotheses about algorithms

• The hypotheses might relate to the performance of a specific task, or to the computational efficiency or scalability

• Experiments must be repeatable
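One way to make a timing experiment repeatable is to fix the random seed used to generate the input and to report a summary over several trials. The following is a minimal sketch under those assumptions; the names make_data, time_once and run_experiment are hypothetical, and the built-in sorted merely stands in for whatever algorithm is under test.

```python
import random
import statistics
import time


def make_data(n: int, seed: int) -> list[int]:
    """Generate the same pseudo-random input every time for a given seed."""
    rng = random.Random(seed)
    return [rng.randrange(10**6) for _ in range(n)]


def time_once(func, data: list[int]) -> float:
    """Time one call of func on a fresh copy of the data, in seconds."""
    start = time.perf_counter()
    func(list(data))
    return time.perf_counter() - start


def run_experiment(func, n: int = 10_000, seed: int = 42, trials: int = 5) -> float:
    """Return the median running time over several trials on identical input."""
    data = make_data(n, seed)
    return statistics.median(time_once(func, data) for _ in range(trials))


if __name__ == "__main__":
    print(f"median time: {run_experiment(sorted):.6f} s")
```

Recording the seed, the input size and the number of trials alongside the result lets another reader regenerate the same input, although absolute times will still vary between machines.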

Page 9: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

The reader can expect some of the following:

• The steps that make up the algorithm

• The input and output, and the internal data structures used by the algorithm

• The scope of application of the algorithm and its limitations

Page 10: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• The properties that will allow demonstration of correctness, such as preconditions, postconditions, and loop invariants

• A demonstration of correctness

• A complexity analysis, for both space and time requirements

• Experiments confirming the theoretical results

Page 11: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Stating the hypothesis

• Must be clear, precise and unambiguous

• Clarify what is not being tested

– i.e. limit the hypothesis to what is interesting and provable

Page 12: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Example

• Suppose P-lists are a well-known data structure used in a range of applications, but particularly as an in-memory search structure that is fast and compact. Suppose you develop a new data structure called Q-lists. You do a formal complexity analysis and discover that both structures exhibit the same asymptotic complexity in space and time, but on all the tests you have run, the Q-lists perform better than the P-lists.

Page 13: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Q-lists are superior to P-lists

• Such a hypothesis must apply in all cases, in all applications, under all conditions, and for all time

• Can never be verified experimentally

• A single counterexample destroys it

Page 14: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• As an in-memory search structure for large data sets, Q-lists are faster and more compact than P-lists

Page 15: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Occam’s Razor

• One should not increase, beyond what is necessary, the number of entities required to explain anything

• “Pluralitas non est ponenda sine necessitate”

• “Entities should not be multiplied unnecessarily”

Page 16: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• If you have two theories that both explain the observed facts, then you should use the simpler theory until more evidence comes along

– Conjurors

– Crop circles

– UFOs

Page 17: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• The Razor doesn't tell us anything about the truth or otherwise of a hypothesis, but rather it tells us which one to test first

• The simpler the hypothesis, the easier it is to shoot down

Page 18: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Developing hypotheses

• Original hypothesis might not be correct

• Don’t alter it incrementally

– e.g. Q-lists are superior to P-lists, except for data sets of size greater than 2,000,000

• Hypothesis is confirmed once it can make successful predictions

Page 19: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Distinguish between

• The algorithm worked on our data

• The algorithm was predicted to work on any data in this class, and this prediction has been confirmed on our data

Page 20: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Tests should be blind

• Hypothesis should not be tested on the data from which it arose

Page 21: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Argumentation

• Need an argument relating your hypothesis to the evidence

• First identify the conclusion

• Next identify the premises

• Is the argument valid?

Page 22: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• An argument is valid iff its premises logically imply its conclusions - that is, if it is impossible for its conclusions to be false if its premises are true

• An argument is also sound if additionally its premises are true

Page 23: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Today is Friday and thus tomorrow is a holiday.

• Valid but not sound

Page 24: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Outside of mathematics it is very difficult to develop arguments that stand up to the rigour of logical analysis

• Often difficult or impossible to list all the premises

Page 25: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Principle of Charity: if the step from an invalid argument to a valid argument requires a suppressed premise that is known to be true, or likely to be true, or non-controversial, then assume that the argument contains that premise

Page 26: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Principle of Minimality: if the argument needs only some particular assumption to become valid, then do not assume anything beyond that

Page 27: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• A valid argument can fail to be sound in only one way: one of its premises is false

• Play the adversary with your own arguments

Page 28: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Example

• The new string hashing algorithm is fast because it doesn't use multiplication or division

Page 29: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Why are multiplication and division an issue?

• On most machines they use several cycles, or they may not be implemented in hardware at all. The new algorithm instead uses two exclusive-or operations per character and a modulo in the final step. I agree that for pipelined machines with floating-point accelerators the difference might not be great.

Page 30: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Modulo isn't always in hardware either.

• True, but it is required only once.

Page 31: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• So there is also an array lookup? That can be slow.

• Not if the array is cache resident.

Page 32: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• What happens if the hash table size is not 2^8?

• Good point. This function is most effective for hash tables of size 2^8, 2^16 etc.
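The exchange above can be made concrete with a small sketch. The lecture does not give the hashing algorithm itself, so the code below is only an assumed stand-in in the same spirit: a table-driven update with two exclusive-or operations and one array lookup per character, and a single modulo in the final step. TABLE, raw_hash and bucket are hypothetical names, and the table contents are arbitrary.

```python
import random

# Hypothetical lookup array: 256 arbitrary 16-bit values.
# A fixed seed keeps the table (and hence every hash value) repeatable.
_rng = random.Random(12345)
TABLE = [_rng.randrange(1 << 16) for _ in range(256)]


def raw_hash(key: str) -> int:
    """Return a 16-bit hash using two XORs and one array lookup per byte."""
    h = 0
    for byte in key.encode("utf-8"):
        # XOR 1 mixes the byte into the low bits to pick a table entry;
        # XOR 2 combines that entry with the remaining high bits.
        h = (h >> 8) ^ TABLE[(h ^ byte) & 0xFF]
    return h


def bucket(key: str, table_size: int) -> int:
    """Map a key to a slot; the modulo is applied once, in the final step."""
    return raw_hash(key) % table_size


if __name__ == "__main__":
    for word in ["sock", "drier", "basket"]:
        print(word, bucket(word, 256), bucket(word, 1000))
```

When table_size is a power of two, the final modulo is equivalent to a bit mask over the low bits of the hash, which may be part of why such sizes suit this style of function.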

Page 33: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Are there any improbable consequences if your hypothesis is true?

• Does your hypothesis displace or contradict some currently-held belief?

• Does your hypothesis cover all the observations explained by some currently-held belief?

Page 34: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Scriven’s steps of analysis

• Clarification of meanings, including the use of dictionaries and encyclopaedias where appropriate. Removing ambiguities of terms. Identifying and labelling propositions.

Page 35: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Identification of conclusions, stated and unstated.

Page 36: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Portrayal of the argument in graphical form.

Page 37: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Formulation of unstated premises. This is the most difficult step: it requires creativity and relevant background knowledge to find the premises most appropriate to produce a valid argument. This is also the step most prone to bias and emotion.

Page 38: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Criticism of the inferences and premises. Try to find counterexamples or, in the case of inductive arguments, evidence of bias.

Page 39: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Introduction of other relevant arguments.

Page 40: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Overall evaluation. How good is your argument? How much would you bet on it? At what odds?

Page 41: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Inductive arguments

• Deductive arguments require the extremely strong condition that it is strictly impossible for the conclusion to be false.

• Inductive arguments are more common in science, and still require analysis.

Page 42: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Sherlock Holmes

• Holmes deduces that the passenger sitting opposite him in the train is a teacher, because he has chalk on his hands.

Page 43: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

The argument

1. This man has white powder on his hands.

2. White powder that looks like chalk is chalk.

3. This man has chalk on his hands.

4. Most people with chalk on their hands are teachers.

5. This man is a teacher.

Page 44: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• As a deductive argument, it is flawed.

• As an inductive argument at Holmes’ time, it is not bad.

Page 45: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• To criticise the inductive strength of an argument you must show that the truth of the premises would fail to make the conclusion highly probable.

Page 46: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Types of inductive argument

• Direct inference: This involves inferring, from the proportion of individuals in a class having some property, the probability of a particular individual having that property. Such an inference is defeated by an additional biasing factor.

Page 47: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Inverse inference: This inference proceeds in the reverse direction, from a sample of individuals to characteristics of the whole population. It is common in political surveys. Such inferences can be defeated if the sampling procedure can be shown to be biased.

Page 48: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Analogies: Analogies provide the basis for many inductive arguments. By drawing attention to similarities between structural or law-like features of two systems, some support may be found for claiming that a further, unobserved feature of one is likely to be similar to a corresponding feature of the other. Such arguments are defeated by the use of disanalogies, that is, by pointing out how the systems differ.

Page 49: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Correlations: Some kinds of inductive reasoning which may support or undermine causal theories invoke the existence of (or failure of) correlations between observed variables. Such arguments can get very complicated and need to be treated with care.

Page 50: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Bayesian Evaluation Theory

• Bayes' theorem gives the relation between the probability of a hypothesis (conclusion) given its evidence, P(h|e), and its probability prior to any evidence, P(h), and its likelihood, P(e|h).

Page 51: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Make sure you don’t confuse P(e|h) and P(h|e)

Assume that x is a person

P(x speaks English) = 15%

P(x is Australian) = 0.3%

P(x speaks English | x is Australian) = ?

P(x is Australian | x speaks English) = ?

Page 52: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

P(h|e) × P(e) = P(e|h) × P(h)

P(h|e) = [P(e|h) × P(h)] / P(e)

Page 53: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• When several alternative hypotheses are competing for our belief, we test them by deducing consequences of each one, then conducting experimental tests to observe whether or not those consequences actually occur

Page 54: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• If a hypothesis predicts that something should occur, and that thing does occur, it strengthens our belief in the truthfulness of the hypothesis. Conversely, an observation that contradicts the prediction would weaken (or destroy) our confidence in the hypothesis

Page 55: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Often predictions involve probabilities: one hypothesis might predict a certain outcome has a 30% chance of occurring, while a competing hypothesis might predict a 50% chance of the same outcome.

• The occurrence or non-occurrence of the outcome would shift our relative degree of belief from one hypothesis toward another. Bayes theorem provides a way to calculate these "degree of belief" adjustments.
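The "degree of belief" adjustment described above can be sketched for the 30% versus 50% case. This is only an illustration: it assumes that the two hypotheses are the only candidates and that they start with equal prior belief, neither of which is stated in the lecture, and the function name update is hypothetical.

```python
def update(priors: list[float], likelihoods: list[float]) -> list[float]:
    """Apply Bayes' theorem to a set of competing, exhaustive hypotheses.

    priors[i]      is P(h_i) before the evidence.
    likelihoods[i] is P(e | h_i), the chance h_i gave to what was observed.
    Returns the posterior P(h_i | e) for each hypothesis.
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)  # P(e), by the law of total probability
    return [j / total for j in joint]


# Assumption: equal prior belief in the two hypotheses.
priors = [0.5, 0.5]

# h1 predicts the outcome with 30% chance, h2 with 50% chance.
if_outcome_occurs = update(priors, [0.3, 0.5])
if_outcome_absent = update(priors, [0.7, 0.5])

if __name__ == "__main__":
    print("outcome occurs:", if_outcome_occurs)
    print("outcome absent:", if_outcome_absent)
```

Under these assumptions the outcome occurring shifts belief toward the 50% hypothesis, while its absence shifts belief toward the 30% hypothesis.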

Page 56: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Example

• Suppose a woman is the daughter of a carrier of haemophilia, and therefore is known to have a 50/50 chance of being a carrier herself. If she subsequently has a normal child, how does this affect the likelihood that she is a carrier?

Page 57: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Originally Pr(C) = Pr(NC) = 50% (C = carrier, NC = not a carrier, N = normal child)

• Pr(C|N) = Pr(N|C) × Pr(C) / Pr(N) = 0.5 × 0.5 / 0.75 ≈ 33%

• Pr(NC|N) = Pr(N|NC) × Pr(NC) / Pr(N) = 1.0 × 0.5 / 0.75 ≈ 67%

• Thus, after the evidence her chances are now about 67% of being NC
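The value Pr(N) = 0.75 in the calculation above is not derived on the slide; it follows from the law of total probability over the two cases. The sketch below reproduces the arithmetic with exact fractions, using the slide's values Pr(N|C) = 0.5 and Pr(N|NC) = 1.0. The variable names are illustrative only.

```python
from fractions import Fraction

# Prior: daughter of a carrier, so a 50/50 chance of being a carrier (C).
p_c = Fraction(1, 2)
p_nc = 1 - p_c

# Likelihood of a normal child (N) under each case, as used on the slide.
p_n_given_c = Fraction(1, 2)
p_n_given_nc = Fraction(1)

# Law of total probability: Pr(N) = Pr(N|C) Pr(C) + Pr(N|NC) Pr(NC).
p_n = p_n_given_c * p_c + p_n_given_nc * p_nc

# Bayes' theorem for each case.
p_c_given_n = p_n_given_c * p_c / p_n
p_nc_given_n = p_n_given_nc * p_nc / p_n

if __name__ == "__main__":
    print("Pr(N)    =", p_n)
    print("Pr(C|N)  =", p_c_given_n)
    print("Pr(NC|N) =", p_nc_given_n)
```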

Page 58: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Evidence will support a hypothesis if and only if the likelihood ratio

λ(e|h) = P(e|h) / P(e|~h)

is greater than one
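As a check on this criterion, the haemophilia example can be recast in odds form: posterior odds equal prior odds multiplied by the likelihood ratio. The sketch below assumes h is "the woman is a carrier" and e is "she has a normal child"; the function names are hypothetical.

```python
from fractions import Fraction


def likelihood_ratio(p_e_given_h: Fraction, p_e_given_not_h: Fraction) -> Fraction:
    """Return P(e|h) / P(e|~h)."""
    return p_e_given_h / p_e_given_not_h


def posterior_from_odds(prior: Fraction, ratio: Fraction) -> Fraction:
    """Update a prior probability using the odds form of Bayes' theorem."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * ratio
    return posterior_odds / (1 + posterior_odds)


# h = carrier, e = normal child: P(e|h) = 1/2 and P(e|~h) = 1.
ratio = likelihood_ratio(Fraction(1, 2), Fraction(1))
posterior = posterior_from_odds(Fraction(1, 2), ratio)

if __name__ == "__main__":
    print("likelihood ratio:", ratio)
    print("P(carrier | normal child):", posterior)
```

Here the ratio is below one, so a normal child counts as evidence against the carrier hypothesis, and the posterior agrees with the 33% obtained earlier.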

Page 59: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Imagine a disease which occurs in 0.1% of the population

• A test for this disease is 99% accurate

– P(positive test | ill) = 0.99

– P(negative test | healthy) = 0.99

• What is P(ill | positive test)?

Page 60: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Consider a sample of 100,000 people

• 100 ill people have 99 positive tests

• 99,900 healthy people have 999 positive tests

• So P(ill | positive test) = 99/(99+999) ≈ 9%

• Remember P(positive test | ill) = 99%!
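The head count above is Bayes' theorem applied to a sample of 100,000 people. A minimal sketch of the same calculation as a function follows; posterior_ill is a hypothetical name, and "sensitivity" and "specificity" are the usual terms for P(positive test | ill) and P(negative test | healthy), which the slide does not use.

```python
def posterior_ill(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Return P(ill | positive test).

    prevalence  = P(ill)
    sensitivity = P(positive test | ill)
    specificity = P(negative test | healthy)
    """
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)


# Values from the slides: 0.1% prevalence, 99% accuracy in both directions.
p = posterior_ill(prevalence=0.001, sensitivity=0.99, specificity=0.99)

if __name__ == "__main__":
    print(f"P(ill | positive test) = {p:.3f}")
```

The result is small because healthy people vastly outnumber ill people, so even a 1% false-positive rate yields far more false positives than true positives.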

Page 61: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Logical fallacies

• Equivocation: Equivocation occurs when a term in an argument can have more than one meaning, and the argument relies on the appearance of validity of a term that is actually being used in another way

Page 62: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Killing is immoral; every religion and system of law acknowledges this

• Every system of law and religion treats complicity in a killing as immoral also

• Eating meat is complicity in a killing

• Therefore, eating meat is immoral

Page 63: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Affirming the consequent: A valid deductive inference is modus ponens:

1. If A, then B

2. A

3. Therefore, B

Page 64: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• It is possible to run the conditional in reverse, by denying the consequent; this results in modus tollens:

1. If A, then B

2. Not B

3. Therefore, not A

Page 65: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

1. If you wish to compete, you must have completed a negative drug test

2. You did not complete the test

3. Therefore, you cannot compete

Page 66: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Claim: Every card with a vowel on one side has an even number on the other

• Given four cards displaying C, E, 8, 3

• Which card or cards needs to be turned over to verify the claim?
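The card question can be worked through mechanically. The sketch below assumes every card has a letter on one side and a number on the other, which the slide implies but does not state; must_turn is a hypothetical helper. A card needs turning only if its hidden side could falsify the claim: a visible vowel (modus ponens) or a visible odd number (modus tollens).

```python
VOWELS = set("AEIOU")


def must_turn(face: str) -> bool:
    """Return True if the hidden side of this card could falsify the claim
    'every card with a vowel on one side has an even number on the other'."""
    if face.isalpha():
        # A vowel must hide an even number; a consonant is unconstrained.
        return face.upper() in VOWELS
    # An odd number must not hide a vowel; an even number is unconstrained.
    return int(face) % 2 == 1


cards = ["C", "E", "8", "3"]
to_turn = [card for card in cards if must_turn(card)]

if __name__ == "__main__":
    print(to_turn)
```

Turning over the 8 is the tempting mistake: the claim says nothing about what must be behind an even number, so that card cannot refute it.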

Page 67: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Affirming the consequent attempts to use modus ponens in reverse, but without denying the consequent. It is blatantly illogical, and yet there are times when it makes sense

1. If A, then B

2. B

3. Therefore, A
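The difference between these three argument forms can be checked by brute force over a truth table, using the earlier definition that an argument is valid when its conclusion cannot be false while its premises are true. This is a minimal sketch for two propositions A and B; is_valid and implies are hypothetical names.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if A, then B'."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """Valid iff no assignment makes every premise true and the conclusion false."""
    for a, b in product([False, True], repeat=2):
        if all(p(a, b) for p in premises) and not conclusion(a, b):
            return False  # found a counterexample
    return True


# If A then B; A; therefore B.
modus_ponens = is_valid([implies, lambda a, b: a], lambda a, b: b)

# If A then B; not B; therefore not A.
modus_tollens = is_valid([implies, lambda a, b: not b], lambda a, b: not a)

# If A then B; B; therefore A.
affirming_consequent = is_valid([implies, lambda a, b: b], lambda a, b: a)

if __name__ == "__main__":
    print(modus_ponens, modus_tollens, affirming_consequent)
```

The counterexample for affirming the consequent is the row where A is false and B is true: both premises hold, yet the conclusion fails.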

Page 68: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Affirming the consequent

• “I do not tell you that Schlesinger, Stevenson’s number one man, number one brain trust, I don’t tell you he’s a Communist. I have no information on that point. But I do know that if he were a Communist he would also ridicule religion as Schlesinger has done.”

Page 69: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Red Herrings and Straw Men

• Red herrings are arguments or facts that are irrelevant to the main issue under discussion but are brought up in order to distract people from that issue

Page 70: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• In arguing that Schlesinger is a Communist, when challenged McCarthy goes on to say “Communists are taking over Washington and we must do everything we can to stop them.”

Page 71: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Straw men are what you invent when you violate the principle of charity: the attribution to your opponents of arguments which they in fact do not endorse

Page 72: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Appeals to Authority: Improper appeals to authority occur when

• the authority invoked pertains to some other domain;

(Sports stars endorse headache tablets)

Page 73: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• the expert opinion is widely disputed by other experts in the field;

(Smoking doesn’t cause cancer)

Page 74: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• the expert has a history of offering unreliable opinions;

(Cancer researcher working for a tobacco company)

Page 75: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Ad Hominem: Ad hominem attacks are those against the person rather than the argument

Page 76: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• Stereotyped Reasoning and Prejudice

Everybody argues using stereotypes, and mostly this is not a bad thing. Inappropriate uses of stereotypes involve the failure to adjust them in the light of new information. Such behaviour is called prejudice.

Page 77: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• For more examples (there are many!):

www.philosophicalsociety.com/Logical Fallacies.htm

www.skepticreport.com/tools/logicfallacies.htm

www.fallacyfiles.org

www.adamsmith.org/logicalfallacies

ProofsforP.pdf

Page 78: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Evidence

• analysis or proof

• modelling

• simulation

• experiment

Page 79: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• An analysis or proof is a formal argument that the hypothesis is correct

Page 80: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• A model is a mathematical description of the hypothesis

• Usually there is a demonstration that the hypothesis and the model correspond

Page 81: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• A simulation is usually an implementation or partial implementation of a simplified form of the hypothesis, in which the difficulties of a full implementation are side stepped by omission or approximation

Page 82: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

• An experiment is a full test of the hypothesis, based on an implementation of the proposal and on real (or at least realistic) data

Page 83: Scientific Communication CITS7200 Lecture 9 Designing and Analysing Experiments.

Fair experiments

• Tests should be fair – not designed to support your hypothesis