Assignment #6

Assignment #6 Chapter 10: 14, 15 Chapter 11: 14, 18 Due tomorrow Nov. 6th by 2pm in your TA’s homework box

Transcript of Assignment #6

Page 1: Assignment #6

Assignment #6

Chapter 10: 14, 15 Chapter 11: 14, 18 Due tomorrow Nov. 6th by 2pm in your TA’s homework box

Page 2: Assignment #6

Assignment #7

Chapter 12: 18, 24 Chapter 13: 28 Due next Friday Nov. 13th by 2pm in your TA’s homework box

Page 3: Assignment #6

Reading

For Today: Chapter 14

For Tuesday: Chapter 15

Page 4: Assignment #6

Lab Report

•  Posted on web-site

•  Dates
    –  Rough draft due to your TA’s homework box on Monday Nov. 16th
    –  Rough draft returned in your lab section the week of Nov. 23rd
    –  Final draft due at start of your registered lab section the week of Nov. 30th

•  10% of course grade
    –  Rough draft - 5%
    –  Final draft - 5%
    –  If you’re happy with your rough draft mark, you can tell your TA to use it for the final draft

•  Read the “Writing a Lab Report” section of your lab notebook for guidance!!

Page 5: Assignment #6

Chapter 13 Review

Page 6: Assignment #6

Assumptions of t-tests

•  Random sample(s)

•  Populations are normally distributed

•  (for 2-sample t) Populations have equal variances

Page 7: Assignment #6

Detecting deviations from normality

• Previous data/theory

• Histograms

• Quantile plots

• Shapiro-Wilk test

Page 8: Assignment #6

Sampled from a normally distributed population

Page 9: Assignment #6

Sampled from non-normally distributed populations

Page 10: Assignment #6

Detecting deviations from normality: by quantile plot

Normal data
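A quantile plot compares the sorted data with the values expected from a normal distribution; points from normal data fall close to a straight line. A minimal sketch of how the plotted points can be computed, assuming the plotting position (i - 0.5)/n (one common convention, not taken from the slides) and hypothetical measurements:

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical normal quantile, observed value) pairs."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        # Plotting position (i - 0.5) / n is an assumed convention.
        p = (i - 0.5) / n
        points.append((std_normal.inv_cdf(p), value))
    return points

# Hypothetical measurements, for illustration only.
sample = [4.1, 5.0, 5.2, 5.9, 6.3, 6.8, 7.4]
points = normal_quantile_points(sample)
for z, y in points:
    print(f"{z:6.2f}  {y:4.1f}")
```

Plotting the second column against the first gives the quantile plot; strong curvature suggests a departure from normality.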

Page 11: Assignment #6

Detecting differences from normality: Shapiro-Wilk test

A Shapiro-Wilk test is used to test statistically whether a set of data comes from a normal distribution.

Page 12: Assignment #6

What to do when the assumptions are not true

•  If the sample sizes are large, sometimes the parametric tests work OK anyway

•  Transformations

•  Non-parametric tests

•  Randomization and resampling

Page 13: Assignment #6

Data transformations

A data transformation changes each data point by some simple mathematical formula.

Page 14: Assignment #6

Log-transformation

$Y' = \ln[Y]$

[Figure: frequency histograms of Y and of Y' = ln[Y]]
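As a rough illustration, a transformation is just a function applied to every data point. The sketch below applies the log transformation, along with the square-root and arcsine transformations covered on the next slide. The function names and the data are hypothetical.

```python
import math

def log_transform(y):
    # Y' = ln[Y]; only defined for Y > 0
    return math.log(y)

def sqrt_transform(y):
    # Y' = sqrt(Y + 1/2); typically used for counts
    return math.sqrt(y + 0.5)

def arcsine_transform(p):
    # p' = arcsin[sqrt(p)]; for proportions between 0 and 1
    return math.asin(math.sqrt(p))

# Hypothetical right-skewed measurements, for illustration only.
raw = [1.2, 1.5, 2.0, 3.1, 4.8, 9.7, 25.0]
logged = [log_transform(y) for y in raw]

print([round(v, 2) for v in logged])
```

Any test is then run on the transformed values, and every data point must be transformed with the same formula.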

Page 15: Assignment #6

Other transformations

Arcsine: $p' = \arcsin[\sqrt{p}]$ (proportions)

Square-root: $Y' = \sqrt{Y + 1/2}$ (counts; when standard deviation and mean increase together)

Square: $Y' = Y^2$ (left-skewed data)

Reciprocal: $Y' = 1/Y$ (right-skewed data)

Antilog: $Y' = e^Y$ (left-skewed data)

Page 16: Assignment #6

Non-parametric methods

•  Assume less about the underlying distributions

•  Also called "distribution-free"

•  "Parametric" methods assume a distribution or a parameter

Page 17: Assignment #6

Sign test

•  Non-parametric test

•  Compares data from one sample to a constant

•  Simple: for each data point, record whether the individual is above (+) or below (-) the hypothesized constant.

•  Use a binomial test to compare result to 1/2.
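The steps above can be sketched in a few lines. This is a minimal sketch, assuming a two-sided test and that points exactly equal to the constant are dropped (a common convention, not stated on the slide). The data are hypothetical.

```python
from math import comb

def sign_test(data, hypothesized):
    """Two-sided sign test against a hypothesized constant."""
    above = sum(1 for y in data if y > hypothesized)
    below = sum(1 for y in data if y < hypothesized)
    n = above + below  # ties with the constant are dropped (assumption)
    k = min(above, below)
    # P[X <= k] for X ~ Binomial(n, 1/2), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)

# Hypothetical data and constant, for illustration only.
data = [12, 15, 9, 18, 20, 14, 17, 16]
above, below, p_value = sign_test(data, hypothesized=10)
print(above, below, p_value)
```

Only the signs enter the calculation, not the sizes of the deviations, which is one way to see why the test has low power.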

Page 18: Assignment #6

The sign test has very low power

So it is quite likely to not reject a false null hypothesis.

Page 19: Assignment #6

Most non-parametric methods use RANKS

•  Rank each data point in all samples from lowest to highest

•  Lowest data point gets rank 1, next lowest gets rank 2, ...

Page 20: Assignment #6

Non-parametric test to compare 2 groups

The Mann-Whitney U test compares the central tendencies of two groups using ranks.

Page 21: Assignment #6

Performing a Mann-Whitney U test

•  First, rank all individuals from both groups together in order (for example, smallest to largest)

•  Sum the ranks for all individuals in each group --> R1 and R2

Page 22: Assignment #6

Calculating the test statistic, U

$U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - R_1$

$U_2 = n_1 n_2 - U_1$

U1 is the number of times an individual from pop. 1 has a lower rank than an individual from pop. 2, out of all pairwise comparisons.
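A minimal sketch of the calculation: rank both groups together, sum the ranks of group 1 to get R1, then apply the formulas for U1 and U2. Giving tied values the average of their ranks is an assumption here (the slides do not say how ties are handled), and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of ranks start+1 .. end+1
        for j in range(start, end + 1):
            ranks[order[j]] = shared
        start = end + 1
    return ranks

def mann_whitney_u(group1, group2):
    n1, n2 = len(group1), len(group2)
    ranks = average_ranks(list(group1) + list(group2))
    r1 = sum(ranks[:n1])                    # R1: rank sum of group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1   # U1 = n1*n2 + n1(n1+1)/2 - R1
    u2 = n1 * n2 - u1                       # U2 = n1*n2 - U1
    return u1, u2

# Hypothetical data: every value in group 1 is below every value in group 2.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

In this example group 1 has the lower rank in all 9 pairwise comparisons, so U1 = 9 and U2 = 0.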

Page 23: Assignment #6

Mann-Whitney: Large sample approximation

For n1 and n2 both greater than 10, use

$Z = \frac{2U - n_1 n_2}{\sqrt{n_1 n_2 (n_1 + n_2 + 1)/3}}$

Compare this Z to the standard normal distribution
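A short sketch of the approximation, using the standard normal distribution from Python's standard library. Either U1 or U2 can be used as U; the two give Z values of equal size and opposite sign. The two-sided P-value helper and the example numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_z(u, n1, n2):
    # Z = (2U - n1*n2) / sqrt(n1*n2*(n1 + n2 + 1) / 3)
    return (2 * u - n1 * n2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 3)

def two_sided_p(z):
    # Probability of a standard normal value at least as extreme as z
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical values with n1 and n2 both greater than 10.
z = mann_whitney_z(u=100, n1=12, n2=12)
print(round(z, 3), round(two_sided_p(z), 3))
```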

Page 24: Assignment #6

Permutation tests

•  Also known as “randomization tests”

•  Used for hypothesis testing on measures of association

•  Mixes the real data randomly

•  Variable 1 from an individual is paired with variable 2 data from a randomly chosen individual. This is done for all individuals.

•  The estimate is made on the randomized data.

•  The whole process is repeated numerous times. The distribution of the randomized estimates is the null distribution.

Page 25: Assignment #6

Real data:

Male wingless   Male winged
0               1.4
0.7             1.6
0.7             1.9
1.4             2.3
1.6             2.6
1.8             2.8
1.9             2.8
1.9             2.8
1.9             3.1
2.2             3.8
2.1             3.9
2.1             4.5
                4.7

$\bar{Y}_1 - \bar{Y}_2 = -1.41$

Randomized data:

Male wingless   Male winged
0.7             2.8
2.3             1.9
1.9             2.1
1.8             1.6
3.8             0
1.4             1.4
1.9             2.2
3.9             2.1
4.7             1.6
2.6             4.5
1.9             2.8
2.8             0.7
                3.1

$\bar{Y}_1 - \bar{Y}_2 = 0.41$

Page 26: Assignment #6

1000 permutations

P < 0.001
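A minimal sketch of a permutation test on the wingless and winged data shown earlier. The test statistic is the difference between group means. The fixed seed, the two-sided comparison, and the function name are assumptions made for illustration.

```python
import random
from statistics import mean

wingless = [0.0, 0.7, 0.7, 1.4, 1.6, 1.8, 1.9, 1.9, 1.9, 2.2, 2.1, 2.1]
winged = [1.4, 1.6, 1.9, 2.3, 2.6, 2.8, 2.8, 2.8, 3.1, 3.8, 3.9, 4.5, 4.7]

def permutation_test(group1, group2, n_permutations=1000, seed=1):
    rng = random.Random(seed)  # seeded so the run is repeatable (assumption)
    observed = mean(group1) - mean(group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # mix the real data randomly
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        if abs(diff) >= abs(observed):  # at least as extreme as the real data
            extreme += 1
    return observed, extreme / n_permutations

observed, p_value = permutation_test(wingless, winged)
print(f"observed difference = {observed:.2f}, P = {p_value}")
```

The observed difference reproduces the -1.41 from the real data. The P-value is the fraction of shuffled data sets whose difference is at least as large in magnitude, so its exact value depends on the random shuffles.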

Page 27: Assignment #6

Chapter 14 Designing Experiments

Page 28: Assignment #6

Types of studies

Experimental study: Researchers assign treatments to units so that differences in response can be compared.

Observational study: Researcher has no influence over which subjects receive which treatments.

Page 29: Assignment #6

Why do an experimental study?

•  Random assignment of treatments minimizes influence of confounding variables

•  Confounding variables mask or distort the causal relationship between measured variables in a study

Page 30: Assignment #6

Confounding variables

Supplemental Oxygen (Explanatory variable)

Survive Mt. Everest (Response variable)

Preparedness (Confounding variable)

Unmeasured variable that masks or distorts the causal relationship between measured variables in a study

Page 31: Assignment #6

Goals of experiments

•  Eliminate bias

•  Reduce sampling error (increase precision and power)

Page 32: Assignment #6

[Figure: grid of estimates with columns Precise and Imprecise, and rows Biased and Unbiased]

Page 33: Assignment #6

Design features that reduce bias

•  Controls

•  Random assignment to treatments

•  Blinding

Page 34: Assignment #6

Controls

A group which is identical to the experimental treatment in all respects aside from the treatment itself.

Page 35: Assignment #6

Uncontrolled experiment

•  Treatment applied to group of subjects and response measured.

•  We cannot determine whether the treatment is the cause of the response.

Page 36: Assignment #6

Example: placebo

•  Some illnesses, e.g. pain and depression, respond to the fact of treatment, even when it has no pharmaceutically active ingredients

•  Control: "sugar pills"

Page 37: Assignment #6

Example: independent recovery

•  Patients tend to seek treatment when they feel very bad

•  As a result, they often visit the doctor when they are at their worst. Improvement may be inevitable, even without treatment

•  Control: untreated group to compare with, if we want to measure the effects of a new therapy

Page 38: Assignment #6

Example: Stress associated with experimental methods

•  Stressful or intrusive methods may produce a response separate from the effect of the treatment of interest

•  Control: use same methods on group that does not get treatment of interest

Page 39: Assignment #6

Randomization

The random assignment of treatments to units in an experimental study

Breaks the association between possible confounding variables and the explanatory variable.

Page 40: Assignment #6

Randomization

Supplemental Oxygen (Explanatory variable)

Survive Mt. Everest (Response variable)

Preparedness (Confounding variable)

?

Page 41: Assignment #6

Randomization

•  Doesn’t eliminate variation caused by confounding variables, only their correlation with treatment

•  Variation from confounding variables is spread more evenly between treatments, so they create no bias.

Page 42: Assignment #6

Randomize using a random process

•  Example: Random number generator on computer (e.g. random.org)

1.  List all subjects

2.  Assign each a random number

3.  Assign treatment A to lowest numbers and B to highest numbers.
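The three steps above can be sketched as follows. Python's pseudo-random generator stands in for the random number source (the slide suggests random.org), and the subject labels and function name are hypothetical.

```python
import random

def randomize_treatments(subjects, seed=None):
    rng = random.Random(seed)
    # Steps 1-2: list all subjects and give each a random number
    numbered = [(rng.random(), subject) for subject in subjects]
    # Step 3: sort by random number; lowest half gets A, highest half gets B
    numbered.sort(key=lambda pair: pair[0])
    half = len(numbered) // 2
    assignment = {}
    for i, (_, subject) in enumerate(numbered):
        assignment[subject] = "A" if i < half else "B"
    return assignment

# Hypothetical subject labels, for illustration only.
subjects = [f"S{i}" for i in range(1, 11)]
assignment = randomize_treatments(subjects, seed=42)
for subject in subjects:
    print(subject, assignment[subject])
```

Because the order depends only on the random numbers, the researcher's judgment plays no part in who gets which treatment.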

Page 43: Assignment #6

Experiment: individuals are randomly assigned to treatments

Page 44: Assignment #6

Examples of wrong ways to randomize

•  Treatment A to all patients at one clinic and B to all patients at second clinic

•  Assign treatments alphabetically

•  Haphazard assignment (researcher trying to be random)

Page 45: Assignment #6

Blinding

•  Concealing from the patient and/or experimenter which treatment is given to whom
    –  Single blind: the patient is blinded
    –  Double blind: both patient and experimenter are blinded

•  Unblinded studies usually find much larger effects (sometimes threefold higher), showing the bias that results from lack of blinding

Page 46: Assignment #6

Reducing sampling error

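Increasing the signal to noise ratio

$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

"Signal": the numerator, $\bar{Y}_1 - \bar{Y}_2$

"Noise": the denominator, $\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$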

Page 47: Assignment #6

Reducing sampling error Increasing the signal to noise ratio

If the "noise" is smaller, it is easier to detect a given "signal".

Can be achieved with smaller s or larger n.

$\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

Page 48: Assignment #6

Design features that reduce the effects of sampling error

•  Replication

•  Balance

•  Blocking

•  Extreme treatments

Page 49: Assignment #6

Replication

The application of every treatment to multiple, independent experimental units

Page 50: Assignment #6

Replication

Page 51: Assignment #6

Replication

$SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

Larger n reduces sampling error

Page 52: Assignment #6

What are experimental units?

•  Units that are randomly sampled and assigned treatments
    –  Single individuals
    –  Batches of individuals that are more similar to each other than to other batches (e.g. family)

•  Pseudoreplication (analyzing the data as if you had more independent experimental units than you actually have) causes underestimation of standard errors and P-values

Page 53: Assignment #6

Balance

In a balanced experimental design, all treatments have equal sample size.

Page 54: Assignment #6

Balance increases precision

$SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$

For a given total sample size (n1+n2), the standard error is smallest when n1=n2.

Page 55: Assignment #6

Balance increases precision

$n_1 + n_2 = 20$

$n_1 = 10,\ n_2 = 10$: $\frac{1}{n_1} + \frac{1}{n_2} = 0.2$

$n_1 = 19,\ n_2 = 1$: $\frac{1}{n_1} + \frac{1}{n_2} = 1.05$
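The arithmetic above can be checked for any split of the 20 units. The sketch below computes the term 1/n1 + 1/n2, which sets the size of the standard error for a given pooled variance. The function name is hypothetical.

```python
def allocation_term(n1, n2):
    # The part of the standard error that depends on the sample sizes
    return 1 / n1 + 1 / n2

total = 20
for n1 in (10, 15, 19):
    n2 = total - n1
    print(n1, n2, round(allocation_term(n1, n2), 2))

# Search every possible split of 20 units for the smallest term
best_n1 = min(range(1, total), key=lambda n1: allocation_term(n1, total - n1))
print("smallest term at n1 =", best_n1)
```

The balanced split (10 and 10) gives the smallest term, and the term grows quickly as the design becomes more unbalanced.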

Page 56: Assignment #6

Blocking

The grouping of experimental units that have similar properties.

Within each block, treatments are randomly assigned to experimental units.
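A minimal sketch of randomizing within blocks, assuming hospitals as blocks, patients as experimental units, and two treatments (C = control, T = treated). All names and labels are hypothetical.

```python
import random

def randomize_within_blocks(blocks, treatments, seed=None):
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # random order within this block only
        for i, unit in enumerate(shuffled):
            # Cycle through treatments so each appears equally often per block
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks and units, for illustration only.
hospitals = {
    "Hospital 1": ["p1", "p2", "p3", "p4"],
    "Hospital 2": ["p5", "p6", "p7", "p8"],
}
result = randomize_within_blocks(hospitals, ["C", "T"], seed=3)
for unit, (block, treatment) in result.items():
    print(block, unit, treatment)
```

Every block receives both treatments in equal numbers, so comparisons between treatments are made within blocks.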

Page 57: Assignment #6

Blocking accounts for extraneous variation

C = Control T = Treated

Variance among hospitals will not contribute to SE. Only variance within hospitals will contribute to "noise"

Page 58: Assignment #6

Paired design is an example of blocking

Treatment effects are measured by differences between treatments within pairs. This minimizes the influence of differences between pairs.

Page 59: Assignment #6

Randomized block design

Like a paired design but for more than two treatments.

Page 60: Assignment #6

Extreme Treatments

•  Treatment effects are easiest to detect when they are large.

•  Stronger treatments can increase the signal-to-noise ratio.

•  Caution: effects may not scale linearly

Page 61: Assignment #6

Experiments with more than one factor

A factor is a single treatment variable whose effects are of interest to the researcher.

Use multiple factors to:

•  Make more efficient use of money and resources

•  Estimate effects of interaction between factors

Page 62: Assignment #6

Interaction between explanatory variables

The effect of one variable depends on the state of a second variable

Page 63: Assignment #6

Factorial Design

•  Investigates all treatment combinations of two or more variables.

•  Can measure interactions between treatments
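As a rough illustration, the sketch below lists every treatment combination for two factors and measures an interaction as a difference between differences. The factors, levels, and mean responses are all hypothetical.

```python
from itertools import product

# Hypothetical factors and levels, for illustration only.
factors = {"fertilizer": ["none", "added"], "water": ["low", "high"]}

# In a factorial design, every combination of levels is one treatment.
treatments = list(product(*factors.values()))
print(treatments)

# Hypothetical mean response for each treatment combination.
cell_means = {
    ("none", "low"): 10.0,
    ("none", "high"): 12.0,
    ("added", "low"): 11.0,
    ("added", "high"): 19.0,
}

# Effect of adding fertilizer at each water level
effect_low = cell_means[("added", "low")] - cell_means[("none", "low")]
effect_high = cell_means[("added", "high")] - cell_means[("none", "high")]

# A nonzero difference means the fertilizer effect depends on water level
interaction = effect_high - effect_low
print(effect_low, effect_high, interaction)
```

With no interaction, the two effects would be equal and their difference would be zero.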

Page 64: Assignment #6

Example of factorial design and interaction

Page 65: Assignment #6

What if we can’t do experimental studies?

Observational studies are still useful to detect patterns and generate hypotheses

Best observational studies

Minimize bias:
•  Controls
•  Randomization
•  Blinding

Minimize sampling error:
•  Replication
•  Balance
•  Blocking
•  Extreme treatments

Page 66: Assignment #6

Matching

Every individual in the treatment group is paired with a control individual having the same or very similar values for the suspected confounding variables.

Unlike randomization, matching does not account for all confounding variables, only for those used to match participants.
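A minimal sketch of matching on a single suspected confounding variable (here, age). It uses a simple greedy nearest-value rule, which is an assumption made for illustration and not a method given in the slides. The individuals and ages are hypothetical.

```python
def match_controls(treated, controls):
    """Pair each treated individual with the closest unused control."""
    available = dict(controls)  # copy, so the caller's data is untouched
    pairs = []
    for treated_id, treated_value in treated.items():
        # Pick the control whose confounder value is nearest
        best = min(available, key=lambda c: abs(available[c] - treated_value))
        pairs.append((treated_id, best))
        del available[best]  # each control is used at most once
    return pairs

# Hypothetical ages, for illustration only.
treated = {"T1": 34, "T2": 51}
controls = {"C1": 50, "C2": 29, "C3": 36}
pairs = match_controls(treated, controls)
print(pairs)
```

The sketch needs at least as many controls as treated individuals, and it balances only the variable used for matching.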

Page 67: Assignment #6

In-class Exercise

Do people use more paper when they know it will be recycled?

•  People given paper and told to test scissors.

•  Recycling bin either present or not

No recycling bin: 4,4,4,4,4,4,4,5,8,9,9,9,9,12,12,13,14,14,14,14,15,23

Recycling bin: 4,5,8,8,8,9,9,9,12,14,14,15,16,19,23,28,40,43,129,130

1.  Make histograms and identify options for test

2.  Choose a test that you can do in class and conduct it