
The paired sample experiment

The paired t test

Frequently one is interested in comparing the effects of two treatments (drugs, etc.) on a response variable. The two treatments determine two different populations:

– Popn 1: cases treated with treatment 1.
– Popn 2: cases treated with treatment 2.

The response variable is assumed to have a normal distribution within each population, differing possibly in the mean (and also possibly in the variance).

Two independent sample design

A sample of n cases is selected from population 1 (cases receiving treatment 1) and a second sample of m cases is selected from population 2 (cases receiving treatment 2). The data:

– x1, x2, x3, …, xn from population 1.
– y1, y2, y3, …, ym from population 2.

The test used is the t-test for two independent samples.

The test statistic (if equal variances are assumed):

$$t = \frac{\bar{y} - \bar{x}}{s_{\text{Pooled}}\sqrt{\dfrac{1}{n} + \dfrac{1}{m}}}$$

where

$$s_{\text{Pooled}}^2 = \frac{(n-1)s_x^2 + (m-1)s_y^2}{n + m - 2}$$
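For illustration, here is a minimal Python sketch of this calculation (the two samples below are hypothetical numbers; scipy.stats.ttest_ind with equal_var=True computes the same statistic):

```python
import numpy as np
from scipy import stats

# Hypothetical responses under the two treatments
x = np.array([12.1, 9.8, 14.3, 11.0, 10.5])   # treatment 1, n = 5
y = np.array([13.4, 15.2, 12.9, 16.1])        # treatment 2, m = 4
n, m = len(x), len(y)

# Pooled variance: ((n-1)s_x^2 + (m-1)s_y^2) / (n + m - 2)
s2_pooled = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)

# Two-sample t statistic assuming equal variances, df = n + m - 2
t = (y.mean() - x.mean()) / np.sqrt(s2_pooled * (1 / n + 1 / m))
print(t, n + m - 2)

# scipy gives the same statistic plus a two-sided p-value
print(stats.ttest_ind(y, x, equal_var=True))
```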

The matched pair experimental design (the paired sample experiment)

Prior to assigning the treatments, the subjects are grouped into pairs of similar subjects.

Suppose that there are n such pairs (a total of 2n = n + n subjects or cases). The two treatments are then randomly assigned within each pair: one member of a pair receives treatment 1, while the other receives treatment 2. The data collected are as follows:

– (x1, y1), (x2, y2), (x3, y3), …, (xn, yn)

xi = the response for the case in pair i that receives treatment 1.

yi = the response for the case in pair i that receives treatment 2.

Let di = yi - xi. Then

d1, d2, d3, …, dn is a sample from a normal distribution with mean

$$\mu_d = \mu_2 - \mu_1,$$

variance

$$\sigma_d^2 = \sigma_x^2 + \sigma_y^2 - 2\,\mathrm{cov}(x, y) = \sigma_x^2 + \sigma_y^2 - 2\rho_{xy}\sigma_x\sigma_y,$$

and standard deviation

$$\sigma_d = \sqrt{\sigma_x^2 + \sigma_y^2 - 2\rho_{xy}\sigma_x\sigma_y}.$$

Note: if the x and y measurements are positively correlated (this will be true if the cases in the pair are matched effectively), then σd will be small.

Testing H0: μ1 = μ2 is equivalent to testing H0: μd = 0.

(we have converted the two sample problem into a single sample problem).

The test statistic is the single sample t-test on the differences

d1, d2, d3 , … , dn

namely

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}}, \qquad df = n - 1.$$

Example

We are interested in comparing the effectiveness of two methods for reducing high cholesterol.

The methods

1. Use of a drug.

2. Control of diet.

The 2n = 8 subjects were paired into 4 matched pairs.

In each matched pair one subject was given the drug treatment, the other subject was given the diet control treatment. Assignment of treatments was random.

The data (reduction in cholesterol after a 6-month period):

Pair                      1     2     3     4
Drug treatment          30.3  10.2  22.3  15.0
Diet control treatment  25.7   9.4  24.6   8.9

Differences:

Pair                      1     2     3     4
Drug treatment          30.3  10.2  22.3  15.0
Diet control treatment  25.7   9.4  24.6   8.9
di                       4.6   0.8  -2.3   6.1

Here $\bar{d} = 2.3$ and $s_d = 3.792$, so

$$t = \frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{2.3 - 0}{3.792/\sqrt{4}} = 1.213.$$

Since $|t| = 1.213 < t_{0.025} = 3.182$ for df = n - 1 = 3, we accept H0.
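A short Python sketch reproducing this calculation (scipy.stats.ttest_rel on the same four pairs gives the identical t statistic):

```python
import numpy as np
from scipy import stats

drug = np.array([30.3, 10.2, 22.3, 15.0])   # treatment 1
diet = np.array([25.7,  9.4, 24.6,  8.9])   # treatment 2

d = drug - diet                  # differences: 4.6, 0.8, -2.3, 6.1
n = len(d)

d_bar = d.mean()                 # 2.3
s_d = d.std(ddof=1)              # about 3.792

t = (d_bar - 0) / (s_d / np.sqrt(n))     # about 1.213
t_crit = stats.t.ppf(0.975, df=n - 1)    # about 3.182 for df = 3

print(t, t_crit)                 # |t| < t_crit, so H0 is not rejected
print(stats.ttest_rel(drug, diet))
```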

Nonparametric Statistical Methods

Many statistical procedures make assumptions

The t test and z test assume that the populations being sampled are normally distributed (true for both the one-sample and the two-sample tests).

For large sample sizes this assumption is not critical.

(Reason: The Central Limit Theorem)

The sample mean, and hence the statistic z, will have approximately a normal distribution for large sample sizes even if the population is not normal.

For small sample sizes the departure from the assumption of normality could affect the performance of a statistical procedure that assumes normality.

For testing, the probability of a type I error may not be the desired value of α = 0.05 or 0.01.

For confidence intervals, the probability of capturing the parameter may not be the desired value (95% or 99%) but a value considerably smaller.

Example: Consider the z-test

For α = 0.05 we reject the hypothesized value of the mean if z < -1.96 or z > 1.96, where

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \qquad (\bar{x} = \text{sample mean}).$$

Suppose the population is an exponential population with parameter λ (so that μ = 1/λ and σ = 1/λ).

[Figure: density of the actual (exponential) population compared with the assumed (normal) population.]

It can be shown (using moment generating functions) that the sampling distribution of $\bar{x}$ is a Gamma distribution with shape parameter $n$ and rate parameter $n\lambda$. It is not the normal distribution with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ that the z test assumes.

[Figure: sampling distribution of x̄ for n = 2: actual (Gamma) distribution vs. the distribution assuming normality.]

[Figure: sampling distribution of x̄ for n = 5: actual (Gamma) distribution vs. the distribution assuming normality.]

[Figure: sampling distribution of x̄ for n = 20: actual (Gamma) distribution vs. the distribution assuming normality.]
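The behaviour shown in these figures can also be checked by simulation; a minimal sketch, assuming an arbitrary value of λ, estimates the actual type I error of the nominal 5% z-test for each sample size:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1.0                       # exponential rate (arbitrary choice); mu = sigma = 1/lam
mu, sigma = 1 / lam, 1 / lam
reps = 200_000

for n in (2, 5, 20):
    x = rng.exponential(scale=1 / lam, size=(reps, n))
    z = (x.mean(axis=1) - mu) / (sigma / np.sqrt(n))
    # Proportion of samples where the nominal 5% two-tailed z-test rejects,
    # even though H0 (mean = mu) is true
    print(n, np.mean(np.abs(z) > 1.96))
```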

Definition: When the data are generated from a process (model) that is known except for a finite number of unknown parameters, the model is called a parametric model.

Otherwise, the model is called a non-parametric model

Statistical techniques that assume a non-parametric model are called non-parametric.

The sign test

A nonparametric test for the central location of a distribution

We want to test:

H0: median = μ0

against

HA: median ≠ μ0

(or against a one-sided alternative)

• The assumption will be only that the distribution of the observations is continuous.

• Note for symmetric distributions the mean and median are equal if the mean exists.

• For non-symmetric distributions, the median is probably a more appropriate measure of central location.

The Sign test:

1. The test statistic: S = the number of observations that exceed μ0.

Comment: If H0: median = μ0 is true, we would expect 50% of the observations to be above μ0 and 50% of the observations to be below μ0.

[Figure: under H0 the median equals μ0, with 50% of the distribution below μ0 and 50% above.]

If H0 is true then S will have a binomial distribution with p = 0.50, n = sample size.

If H0 is not true then S will still have a binomial distribution; however, p will not be equal to 0.50. Here p = the probability that an observation is greater than μ0:

– If μ0 > median, then p < 0.50.
– If μ0 < median, then p > 0.50.

[Figure: the Binomial(n = 10, p = 0.50) distribution.]

Summarizing: If H0 is true then S will have a binomial distribution with p = 0.50, n = sample size.

n = 10

 x    p(x)
 0   0.0010
 1   0.0098
 2   0.0439
 3   0.1172
 4   0.2051
 5   0.2461
 6   0.2051
 7   0.1172
 8   0.0439
 9   0.0098
10   0.0010
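This table can be reproduced directly from the binomial distribution, for example with scipy:

```python
from scipy.stats import binom

n, p = 10, 0.5
for x in range(n + 1):
    print(x, round(binom.pmf(x, n, p), 4))   # matches the p(x) column above
```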

The critical and acceptance region:


Choose the critical region so that α is close to 0.05 or 0.01.

e.g. If the critical region is {0, 1, 9, 10}, then α = 0.0010 + 0.0098 + 0.0098 + 0.0010 = 0.0216.


e.g. If the critical region is {0, 1, 2, 8, 9, 10}, then α = 0.0010 + 0.0098 + 0.0439 + 0.0439 + 0.0098 + 0.0010 = 0.1094.


• If one cannot choose a fixed critical region that achieves a desired significance level α, one then randomizes the choice of the critical region.

• In the example with n = 10, if the critical region is {0, 1, 9, 10} then α = 0.0010 + 0.0098 + 0.0098 + 0.0010 = 0.0216.

• If the values 2 and 8 are added to the critical region, the value of α increases to 0.0216 + 2(0.0439) = 0.0216 + 0.0878 = 0.1094.

• Note 0.05 = 0.0216 + 0.3235(0.0878). Consider the following critical region:

1. Reject H0 if the test statistic S is in {0, 1, 9, 10}.
2. If S is in {2, 8}, perform a success-failure experiment with p = P[success] = 0.3235. If the experiment is a success, reject H0.
3. Otherwise, accept H0.
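A sketch of this randomized decision rule in Python (the randomization probability 0.3235 is the value derived above; the region probabilities come from the Binomial(10, 0.5) table):

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng()

def randomized_sign_test(S):
    """Randomized critical region for n = 10, p = 0.5, targeting alpha = 0.05."""
    if S in (0, 1, 9, 10):
        return "reject H0"
    if S in (2, 8):
        # success-failure experiment with P[success] = 0.3235
        return "reject H0" if rng.random() < 0.3235 else "accept H0"
    return "accept H0"

# Achieved significance level: P({0,1,9,10}) + 0.3235 * P({2,8})
alpha = binom.pmf([0, 1, 9, 10], 10, 0.5).sum() + 0.3235 * binom.pmf([2, 8], 10, 0.5).sum()
print(alpha)   # approximately 0.05
```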

Example

Suppose that we are interested in determining if a new drug is effective in reducing cholesterol.

Hence we administer the drug to n = 10 patients with high cholesterol and measure the reduction.

The data (cholesterol):

Case  Initial  Final  Reduction
  1     240     228      12
  2     237     222      15
  3     264     262       2
  4     233     224       9
  5     236     240      -4
  6     234     237      -3
  7     264     264       0
  8     241     219      22
  9     261     252       9
 10     256     254       2

Let S = the number of negative reductions = 2


If H0 is true then S will have a binomial distribution with p = 0.50, n = 10.

 x    p(x)
 0   0.0010
 1   0.0098
 2   0.0439
 3   0.1172
 4   0.2051
 5   0.2461
 6   0.2051
 7   0.1172
 8   0.0439
 9   0.0098
10   0.0010

We would expect S to be small if H0 is false.

Choosing the critical region to be {0, 1, 2}, the probability of a type I error would be

α = 0.0010 + 0.0098 + 0.0439 = 0.0547.

Since S = 2 lies in this region, the null hypothesis should be rejected.

Conclusion: There is a significant positive reduction (α = 0.0547) in cholesterol.
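A minimal Python check of this example (S counts the negative reductions; under H0 it is Binomial(10, 0.5)):

```python
from scipy.stats import binom

reductions = [12, 15, 2, 9, -4, -3, 0, 22, 9, 2]
S = sum(r < 0 for r in reductions)           # S = 2

alpha = binom.pmf([0, 1, 2], 10, 0.5).sum()  # P(S in {0, 1, 2} | H0) = 0.0547
print(S, alpha)                               # S = 2 lies in the critical region, so reject H0
```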

If n is large we can use the normal approximation to the binomial. Namely, S has a binomial distribution with p = ½ and n = sample size. Hence for large n, S has approximately a normal distribution with

mean

$$\mu_S = np = \frac{n}{2}$$

and standard deviation

$$\sigma_S = \sqrt{npq} = \sqrt{n\left(\tfrac{1}{2}\right)\left(\tfrac{1}{2}\right)} = \frac{\sqrt{n}}{2}.$$

Hence for large n, use as the test statistic (in place of S)

$$z = \frac{S - \mu_S}{\sigma_S} = \frac{S - \frac{n}{2}}{\frac{\sqrt{n}}{2}}.$$

Choose the critical region for z from the Standard Normal distribution.

i.e. reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

(two-tailed; a one-tailed test can also be set up).
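A brief sketch of the large-sample version (the values of n and S here are hypothetical; the critical value comes from the standard normal):

```python
import numpy as np
from scipy.stats import norm

n, S = 100, 62                  # hypothetical sample size and count of observations above mu_0
mu_S = n / 2
sigma_S = np.sqrt(n) / 2

z = (S - mu_S) / sigma_S        # (62 - 50) / 5 = 2.4
z_crit = norm.ppf(0.975)        # 1.96 for alpha = 0.05, two-tailed

print(z, z_crit)                # |z| > 1.96, so H0 would be rejected
```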