Zimmerman 1997
A Note on Interpretation of the Paired-Samples t Test
Author(s): Donald W. Zimmerman
Source: Journal of Educational and Behavioral Statistics, Vol. 22, No. 3 (Autumn, 1997), pp. 349-360
Published by: American Educational Research Association and American Statistical Association
Stable URL: http://www.jstor.org/stable/1165289
Accessed: 25-12-2015 18:58 UTC
TEACHER'S CORNER

A Note on Interpretation of the Paired-Samples t Test

Donald W. Zimmerman
Carleton University

Keywords: correlated samples, difference scores, independent samples, matched pairs, nonindependence, paired samples, power, t test, Type I error, Type II error

Explanations of advantages and disadvantages of paired-samples experimental designs in textbooks in education and psychology frequently overlook the change in the Type I error probability which occurs when an independent-samples t test is performed on correlated observations. This alteration of the significance level can be extreme even if the correlation is small. By comparison, the loss of power of the paired-samples t test on difference scores due to reduction of degrees of freedom, which typically is emphasized, is relatively slight. Although paired-samples designs are appropriate and widely used when there is a natural correspondence or pairing of scores, researchers have not often considered the implications of undetected correlation between supposedly independent samples in the absence of explicit pairing.

Many experimental designs in education, psychology, and social sciences employ paired or matched observations. A familiar example is repeated measures on the same subjects over a period of time. Some significance tests of location, including the independent-samples Student t test, are not appropriate for these designs, because the measures usually are correlated rather than independent. Researchers typically analyze paired data using the paired-samples t test, which essentially is a one-sample Student t test performed on difference scores. Applied statisticians generally are aware of the advantages and disadvantages of this test.

First, the correlation associated with pairing or matching of observations reduces the standard error of the difference between means, so the error term differs from that of the independent-samples t test. This is apparent from the equation

$$\sigma^2_{\bar{X}-\bar{Y}} = \sigma^2_{\bar{X}} + \sigma^2_{\bar{Y}} - 2\rho\,\sigma_{\bar{X}}\,\sigma_{\bar{Y}}.$$

The correlation term reduces the variance of the difference between means and increases the t ratio. In the context of interval estimation, the reduced standard error results in a narrower confidence interval. For this reason, an experimental design involving paired observations can more accurately detect differences in which a researcher is interested. Similar logic applies to within-subjects ANOVA as opposed to independent-groups ANOVA.

This research was supported by a Carleton University research grant. A listing of the computer program, written in Turbo BASIC, Version 1.0 (Borland, Inc.), can be obtained by writing to the author at 15078 Eagle Place, Surrey, BC V3R 4W2, Canada.
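
As a rough numerical illustration of the variance equation above, the following Python sketch computes the standard error of the difference between two means for several values of the correlation. The function name, the unit standard deviations, and the choice of n = 20 are assumptions made only for this example; they are not taken from the article's program.

```python
import math


def se_of_mean_difference(sigma_x, sigma_y, rho, n):
    """Standard error of the difference between two sample means of size n.

    sigma_x, sigma_y: population standard deviations of the two measures.
    rho: correlation between paired observations (0 for independent samples).
    """
    se_x = sigma_x / math.sqrt(n)  # standard error of the mean of X
    se_y = sigma_y / math.sqrt(n)  # standard error of the mean of Y
    # Variance of the difference, including the correlation term.
    variance = se_x ** 2 + se_y ** 2 - 2.0 * rho * se_x * se_y
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical settings: unit standard deviations, 20 observations per mean.
    for rho in (-0.5, 0.0, 0.2, 0.5):
        se = se_of_mean_difference(1.0, 1.0, rho, 20)
        print(f"rho = {rho:+.2f}  standard error = {se:.4f}")
```

A positive correlation shrinks the standard error relative to the uncorrelated case, and a negative correlation enlarges it, which is the behavior the equation describes.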

Second, this gain is partly offset by a loss of degrees of freedom. The one-sample t statistic based on n pairs is evaluated at n - 1 degrees of freedom, while the two-sample t is evaluated at 2n - 2 degrees of freedom. Therefore, authors emphasize that the paired-samples t test is preferable if the two groups are highly correlated, while the independent-samples t test is the better choice if they are uncorrelated or only slightly correlated. Authors usually do not advise explicit matching or pairing of subjects in an experimental design and subsequent use of a paired-samples t test, unless this procedure produces a substantial correlation. For example, Kurtz (1965) summarized the thinking of many investigators as follows.

The advantage of pairing is seen to depend on the closeness of the relationship established between the two sets of observations as a result of pairing. If a sufficiently high relationship is established, the reduction of the variance of the difference more than compensates for the degrees of freedom lost as a result of pairing; if only a low correlation is established, the gains resulting from reduction of the variance of the difference may be more than offset by the loss of degrees of freedom. (p. 213)

More recently, Hays (1988) wrote,

Such matching may be less efficient than the comparison of unmatched random groups, unless the factor used in matching introduces a relatively strong positive relationship between the means. Although a positive relationship, reflected in a positive covariance term, does reduce the standard error of the difference, this procedure also halves the number of degrees of freedom. Dealing with a sample of N pairs gives only [...] groups of N cases each. Thus, if the factor entering into the matching is only slightly relevant to the differences between the groups or is even irrelevant to such differences, matching is not a desirable procedure. (p. 315)

And Edwards (1979) noted that

the average value of the covariance must be sufficiently large to offset the fact that for the same number of observations, MS_ST will have fewer degrees of freedom than MS_W and will thus require a larger value of F for significance. (p. 128)

See also introductory textbooks by Howell (1987, pp. 204-206), Loether and McTavish (1993, p. 554), and Pagano (1986, pp. 301-304). These recommendations are typical of many authors, although the relative emphasis placed on reduction of the standard error and reduction of degrees of freedom varies from one text to another.

The simulations in the present study reveal that this advice must be qualified and that pairing sometimes is associated with a large difference in the efficiency of the two significance tests when the correlation is quite small. In the case of naturally paired data, even correlations of .10, .15, or .20 make the paired-samples t test mandatory in order to protect against distortion of the significance level. These conclusions are based on examination of power functions as well as degrees of freedom and Type I errors.

The present study also compares the two tests from another point of view and focuses attention on an aspect of the problem which has been overlooked. The comparison of the two procedures frequently made in textbooks fails to take account of an important effect: Nonindependence of observations depresses both Type I error probabilities and the power of the test to detect differences. In other words, a correlation between samples that are believed to be independent compromises not only the efficiency but also the validity of the significance test. Furthermore, the change that occurs is quite large.

Many years ago, Cochran (1974), Scheffé (1959), Walsh (1947), and others discovered that violation of the independence assumption underlying the t and F tests distorts Type I and Type II error probabilities. (See also a recent study by Zimmerman, Williams, & Zumbo, 1993.) However, investigators have not considered these results in the context of paired-samples experimental designs. The present note examines some implications of nonindependence of observations, as investigated in these studies, for interpretation of the paired-samples t statistic.

Paired Data and Nonindependence of Observations

A simulation study consisted of performing independent-samples Student t tests and paired-samples t tests on samples from a normal population. Although it is possible to calculate the power of these tests analytically, a comparison of the two tests is not possible without taking into consideration the change in Type I error probabilities discussed above. In the present study, a computer algorithm induced correlations ranging from -.50 to .50 by adding a multiple of one random variable to each of two other random variables, the multiplicative constant being chosen to produce the desired correlation coefficient. The algorithm generated N(0, 1) normal deviates by the method of Box and Muller (1958), based on the transformation $X = (-2 \log U_1)^{1/2} \cos 2\pi U_2$, where $U_1$ and $U_2$ are uniformly distributed pseudorandom numbers on the interval (0, 1).
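
The generation step described above might be sketched as follows. This is a minimal illustration, not the author's Turbo BASIC program: the function names are hypothetical, and the specific constant c = sqrt(|rho| / (1 - |rho|)), the sign flip used to obtain negative correlations, and the rescaling to unit variance are assumptions about how "a multiple of one random variable" could be chosen to give the desired correlation.

```python
import math
import random


def box_muller(rng):
    """Return one N(0, 1) deviate via X = sqrt(-2 ln U1) * cos(2 pi U2)."""
    u1 = 1.0 - rng.random()  # rng.random() lies in [0, 1); shift to (0, 1] so the log is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


def correlated_pair(rho, rng):
    """Return (x, y), each N(0, 1), with population correlation rho (|rho| < 1).

    Assumed construction: a shared deviate z0 is added, with weight c, to two
    independent deviates z1 and z2. With c = sqrt(|rho| / (1 - |rho|)) the
    correlation is c^2 / (1 + c^2) = |rho|; flipping the sign of the shared
    term in y makes the correlation negative.
    """
    z0, z1, z2 = box_muller(rng), box_muller(rng), box_muller(rng)
    c = math.sqrt(abs(rho) / (1.0 - abs(rho)))
    sign = 1.0 if rho >= 0 else -1.0
    scale = math.sqrt(1.0 + c * c)  # rescale so each variable keeps unit variance
    return (z1 + c * z0) / scale, (z2 + sign * c * z0) / scale


if __name__ == "__main__":
    rng = random.Random(0)
    print([correlated_pair(0.3, rng) for _ in range(3)])
```

Dividing by sqrt(1 + c^2) is a design choice for this sketch: it keeps both variables at unit variance so that mean shifts expressed in standard-deviation units retain their meaning.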

In successive replications, constants were added to all scores in one group in increments of .5σ, 1.25σ, or 1.5σ in order to determine both Type I and Type II errors. Sample sizes ranged from 10 to 80. The study performed both one-tailed and two-tailed tests at the .05 significance level. Each data point represents 10,000 replications of the sampling procedure and subsequent significance tests. The purpose of the simulations was to illustrate the arguments in the present note, and they were not intended to be an exhaustive study of properties of the t test.
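
A scaled-down version of such a simulation could look like the sketch below. It is not the original program and makes several assumptions: n = 10 pairs, two-tailed tests, the critical values 2.101 (18 degrees of freedom) and 2.262 (9 degrees of freedom) quoted later in the article, 4,000 replications instead of 10,000, Python's built-in `random.gauss` in place of the Box-Muller transformation, and an assumed shared-component construction for the correlation. All function names are hypothetical.

```python
import math
import random

T_CRIT_INDEP = 2.101   # two-tailed .05 critical value for 18 df (2n - 2 with n = 10)
T_CRIT_PAIRED = 2.262  # two-tailed .05 critical value for 9 df (n - 1 with n = 10)


def mean_var(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def correlated_samples(n, rho, shift, rng):
    """Draw n pairs of unit-variance normals with correlation rho; add shift to y."""
    c = math.sqrt(abs(rho) / (1.0 - abs(rho)))
    sign = 1.0 if rho >= 0 else -1.0
    scale = math.sqrt(1.0 + c * c)
    xs, ys = [], []
    for _ in range(n):
        z0, z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append((z1 + c * z0) / scale)
        ys.append((z2 + sign * c * z0) / scale + shift)
    return xs, ys


def t_independent(xs, ys):
    """Pooled-variance two-sample t statistic for equal sample sizes."""
    n = len(xs)
    mx, vx = mean_var(xs)
    my, vy = mean_var(ys)
    pooled = (vx + vy) / 2.0
    return (my - mx) / math.sqrt(pooled * 2.0 / n)


def t_paired(xs, ys):
    """One-sample t statistic computed on the difference scores."""
    md, vd = mean_var([y - x for x, y in zip(xs, ys)])
    return md / math.sqrt(vd / len(xs))


def rejection_rates(rho, shift=0.0, n=10, reps=4000, seed=1997):
    """Estimate rejection probabilities (independent, paired) at the .05 level."""
    rng = random.Random(seed)
    rej_ind = rej_pair = 0
    for _ in range(reps):
        xs, ys = correlated_samples(n, rho, shift, rng)
        rej_ind += abs(t_independent(xs, ys)) > T_CRIT_INDEP
        rej_pair += abs(t_paired(xs, ys)) > T_CRIT_PAIRED
    return rej_ind / reps, rej_pair / reps


if __name__ == "__main__":
    for rho in (-0.4, 0.0, 0.2, 0.4):
        ind, pair = rejection_rates(rho)
        print(f"rho = {rho:+.1f}  independent = {ind:.3f}  paired = {pair:.3f}")
```

With `shift=0.0` the estimated rates are Type I error probabilities. A nonzero `shift` gives rejection probabilities under a false null hypothesis, but the scaling of the shift in this sketch is not claimed to match the units used in the article's figures and tables.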

Two Concomitant Effects: Failure to Maintain the Significance Level and Reduction of the Power of the Test

First, consider the two curves in the lower section of Figure 1, which plots the probability of rejecting the null hypothesis as a function of the correlation between paired observations, for both the independent-samples test and the paired-samples test, when the null hypothesis is false. There were 20 pairs of observations, and the difference between the means of the two populations was 1.5σ. It is apparent that the efficiency of the paired-samples test increases systematically, while that of the independent-samples t test decreases, as the correlation increases from -.50 to .50. When the correlation is zero, the independent-samples t test is slightly more powerful than the paired-samples t test. This result is consistent with our previous discussion, although investigators do not usually consider negative correlations in the present context.

Examination of the upper section of Figure 1, again based on 20 pairs of observations, reveals a somewhat different pattern. In the simulations represented in this graph, there were no differences between population means, so that the curves represent the probabilities of Type I errors. The paired-samples test maintains the probability close to the .05 significance level despite the increasing correlation. The independent-samples t test, however, exhibits a rather large change as the correlation increases. Even a correlation of only .10 or .20 has a substantial influence on this test. Because of this change in the Type I error probability, the values plotted for the independent-samples t test in the lower section of Figure 1 cannot be interpreted as the power of the test. Consequently, the values are not comparable to those of the paired-samples t test.

Implications of the alteration of the significance level are further illustrated by Figure 2. The upper section of the figure shows power functions of both tests. In this graph, there are 20 pairs of scores, the correlation is zero, and the difference between means increases from 0 to 4.5σ in increments of .5σ. Apparently, the independent-samples t test is slightly more powerful than the paired-samples t test. The difference between the two curves is accounted for by the fact that the paired-samples test is based on 9 degrees of freedom (critical value of t of 2.262), while the independent-samples t test is based on 18 degrees of freedom (critical value of t of 2.101).

In the data plotted in the lower section, the correlation between paired observations is .30. In this case, the paired-samples test dominates the independent-samples test. However, the Type I error probability of the independent-samples t test declines to .023, while that of the paired-samples t test remains close to .05. For this reason, the two power curves are not comparable. Similarly, in the lower section of Figure 1, one cannot conclude that the independent-samples t test is preferable for negative correlations, because of the large difference in Type I error probabilities exhibited in the upper section.

The third curve in Figure 2, labeled adjusted, represents the paired-samples test performed at the .023 significance level. This adjustment of the significance level to allow for the change in Type I error probability makes the two functions comparable. The modified curve remains slightly below that of the independent-samples test over the entire range of differences between means. This slight disparity of the two curves apparently reflects differences in degrees of freedom. It is evident from these figures that even a moderate correlation between observations has a stronger influence on the probability of Type II errors and power than does reduction of degrees of freedom from 18 to 9.

FIGURE 1. Probability of rejecting H0 by the independent-samples t test and the paired-samples t test as a function of correlation (two panels, n = 20; horizontal axis: Correlation, -0.5 to 0.5; curves: t-Independent, t-Paired).
Note. The difference between population means is zero in the upper section and 1.5σ in the lower section.

FIGURE 2. Probability of rejecting H0 by the independent-samples t test, the paired-samples t test, and the paired-samples t test with an adjusted significance level as a function of the difference between means (increments of .5σ) (two panels, n = 20, ρ = 0 and n = 20, ρ = .30; horizontal axis: Difference in Standard Units, 0 to 9; curves: t-Independent, t-Paired, t-Adjusted).
Note. The correlation is zero in the upper section and .30 in the lower section.

Table 1 provides simulation data for sample sizes of 10, 20, 40, and 80 for both one-tailed and two-tailed tests. The difference between means increased in increments of 1.25σ. It is evident that depression of the Type I error probability of the independent-samples t test occurs consistently for all sample sizes examined. Furthermore, the relative advantage of the paired-samples t test for correlated samples is apparent for all sample sizes.

Conclusions

Inspection of Figures 1 and 2 and Table 1 certainly confirms the widespread belief among researchers and applied statisticians that one should substitute the paired-samples t test for the independent-samples t test whenever subjects are coupled or matched in some way in an experimental design. The magnitude of the effect produced by slight correlations probably is greater than most researchers realize. The present results disclose that even a correlation of .10 or .20 seriously distorts the significance level of the t statistic based on independent samples. When power functions are examined, it is apparent that advantages of the paired-samples t test are not negligible for small correlations and are exceptional for correlations as high as .40 or .50.

We now examine the problem from another point of view. In making comparisons in the present context, one can ask two distinct questions. The first question is, What gain in efficiency results from using a matched-pairs experimental design instead of an independent-samples design, if matching induces a correlation? The answer to this question is found by comparing the curve representing the paired-samples test in the lower section of Figure 1 with the horizontal broken line. The line represents a constant probability of .308, which is the power of the independent-samples t test when the correlation is zero. This comparison makes it clear that the advantage of the paired-samples design becomes greater as the correlation increases from 0 to .50, and that the advantage is quite large for higher correlations. This outcome is consistent with the usual interpretation of the two tests. Of course, the amount of gain depends on the parameters chosen for this particular example. The figure also reveals that a negative correlation results in a loss rather than a gain.

TABLE 1
Probability of rejecting H0 by independent-samples t test and paired-samples t test for various numbers of pairs (n) and degrees of correlation between samples (ρ): one-tailed and two-tailed tests

                ρ = 0                 ρ = .20               ρ = .40
  n   D   t indep.  t paired    t indep.  t paired    t indep.  t paired

One-tailed tests
 10   0     .049      .050        .035      .051        .022      .051
      1     .297      .283        .284      .331        .249      .391
      2     .716      .683        .734      .766        .762      .860
      3     .952      .937        .964      .968        .980      .993
 20   0     .048      .047        .034      .052        .019      .051
      1     .295      .285        .288      .345        .261      .417
      2     .724      .711        .748      .797        .786      .891
      3     .957      .951        .977      .984        .988      .997
 40   0     .052      .051        .034      .051        .019      .050
      1     .309      .304        .288      .350        .258      .426
      2     .744      .735        .755      .807        .792      .898
      3     .962      .961        .977      .986        .988      .996
 80   0     .050      .050        .033      .048        .017      .051
      1     .317      .314        .279      .346        .267      .435
      2     .743      .736        .761      .812        .791      .899
      3     .962      .958        .978      .986        .989      .997

Two-tailed tests
 10   0     .051      .050        .030      .048        .016      .051
      1     .199      .182        .170      .211        .149      .270
      2     .583      .531        .601      .634        .609      .758
      3     .899      .863        .926      .931        .946      .976
 20   0     .050      .048        .031      .051        .014      .049
      1     .195      .185        .175      .229        .145      .288
      2     .603      .580        .622      .684        .636      .798
      3     .921      .903        .941      .953        .961      .988
 40   0     .051      .051        .030      .050        .013      .052
      1     .218      .215        .178      .238        .143      .302
      2     .630      .620        .636      .710        .652      .820
      3     .930      .923        .944      .963        .966      .991
 80   0     .051      .051        .030      .049        .013      .051
      1     .215      .210        .185      .246        .149      .316
      2     .627      .617        .649      .719        .672      .834
      3     .926      .922        .952      .969        .972      .992

Note. Differences are in units of 1.25σ.

A second question is, What loss occurs if one performs the independent-samples t test inappropriately on measures which are correlated? This question is somewhat more complicated, but it has significant practical applications. The answer can be found by inspecting the two curves (open circles and filled circles) in the lower section of Figure 1. These curves reveal that the difference in the probabilities of rejecting the null hypothesis for the two tests becomes larger as the correlation increases. The paired-samples t test dominates when a positive correlation exceeds about .05, and the independent-samples t test dominates when the correlation is negative or zero. As mentioned earlier, however, the upper section of Figure 1 discloses that the differences in the lower section cannot be interpreted as differences in the power of the test, because the Type I error probability changes as the correlation changes. The two sections of this figure together suggest that a correlation spuriously elevates or depresses the entire power function of the independent-samples test. Instead of referring to power differences, one must state simply that nonindependence compromises the validity of the test and makes the power to detect differences uninterpretable. Although explicit matching is efficient only for positive correlations, this spurious alteration of the significance level occurs for both positive and negative correlations.
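
One way to see why the significance level moves in both directions is a large-sample normal approximation, sketched below. This approximation is not part of the article; it is an assumption introduced here for illustration. The independent-samples statistic divides by a standard error that presumes a variance of 2σ²/n for the mean difference, whereas with correlation ρ and equal variances the true value is 2σ²(1 - ρ)/n, so under the null hypothesis the statistic behaves roughly like sqrt(1 - ρ) times a standard normal deviate. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_type1_independent(rho, alpha=0.05):
    """Approximate two-tailed Type I error rate of the independent-samples test
    when the samples actually have correlation rho (large-sample, equal variances)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # The statistic is about sqrt(1 - rho) * Z, so it exceeds z_crit only when
    # |Z| exceeds z_crit / sqrt(1 - rho).
    effective_crit = z_crit / math.sqrt(1.0 - rho)
    return 2.0 * (1.0 - std_normal.cdf(effective_crit))


if __name__ == "__main__":
    for rho in (-0.5, -0.2, 0.0, 0.2, 0.4):
        print(f"rho = {rho:+.1f}  approximate Type I error = {approx_type1_independent(rho):.3f}")
```

Under this approximation a positive correlation pushes the rate below the nominal .05 and a negative correlation pushes it above, which agrees in direction with the simulation results reported in the article. The exact values for small samples would differ somewhat.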

Authors have not often asked the second question, even though it has practical implications for research. Perhaps an experimenter is unaware of some incidental pairing which induces a correlation between measures of a dependent variable. In other words, a researcher may believe samples to be independent when in reality they are correlated, although perhaps only slightly. Violation of random assignment of subjects to experimental treatments is one possible source of such a correlation, which can invalidate the independent-samples test. Another source was identified and studied by Coren and Hakstian (1990) and by Zumbo (1996). These investigators examined designs in which each subject contributes two scores to the data pool, for example, measures of two eyes, two ears, and so on, in perceptual research. Researchers sometimes analyze this kind of data as if all measures are independent, ignoring the correlation induced by pairing. This kind of violation is sometimes difficult to detect in otherwise well-designed experiments and probably occurs more often in research studies than is generally realized. Undoubtedly, it can markedly influence the significance level and the probability of rejecting H0. For this reason, the hazards of inappropriately using an independent-samples t test probably are more serious than the loss of degrees of freedom resulting from using a paired-samples t test when it is not required.

Sometimes researchers fail to identify negative correlations or overlook the fact that negative correlations in paired data have effects quite different from positive correlations (see Figure 1). It is apparent from the equation presented earlier that they result in wider confidence intervals and decreased sensitivity of the paired-samples design. A negative relationship between naturally paired subjects is conceivable in some practical research contexts. For example, Hays (1988, p. 314) suggested that measures of personality dominance of husband-wife pairs could be negatively correlated if highly dominant women are paired with men having low dominance ratings. Matching on the basis of husband-wife pairs therefore could elevate the probability of Type I errors of the independent-samples test and at the same time reduce the power of the paired-samples t test, as indicated in Figure 1. One can envision other negative relationships of this sort in education, psychology, and the social sciences. Many of these relationships may be quite difficult to detect, although some can be avoided easily in experimental designs.

The example in Table 2 illustrates some of the practical implications of these conclusions. Suppose a researcher believes that the scores in the two left-hand columns, labeled X and Y, comprise independent samples. A Student t test fails to reject H0 at the .05 significance level. Now, assume that there exists an unknown correspondence of scores as indicated in the next three columns. The second X and Y columns are permutations of the two left-hand columns with the hidden pairing now displayed. In fact, these scores are computer-generated samples from a population in which the correlation between X and Y was .10 and the difference between population means was 4.65. The sample correlation turned out to be .139. Despite this relatively small correlation, which many investigators might consider insignificant, a paired-samples test now rejects H0 at the .05 significance level.
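
The two analyses of the Table 2 scores can be checked with a short Python sketch. The scores are the paired X and Y columns of Table 2; the helper names are hypothetical, and the critical values 2.074 (22 degrees of freedom) and 2.201 (11 degrees of freedom) are standard two-tailed .05 values from a t table rather than figures stated in the article.

```python
import math

# Paired scores from Table 2 (pairs 1 to 12).
X = [17, 25, 16, 24, 43, 18, 34, 25, 36, 34, 32, 29]
Y = [45, 35, 27, 34, 46, 30, 29, 39, 23, 43, 34, 28]

T_CRIT_DF22 = 2.074  # two-tailed .05 critical value, 22 df (from a standard t table)
T_CRIT_DF11 = 2.201  # two-tailed .05 critical value, 11 df (from a standard t table)


def mean_var(values):
    """Return the sample mean and the unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def t_independent(xs, ys):
    """Pooled-variance two-sample t statistic for equal sample sizes."""
    n = len(xs)
    mx, vx = mean_var(xs)
    my, vy = mean_var(ys)
    return (my - mx) / math.sqrt((vx + vy) / 2.0 * 2.0 / n)


def t_paired(xs, ys):
    """One-sample t statistic on the difference scores D = Y - X."""
    md, vd = mean_var([y - x for x, y in zip(xs, ys)])
    return md / math.sqrt(vd / len(xs))


def pearson_r(xs, ys):
    """Sample correlation between the paired scores."""
    mx, vx = mean_var(xs)
    my, vy = mean_var(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    ti, tp = t_independent(X, Y), t_paired(X, Y)
    print(f"r = {pearson_r(X, Y):.3f}")
    print(f"independent: t = {ti:.3f}, df = 22, reject = {abs(ti) > T_CRIT_DF22}")
    print(f"paired:      t = {tp:.3f}, df = 11, reject = {abs(tp) > T_CRIT_DF11}")
```

Because the independent-samples statistic depends only on the two sets of scores and not on how they are matched, it is the same whether it is computed from the left-hand columns or from the paired columns. Only the paired statistic uses the pairing.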

Let us now look at the same data from another point of view. Suppose an experimenter is aware of the pairing indicated in the table, but believes the small correlation to be unimportant and performs an independent-samples test in order to take advantage of more degrees of freedom. The result is failure to reject H0, although a paired-samples test would have a different outcome. If the existence of pairing or matching is known, this kind of oversight is not likely to occur and can be corrected easily. However, it is impossible to know from most

TABLE 2
Example of a design in which initially there is an undetected correspondence of values

  t indep.           t paired
  X    Y      Pair    X    Y    D = Y - X
 25   34        1    17   45       28
 32   39        2    25   35       10
 43   34        3    16   27       11
 16   30        4    24   34       10
 34   35        5    43   46        3
 25   46        6    18   30       12
 17   23        7    34   29       -5
 18   27        8    25   39       14
 29   43        9    36   23      -13
 24   45       10    34   43        9
 34   28       11    32   34        2
 36   29       12    29   28       -1

Note. An independent-samples Student t test was first performed without consideration of possible pairing of scores. Then, pairing was recognized, and a one-sample Student t test (i.e., a paired-samples t test) was performed on difference scores. Independent: t = 2.052, df = 22, p > .05. Paired: t = 2.210, df = 11, p