A Balancing Act: Common Items Nonequivalent Groups (CING) Equating Item Selection
Tia Sukin
Jennifer Dunn
Wonsuk Kim
Robert Keller
July 24, 2009
Background
Equating using a CING design requires the creation of an anchor set
Angoff (1968) developed guidelines for developing the anchor set:
  Length: 20% of operational test (OT) or 20 items
  Content: Proportionate to OT by strand
  Statistical Properties: Same mean / S.D.
  Contextual Effects: Same locations, formats, key, etc.
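The length and content guidelines above can be sketched as a small allocation routine. This is only an illustration: the strand names and counts are hypothetical, the anchor length is taken as 20% of the OT, and the largest-remainder rounding rule is an assumption made here so that the allocation sums to the anchor length (the guidelines do not specify a rounding rule).

```python
from fractions import Fraction


def proportional_anchor(strand_counts, anchor_len):
    """Allocate anchor items to strands in proportion to operational-test counts."""
    total = sum(strand_counts.values())
    # Exact (fractional) share of the anchor for each strand.
    exact = {s: Fraction(c * anchor_len, total) for s, c in strand_counts.items()}
    # Start from the whole-number part of each share ...
    alloc = {s: int(x) for s, x in exact.items()}
    # ... then give leftover items to the largest fractional remainders
    # (assumed rounding rule, not part of the guidelines).
    leftover = anchor_len - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


# Hypothetical 60-item operational test with four strands.
ot = {"Number Sense": 18, "Algebra": 15, "Geometry": 15, "Data": 12}
anchor_len = round(0.20 * sum(ot.values()))  # 20% of 60 items
print(proportional_anchor(ot, anchor_len))
```

With these hypothetical counts, the strand with the largest fractional share receives the one leftover item.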
Majority of the research provides support for these guidelines (e.g., Vale et al., 1981; Klein & Jarjoura, 1985; Kingston & Dorans, 1984)
Research has included robustness studies (e.g., Wingersky & Lord, 1984; Beguin, 2002; Sinharay & Holland, 2007)
Most research has used placement (e.g., AP), admissions (e.g., SAT), and military (e.g., ASVAB) exams for empirical and informed simulation studies
Research using statewide accountability exams is limited (e.g., Haertel, 2004; Michaelides & Haertel, 2004)
General Science tests are administered in all states for all grade levels except:
  19 states offer EOC Science exams in H.S.
  10 offer more than one EOC Science exam
  5 offer more than two
Research Questions
Do the long-established guidelines for maintaining content representation (i.e., proportion by number) hold in creating an anchor set across all major subject areas (i.e., Mathematics, Reading, Science)?
Are there significant changes between expected raw scores and proficiency classification when different methods for maintaining content representation are used?
Design
3 Subjects (2 States, 3 Grades)
  Math
  Reading
  Science
5 Methods of Anchor Set Construction
  Operational
  Proportion by Number of Items/Strand
  G Theory
  ICCs
  Construct Underrepresentation
Variance Calculation – G Theory
Multivariate Design: p x i with content strand as a fixed facet
Multivariate Benefit: Covariance components are calculated for every pair of strands
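Item Variance Component
Covariance between strands v and v':
$$S_{vv'}(p) = \frac{n_p}{n_p - 1}\left(\frac{\sum_p \bar{X}_{pv}\,\bar{X}_{pv'}}{n_p} - \bar{X}_v\,\bar{X}_{v'}\right)$$
Item variance component:
$$\hat{\sigma}^2(i) = \frac{MS(i) - MS(pi)}{n_p}$$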
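As a rough sketch of the item variance component for a single strand, the function below computes the usual p x i mean squares and returns (MS(i) - MS(pi)) / n_p. The score matrix is hypothetical (rows are persons, columns are items), and the code covers only the univariate piece for one strand, not the full multivariate analysis.

```python
def item_variance_component(scores):
    """Estimate the item variance component for a crossed p x i design.

    scores: list of rows, one per person, each holding that person's item scores.
    """
    n_p = len(scores)
    n_i = len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_i)
    person_means = [sum(row) / n_i for row in scores]
    item_means = [sum(row[i] for row in scores) / n_p for i in range(n_i)]

    # Sum of squares for items and for the person-by-item residual.
    ss_i = n_p * sum((m - grand) ** 2 for m in item_means)
    ss_pi = sum(
        (scores[p][i] - person_means[p] - item_means[i] + grand) ** 2
        for p in range(n_p)
        for i in range(n_i)
    )

    ms_i = ss_i / (n_i - 1)
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    return (ms_i - ms_pi) / n_p


# Hypothetical dichotomous responses: 5 persons by 4 items in one strand.
strand_scores = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
print(item_variance_component(strand_scores))
```

Running this once per strand gives one variability estimate per strand. With small samples the estimate can come out negative, which is a known property of this estimator.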
Variance Calculation – ICC
Use the median P(θ) as the average in calculating within strand variability
[Figure: item characteristic curves, P(θ) plotted against θ]
$$\hat{\sigma}^2(i) = \frac{1}{n - 1}\sum (X - \bar{X})^2$$
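One possible reading of the ICC-based calculation is sketched below: at a given θ, take each item's P(θ), use the median of those values in place of the mean, and divide the sum of squared deviations by n - 1. The 2PL response function, the item parameters, and the choice to evaluate at a single θ are all assumptions made for illustration. The slides do not state the IRT model or how θ is handled.

```python
import math
from statistics import median


def icc_2pl(theta, a, b):
    """2PL item characteristic curve (assumed model, for illustration only)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def within_strand_variability(item_params, theta):
    """Spread of P(theta) across a strand's items, centered on the median."""
    probs = [icc_2pl(theta, a, b) for a, b in item_params]
    center = median(probs)  # median P(theta) used as the "average"
    n = len(probs)
    return sum((p - center) ** 2 for p in probs) / (n - 1)


# Hypothetical (a, b) parameters for the items in one strand.
strand_items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
print(within_strand_variability(strand_items, theta=0.0))
```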
Equating Item Selection
$$n_v = \frac{\sigma_v}{\sum_v \sigma_v} \times n$$
Example:
$$\frac{.120}{.301} \times 15 = 5.980 \approx 6$$
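The example arithmetic (.120 / .301 x 15 = 5.980, or about 6 items) can be reproduced with a short allocation function. Only the .120, the .301 total, and the 15-item anchor come from the slide. The strand labels and the other two variability values are hypothetical, chosen so that the total is .301.

```python
def items_per_strand(strand_variability, anchor_len):
    """Share of the anchor for each strand, proportional to its variability."""
    total = sum(strand_variability.values())
    return {s: v / total * anchor_len for s, v in strand_variability.items()}


# "A" uses the slide's value; "B" and "C" are hypothetical fillers.
variability = {"A": 0.120, "B": 0.101, "C": 0.080}
shares = items_per_strand(variability, anchor_len=15)

for strand, share in shares.items():
    print(f"{strand}: {share:.3f} -> {round(share)} items")
```

Rounding each share independently happens to sum to 15 here, but that is not guaranteed in general, so a real allocation would need a rule for distributing leftover items.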
Equating Item Selection
Percentage of strands that differ by more than one item between selection methods (excluding the construct underrepresentation method):
  Math: 13%
  Reading: 52%
  Science: 20%
Example Results – Scoring Category Distributions

Subject   Anchor Method   Below Basic   Basic   Proficient   Advanced
MATH      Operational     0.15          0.24    0.34         0.27
MATH      Proportional    0.17          0.24    0.36         0.24
MATH      ICC             0.17          0.24    0.36         0.24
MATH      G-Theory        0.17          0.24    0.36         0.24
MATH      Strand          0.17          0.24    0.36         0.24
READING   Operational     0.05          0.34    0.46         0.15
READING   Proportional    0.06          0.38    0.41         0.15
READING   ICC             0.06          0.38    0.45         0.11
READING   G-Theory        0.06          0.38    0.45         0.11
READING   Strand          0.05          0.34    0.46         0.15
SCIENCE   Operational     0.24          0.34    0.24         0.18
SCIENCE   Proportional    0.26          0.34    0.24         0.16
SCIENCE   ICC             0.26          0.34    0.24         0.16
SCIENCE   G-Theory        0.26          0.34    0.24         0.16
SCIENCE   Strand          0.24          0.34    0.24         0.18
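To make the size of the shifts easier to see, the sketch below subtracts the Operational row from each other method's row, using the Reading proportions from the table. Treating Operational as the baseline is a choice made here for illustration, not a step described in the slides.

```python
# Reading proportions: Below Basic, Basic, Proficient, Advanced.
reading = {
    "Operational": [0.05, 0.34, 0.46, 0.15],
    "Proportional": [0.06, 0.38, 0.41, 0.15],
    "ICC": [0.06, 0.38, 0.45, 0.11],
    "G-Theory": [0.06, 0.38, 0.45, 0.11],
    "Strand": [0.05, 0.34, 0.46, 0.15],
}


def shifts_from_baseline(distributions, baseline="Operational"):
    """Per-category difference between each method and the baseline method."""
    base = distributions[baseline]
    return {
        method: [round(p - b, 2) for p, b in zip(props, base)]
        for method, props in distributions.items()
        if method != baseline
    }


for method, shift in shifts_from_baseline(reading).items():
    print(method, shift)
```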
[Figure: State_A MAT 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
04      0.97     0.97          0.96
08      1        1             1
10      0.96     0.98          0.97
[Figure: State_B MAT 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
05      0.99     0.99          0.99
08      1        1             0.95
10      1        1             1
[Figure: State_A REA 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
04      0.99     1             0.89
08      0.95     0.95          0.97
10      1        1             1
[Figure: State_B REA 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
05      0.96     0.96          1
08      0.96     0.96          0.94
10      1        1             1
[Figure: State_A SCI 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
04      0.99     0.99          1
08      0.94     0.94          0.96
10      1        1             0.92
[Figure: State_B SCI 2008-2009. E(Raw Score) Residual for the Icc, G-theory, and Strand methods, one panel per grade; horizontal axis from -4 to 4, vertical axis from -2 to 2.]
Grade   DC_icc   DC_g-theory   DC_strand
05      1        1             1
08      1        1             1
10      0.94     0.94          0.94
Discussion
Equating is highly robust to the selection process used for creating anchor sets, EXCEPT:
  Choosing equating items from 1-2 strands is discouraged
  More caution may be needed with Science
Item selection mattered for 22% of the conditions
  2/18 for Math: Both were the under rep. condition
  3/18 for Reading: All were the under rep. condition
  7/18 for Science: 2 under rep. / 5 ICC and G
Content balance is important and can be conceptualized in different ways without impacting the equating
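The 22% figure follows from the per-subject counts (2, 3, and 7 conditions out of 18 each). A quick check of that arithmetic, with the function name chosen here purely for illustration:

```python
def share_affected(affected_by_subject, conditions_per_subject):
    """Return (affected, total, proportion) pooled across subjects."""
    affected = sum(affected_by_subject.values())
    total = conditions_per_subject * len(affected_by_subject)
    return affected, total, affected / total


counts = {"Math": 2, "Reading": 3, "Science": 7}
affected, total, share = share_affected(counts, conditions_per_subject=18)
print(f"{affected}/{total} = {share:.0%}")
```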
Future Study
A simulation study is needed so that raw score and proficiency categorizations using the different item selection methods can be compared to truth
Meta-analysis detailing published & unpublished studies that provide evidence for or against the robustness of CING equating designs
Thank you