MDE / OEAA
Un-distorting Measures of Growth: Alternatives to Traditional Vertical Scales
Presentation on June 19, 2005 to the 25th Annual CCSSO Conference on Large-Scale Assessment
By Joseph A. Martineau, Psychometrician
Office of Educational Assessment & Accountability (OEAA)
Michigan Department of Education (MDE)
Introduction
• Measurement of growth or “progress”
  – Growth models
• Measurement of educators’ contributions to student growth or progress
  – Value Added Models (VAM)
• Both require vertical scales that
  – Measure the “same thing” along the entire scale
  – Have the same meaning along the entire scale
Distortions in studies of growth
• Using traditional vertical scales to measure growth can result in the following distortions:
  – Identification of growth trajectories with little resemblance to true growth trajectories
  – Attribution of effects on growth to effects on initial status and vice versa
  – Identification of false effects on initial status or growth
  – Failure to detect true effects on initial status or growth
  – Identification of effective interventions as harmful and vice versa
Graphical demonstration of one kind of distortion in growth models
Grade 5 scale mostly measures differences in number sense
Grade 6 scale mostly measures differences in algebra
[Figure, Panel A: Unequated scales. Axes: number sense, algebra. Labeled: unequated grade-5 scale; unequated grade-6 scale.]
Graphical demonstration of one kind of distortion in growth models
Vertically “equated, unidimensional” scales have to bend to accommodate both the grade-5 and grade-6 content mixes
The data can still appear to fit a unidimensional model when number sense and algebra scores are strongly correlated, but strong correlations do not alleviate distortions in measures of growth
[Figure, Panel B: Vertically "equated" scale. Axes: number sense, algebra. Labeled: vertically "equated" grade-5/6 scale.]
Graphical demonstration of one kind of distortion in growth models
Any given student’s true achievement may not lie near the vertical scale, so the vertical scale may be incapable of accurately representing student achievement
[Figure, Panel C: True achievement. Axes: number sense, algebra. Labeled: true grade-5 achievement; true grade-6 achievement.]
Graphical demonstration of one kind of distortion in growth models
Therefore, the true multidimensional achievement of a student becomes projected onto the “unidimensional” vertical scale
[Figure, Panel D: Projection onto vertical scale. Axes: number sense, algebra. Labeled: projection of true grade-5 achievement onto vertical scale; projection of true grade-6 achievement onto vertical scale.]
Graphical demonstration of one kind of distortion in growth models
The nearest point on the “unidimensional” vertical scale is the most likely estimate of “unidimensional” student ability
[Figure, Panel E: Estimated achievement. Axes: number sense, algebra. Labeled: estimated grade-5 achievement; estimated grade-6 achievement.]
Graphical demonstration of one kind of distortion in growth models
The true measure of growth and the “unidimensional” measure of growth are remarkably different
The distortion can be overestimation of growth (as shown here) or underestimation of growth
This can have remarkable effects on studies of growth
[Figure, Panel F: True and estimated growth. Axes: number sense, algebra. Labeled: true growth; estimated growth.]
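The sequence in Panels A through F can be mimicked numerically. What follows is a minimal sketch under stated assumptions, not the presenter's method: the two-segment "bent" scale, every coordinate, and the function name `nearest_scale_score` are hypothetical. It takes a student's scale score to be the arc length to the nearest point on the scale (as in Panel E) and takes true growth to be the straight-line distance between the two true achievement points.

```python
import math

def nearest_scale_score(point, scale_path):
    """Arc length along the bent scale to the point on it nearest `point`."""
    best_distance, best_score = math.inf, 0.0
    travelled = 0.0
    for a, b in zip(scale_path, scale_path[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        seg_len = math.hypot(dx, dy)
        # Fraction of the way along segment a->b, clamped to stay on the segment.
        t = ((point[0] - a[0]) * dx + (point[1] - a[1]) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        nearest = (a[0] + t * dx, a[1] + t * dy)
        distance = math.dist(point, nearest)
        if distance < best_distance:
            best_distance, best_score = distance, travelled + t * seg_len
        travelled += seg_len
    return best_score

# Hypothetical scale in (number sense, algebra) coordinates: the grade-5
# segment runs along number sense, the grade-6 segment along algebra.
scale = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0)]

# Made-up true achievement for one student; neither point lies on the scale.
true_grade5 = (3.0, 0.5)
true_grade6 = (3.5, 1.5)

true_growth = math.dist(true_grade5, true_grade6)
estimated_growth = (nearest_scale_score(true_grade6, scale)
                    - nearest_scale_score(true_grade5, scale))

print(f"true growth      = {true_growth:.2f}")
print(f"estimated growth = {estimated_growth:.2f}")
```

With these made-up coordinates the estimated growth (2.5 scale units) is more than double the straight-line growth (about 1.12), which corresponds to the overestimation case in Panel F. Moving the true points to the other side of the bend can produce underestimation instead.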
Distortions in studies of value added
• Using traditional vertical scales to measure educators’ contributions to student growth can result in the following distortions:
  – Mis-estimation of educator effectiveness simply because educators serve students whose growth is occurring outside the range measured well by the test
  – Attribution of prior educators’ effectiveness to later educators
    • One promise of value added is to cease to hold educators accountable for prior experiences of students
    • This distortion betrays that promise
Graphical demonstration of one kind of distortion in value added models
Grade 5 scale mostly measures differences in number sense
Grade 6 scale mostly measures differences in algebra
Scale has to “bend” to accommodate both tests’ content
[Figure, Panel A: Vertically "equated" scale. Axes: number sense, algebra. Labeled: vertically "equated" grade-5/6 scale.]
Graphical demonstration of one kind of distortion in value added models
True average statewide scores are likely to lie close to (but not on) the vertical scale
[Figure, Panel B: Average statewide true scores. Axes: number sense, algebra. Labeled: true grade-5 statewide average achievement; true grade-6 statewide average achievement.]
Graphical demonstration of one kind of distortion in value added models
Individual school (or teacher) average true scores are likely to lie farther off the vertical scale than statewide averages
Individual school (or teacher) average true scores are likely to be quite different than the statewide averages
[Figure, Panel C: Average school-X true scores. Axes: number sense, algebra. Labeled: true grade-5 average achievement in school X; true grade-6 average achievement in school X.]
Graphical demonstration of one kind of distortion in value added models
In this carefully chosen scenario, both the statewide averages and the average scores of a given school project onto the vertical scale at exactly the same place
[Figure, Panel D: Projection onto vertical scale. Axes: number sense, algebra. Labeled: projections of the averages onto the vertical scale (two labels).]
Graphical demonstration of one kind of distortion in value added models
Even though statewide and school averages are very different in two dimensions, they are estimated to be identical on the “unidimensional” score scale.
[Figure, Panel E: Estimated achievement. Axes: number sense, algebra. Labeled: estimated grade-5 average achievement, both statewide and for school X; estimated grade-6 average achievement, both statewide and for school X.]
Graphical demonstration of one kind of distortion in value added models
Average statewide growth is overestimated and average school-X growth is underestimated, so the two estimates come out equal
In a vertical-scale-based value added model, this exceptionally effective school would be identified as average
Overestimation of individual school effectiveness can also result from the distortions
[Figure, Panel F: True and estimated growth. Axes: number sense, algebra. Labeled: true average school-X growth; true average statewide growth; estimated growth, both statewide and for school X.]
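The "carefully chosen scenario" of Panels A through F can also be reproduced with numbers. This is an illustrative sketch only: the bent scale, the statewide and school-X averages, and the helper `nearest_scale_score` are all hypothetical, and true growth is assumed to be the straight-line distance between grade-5 and grade-6 averages in the two-dimensional space.

```python
import math

def nearest_scale_score(point, scale_path):
    """Arc length along the bent scale to the point on it nearest `point`."""
    best_distance, best_score = math.inf, 0.0
    travelled = 0.0
    for a, b in zip(scale_path, scale_path[1:]):
        dx, dy = b[0] - a[0], b[1] - a[1]
        seg_len = math.hypot(dx, dy)
        # Clamp the projection so it stays on the current segment.
        t = ((point[0] - a[0]) * dx + (point[1] - a[1]) * dy) / seg_len ** 2
        t = max(0.0, min(1.0, t))
        distance = math.dist(point, (a[0] + t * dx, a[1] + t * dy))
        if distance < best_distance:
            best_distance, best_score = distance, travelled + t * seg_len
        travelled += seg_len
    return best_score

# Hypothetical bent scale in (number sense, algebra) coordinates.
scale = [(0.0, 1.0), (4.0, 1.0), (4.0, 5.0)]

# Made-up true averages: (grade-5 point, grade-6 point).
averages = {
    "statewide": ((3.0, 1.5), (3.5, 2.5)),
    "school X": ((3.0, 0.5), (5.0, 2.5)),
}

results = {}
for name, (grade5, grade6) in averages.items():
    true_growth = math.dist(grade5, grade6)
    estimated = nearest_scale_score(grade6, scale) - nearest_scale_score(grade5, scale)
    results[name] = (true_growth, estimated)
    print(f"{name}: true growth {true_growth:.2f}, estimated growth {estimated:.2f}")
```

In this constructed case both groups receive the same estimated growth, while the school's straight-line growth is larger than the estimate and the statewide growth is smaller. A value added model fed only the scale scores would therefore have no way to distinguish the school from the state average.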
Graphical Demonstration
• Table 1 on page 13 of the document
• Interpretation
  – Effect size of 0.00 is equivalent to 1 part truth, no parts distortion
  – Effect size of 0.25 is equivalent to 4 parts truth, 1 part distortion
  – Effect size of 1.00 is equivalent to the results of VAM being 1 part truth, 1 part distortion
  – Effect size of 2.00 is equivalent to 1 part truth, 2 parts distortion
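The four interpretations listed above follow a single pattern: the effect size reads as the number of parts distortion per one part truth. The sketch below assumes that pattern holds for other non-negative effect sizes as well, which the slide does not state; the function names are hypothetical.

```python
from fractions import Fraction

def truth_and_distortion_parts(effect_size):
    """Return (truth_parts, distortion_parts) in lowest whole-number terms.

    Assumes the effect size equals distortion divided by truth, which is
    the pattern in the four values listed on the slide.
    """
    ratio = Fraction(str(effect_size))
    return ratio.denominator, ratio.numerator

def distortion_share(effect_size):
    """Fraction of the total (truth plus distortion) that is distortion."""
    ratio = Fraction(str(effect_size))
    return ratio / (1 + ratio)

for d in ("0.00", "0.25", "1.00", "2.00"):
    truth, distortion = truth_and_distortion_parts(d)
    print(f"effect size {d}: {truth} part(s) truth, {distortion} part(s) distortion, "
          f"distortion share {float(distortion_share(d)):.2f}")
```

Read this way, an effect size of 0.25 already means one fifth of the result is distortion, and an effect size of 2.00 means two thirds is.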
Alternatives to Traditional Vertical Scales
• Given that using vertical scales in growth-based statistical models results in distorted outcomes, where do we go from here?
• Michigan has investigated several alternatives
  – Vertically Moderated Standard Setting
  – Domain-Referenced Measurement of Growth
  – Link only adjacent grades
  – Provide stronger out-of-level content representation as vertical linking items
    • Matrix sampling
    • Large number of forms
• All of these are important to do, but are insufficient to resolve the distortions arising from using vertical scales in growth-based models
Alternatives to Traditional Vertical Scales
• Michigan is investigating other alternatives
  – Additional testing
    • Fall and Spring
    • More than twice per year
    • Eliminates summer loss/gain problem
    • Completely eliminates distortions!
  – Yeah, whatever!
Alternatives to Traditional Vertical Scales
• Michigan is investigating other alternatives
  – Supplement grade-level content with substantial quantities of out-of-level items
    • Items like those on lower grade-level tests
    • Items like those on higher grade-level tests
    • Could be done either by paper-and-pencil (P&P) or computer-based testing (CBT)
    • Implementing with computerized adaptive testing (CAT)
      – Would require little additional testing because out-of-level items could inform the stopping rules
      – May not work with No Child Left Behind (NCLB)
Alternatives to Traditional Vertical Scales
• Michigan is investigating other alternatives
  – Supplement grade-level content with substantial quantities of out-of-level items
    • Provides for less precise estimates of growth, but they should at least be undistorted
    • Administer items like those on lower and/or higher grade-level tests
    • Could be done either by P&P or CBT
    • Implementing with CAT
      – Would require little additional testing because out-of-level items could inform the stopping rules
      – May not work for NCLB because of on-grade-level requirements
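One way to see why out-of-level items could keep an adaptive test short is to look at a precision-based stopping rule. The sketch below is an assumption-laden illustration, not a description of any Michigan procedure: it uses the Rasch model, treats the student's ability `theta` as known (a real CAT re-estimates it after each response), stops when the standard error reaches a hypothetical target of 0.5, and uses made-up item pools.

```python
import math

def rasch_information(theta, difficulty):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)

def items_until_stop(theta, pool, target_se=0.5):
    """Count items given before the standard error falls to target_se.

    Items are given most-informative-first at the (assumed known) ability.
    Returns None if the pool runs out before the stopping rule is met.
    """
    ranked = sorted(pool, key=lambda b: rasch_information(theta, b), reverse=True)
    total_info = 0.0
    for count, difficulty in enumerate(ranked, start=1):
        total_info += rasch_information(theta, difficulty)
        if 1.0 / math.sqrt(total_info) <= target_se:
            return count
    return None

# Hypothetical pools on a logit scale centered on grade-level difficulty.
on_grade_pool = [-1.0 + 2.0 * i / 199 for i in range(200)]
out_of_level_items = [1.0 + 2.0 * i / 99 for i in range(100)]

theta = 2.5  # a student working well above grade level
n_on_grade = items_until_stop(theta, on_grade_pool)
n_with_out_of_level = items_until_stop(theta, on_grade_pool + out_of_level_items)

print("items needed, on-grade pool only:       ", n_on_grade)
print("items needed, with out-of-level items:  ", n_with_out_of_level)
```

Under these assumptions the student far from grade level reaches the target precision with fewer items once items matched to that ability are available, because each such item carries more information.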
Alternatives to Traditional Vertical Scales
• Michigan is investigating other alternatives
  – More complex psychometric models
    • Without changing the administration model, the only way to address the distortions is to change the psychometric model
    • The psychometric model needs to acknowledge and exploit the multidimensional complexity of item response data
    • Multidimensional models can be a liability as well
      – Public relations (complexity of the model)
      – Possibility for error (complexity of the model)
      – Turnaround time (intensity of the analysis)
    • This area is promising as well as challenging
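The slides do not say which multidimensional model is under consideration. As one hedged illustration, the sketch below uses a compensatory two-dimensional logistic item model, a common form in the multidimensional item response theory literature. The item parameters, student profiles, and function name are hypothetical.

```python
import math

def m2pl_probability(theta, discriminations, intercept=0.0):
    """P(correct) under a compensatory multidimensional logistic item model.

    theta and discriminations are same-length sequences, here ordered as
    (number sense, algebra).
    """
    logit = sum(a * t for a, t in zip(discriminations, theta)) + intercept
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical items: one loads mostly on number sense, one mostly on algebra.
grade5_item = (1.2, 0.3)
grade6_item = (0.3, 1.2)

# Two made-up students with opposite strengths.
student_a = (1.0, -1.0)   # strong number sense, weak algebra
student_b = (-1.0, 1.0)   # weak number sense, strong algebra

for label, item in (("grade-5 item", grade5_item), ("grade-6 item", grade6_item)):
    p_a = m2pl_probability(student_a, item)
    p_b = m2pl_probability(student_b, item)
    print(f"{label}: P(correct) student A = {p_a:.2f}, student B = {p_b:.2f}")
```

The two items rank the two students in opposite orders, which a single score per student cannot reproduce. A model with one ability per dimension can, at the cost of the added complexity listed above.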
Conclusion
• Growth-based statistical models using vertically scaled student achievement data are much further along than they were several years ago
• Growth-based statistical models using vertically scaled student achievement data are still not robust enough to support high-stakes use
• Either the test administration model or the psychometric model needs to reflect the complexity of the intended analyses
• No existing methods have been proven to allow for high-stakes use of growth-based statistical models, including Value Added Models
Contact Information
Joseph Martineau, Psychometrician
Office of Educational Assessment & Accountability
Michigan Department of Education
P.O. Box 30008
Lansing, MI 48909
(517) [email protected]