Dramatically Better Assessment Systems: Advice for RTTT “Common Assessment” RFP
Brian Gong
Center for Assessment
Presentation for the Input Meetings Sponsored by the U.S. Department of Education for the “Common Assessment” RFP, “Race to the Top” funding
November 17, 2009
Atlanta, GA

Gong – USED Common Assessment RFP Input Mtg – 11/17/09
My Main Point
The future of assessment in the United States will be shaped by what gets funded in this “Common Assessment” RFP.
USED should shape the RFP and fund it with a longer-term view of having in place dramatically better assessment systems in ten years.
  When USED has to compromise, choose longer-term investments over short-term gains
  Say very clearly what you want in the RFP
  Help foster good responses to the RFP

Personal recommendations
Hedge bets by funding multiple ways to do multi-state common assessment, especially high school
Invest in six “game changers” that could make assessment dramatically better within a decade, but should not be framed as being operationally implemented on the short time schedule (“2012”)
Help foster good responses to the RFP and after

Short-term and Longer-term Investments
Common Assessment RFP should fund
  For implementation by 2012, what we already know how to do in large-scale assessment but
    With a new set of content standards
    With groups of multiple states (difficult to do)
  For development through 2015, what we do not know how to do well at scale, but which has potential to lead to dramatically better assessment systems

Implementing a new multi-state summative assessment takes years
Timeline: 2009 2010 2011 2012 2013 2014 2015
  50 state systems, NAEP, TIMSS, PISA, PIRLS, many LEA systems, NRTs, ACT/SAT, colleges’ tests, etc.
  Award RFP(s) (9/2009)
  2009-10: Test specifications; develop items; use specs, reports, equating design, administration agreements, etc.
  2010-11: Pilot test items, promulgate high-stakes policies, etc.
  2011-12: First operational administration and reporting, etc.
  2012-13: Second operational administration; first report using growth, etc.
  2014-15: Fourth operational administration; first graduating high school class, etc.
Fast Implementation of RFP: 2012 (e.g., multi-state assessments with common content standards, “Peer Review” quality of things we know how to do)
And aligning curriculum, instruction, accountability, and supports takes longer.

RFP: Specify, Specify, Specify
USED should specify its purpose, theory of action, and how the assessment results will be used so responders know the big picture
Specify what is wanted as a deliverable and set the parameters for responders’ creative proposals (e.g., time schedule)
Specify the means by which an outcome should be achieved if USED really wants a specific means

Some Model Systems for 2012
Cross-state comparisons
Standards-based interpretation
Inform better instruction
Rapid turn-around
Measure growth
Measure student performance for teacher/administrator evaluation

Cross-state Comparisons (2012)
Purpose, TOA, Use: Hold students, schools, LEAs, and states accountable to a common performance standard by triggering sanctions
Outcome: Statistically robust reports of performance on common metric with no “wiggle room” – stronger than current NAEP mapping studies
Means: Same content standards, same test specifications, same performance standards, single assessment across states, same administration procedures, strong equating across years

Standards-based Interpretation (2012)
Purpose, TOA, Use: Promote equity through holding students and schools to common opportunity-to-learn (content standards) and minimal performance standards
Outcome: Valid reports of performance related to the designated standards
Means: Aligned, grade-level only (?), matrix-sampled (?), high school (?), SWD (?), ELL (?)

Inform Better Instruction (2012)
Purpose, TOA, Use: Assess more complex and applied learning (monitor); model and encourage instruction (drive)
Outcome: Incrementally better, more valid and reliable measurement of higher-order, complex student performances (?); more widespread “good” instruction (?)
Means: Curriculum-embedded assessments (e.g., standardized units, portfolios, graduation projects) (?); curricula with (local) matched assessments (?)

Rapid Turn-around (2012)
Purpose, TOA, Use: Promote improvement through rapid feedback to inform actions
Outcome: Reports of performance useful to decisions and actions, in appropriate timeframe (distinguish actions that are multi-year or annual monitoring from annual rich content analysis from shorter-term uses, down to course grades and student instructional feedback)
Means: Trade-off speed for quality, cost: greater reliance on multiple-choice/machine-scored; trade-off centralized standardization for complex performances, local scoring; ignore administration variations (e.g., missing students)

Measure Growth (2012)
Purpose, TOA, Use: Accountability, program improvement, teacher accountability?
Outcome: Report of student progress over time related to what is/could be/should be: grade-level standards (?), own starting point (?), other students (?), program supports (?), “teacher’s contribution” (?); how to use in accountability (?)
Means: Out-of-level testing (?), adaptive testing (?), vertical [moderated] scales (?), use math to predict reading for greater reliability (?), pre- post-measures within year (?)

Teacher/administrator evaluation (2012)
Purpose, TOA, Use: Improve teacher quality by providing feedback (?); use in accountability or other high-stakes decisions (?)
Outcome: Changes in student performance associated with (attributable to ?) specific teachers, administrators, programs
Means: many statistical approaches (check assumptions, limitations) (?); combine with other information (?)
Personal recommendations
Hedge bets on multiple ways to do multi-state common assessment, especially high school
Invest in six “game changers” that could make assessment dramatically better within a decade, but should not be framed as being operationally implemented on the short time schedule (“2012”)
Help foster good responses to the RFP and after

Invest in “Game Changers” - 1
Develop technology that provides evidence of more complex knowledge and skills (i.e., more valid)
  E.g., interactive simulations, non-academic knowledge and skills
  Only use technology with an evidence-centered design approach to maintain construct relevance, most students

Invest in “Game Changers” - 2
Develop technology for validity
Develop complex performance assessment
  Specify extended learning and content, real application contexts, student choice
  Develop credible (local) administration and scoring
  Include all students (and teachers)
  Develop means of certifying validity and reliability, and of combining with other evidence

Invest in “Game Changers” - 3
Develop technology for validity
Develop complex performance assessment
Develop curricula that specify “what” and “how” of learning, and associated local assessment systems
  Interim and formative assessments are needed to inform learning directly
  Real assessment problem is informing “What should be done next?” – cannot solve without curriculum and teacher/administrator expertise

Invest in “Game Changers” - 4
Develop technology for validity
Develop complex performance assessment
Develop curricula, local assessment systems
Develop new measurement models and technical criteria for assessments of complex knowledge and skills
  We know current models’ assumptions and limitations; do not impose on innovations! (Example: reliability vs. validity of complex performances; cognitive vs. unidimensional models)

Invest in “Game Changers” - 5
Develop technology for validity
Develop complex performance assessment
Develop curricula, comprehensive assessment systems
Develop new measurement models and technical criteria
Develop better accountability models and support better use of assessment results for program improvement
  Assessments, assessment use, and instruction are being distorted by our current accountability model

Invest in “Game Changers” - 6
Develop technology for validity
Develop complex performance assessment
Develop curricula, local assessment systems
Develop new measurement models and technical criteria
Develop better models of accountability and program improvement
Develop model specifications for a coherent comprehensive assessment system that incorporates above five
  e.g., NAEP, state,

Invest in “Game Changers” - 7
Technology for validity
Complex performance assessment
Curricula & comprehensive assessment systems
New measurement models and technical criteria
Better accountability models and support for program improvement

Personal Recommendation - 2
Invest in six assessment “game changers”
Hedge bets on multiple ways to do multi-state “2012” common assessment, especially high school
  Good current models: all MC, mixed MC-CR, computer-based, end-of-course, survey, etc.
  Interwoven with state policies (e.g., high school exit requirements)
Help foster good responses to the RFP and after

Hedge bets on 2012 assessment
End of course AND Grade 11 survey
Computer-based AND paper & pencil
All multiple choice AND modest short CR AND larger amount and more extensive CR
Fund multiple “common content standards”
  To find out costs and benefits of multi-state common assessments
  Because no one set of content standards is clearly superior
  Because no one approach is clearly superior
  Because reporting on a common score metric is less important

RFP Portfolio of Awards
Multiple (around 8) strong models that represent advances that can be implemented strongly by 2012 and that help get to the longer-term goal
  Consider strategy: Do not fund strong models that will be adopted even if not funded
Multiple (perhaps 12) strong “game changer” awards

Personal Recommendation - 3
Invest in six assessment “game changers”
Hedge bets on multiple ways to do multi-state common assessment, especially high school
Help foster good responses to the RFP and after
  If USED wants certain outcomes of states working together, then promote leadership to make that happen among states, NGOs, test vendors, etc.

Fostering Strong RFP Responses
Provide clear RFP specs and different awards for “2012 implementation” and “game changers”
If USED wants states to have vendor partners in their RFP responses, need to indicate that early and facilitate it well (vs. states’ issuing an RFP)
USED should think about what states who don’t get RTTT common assessment funds will do
USED should think how what it funds will be adopted after RTTT and how that will shape what is available in the future

Envision Intended & Unintended Consequences
What if in 2012 there were five widely used assessments, all aligned to the same common content standards
  Four were commercially available from current test publishers (like the Achieve/Pearson Algebra 2 end-of-course exam)
  One was available by joining a consortium (like the WIDA ELP exams)
  States were purchasing elementary math from one vendor and high school English from another vendor
What if there were only one assessment being used? What if there were 46?

Envision Intended & Unintended Consequences – 2
What if in 2012 each commercially available assessment came in five versions:
  An all multiple-choice, computer-administered short form that took 20 minutes and cost $3 per student
  An all multiple-choice, computer or paper & pencil form that took 50 minutes and cost $7 per student
  A computer or p & p version that took 120 minutes, had 40 multiple choice, 8 short constructed response, and 4 extended constructed response items and cost $15 per student
  A computer or p & p version that took 150 minutes, had 40 multiple choice, 4 extended constructed response, and 2 long constructed response items and cost $60 per student
  A version that included a standardized test like option 3 and had a curriculum-embedded project and other performance evidence that was centrally audited and cost $200 per student

For more information:
Center for Assessment
www.nciea.org
Brian Gong
[email protected]