State Assessments Today: What State Are We In?

Presentation to:

Conference on Educational Testing in America: State Assessment, Achievement Gaps, Federal Policy and Innovations

Date: Monday, September 8, 2008

Presenter: Lauress L. Wise

State Assessments Today

State assessments have changed dramatically over the past decade:

– Increased use for school/district accountability required under No Child Left Behind

– Greater use for graduation/promotion decisions

– Greater attention to achievement gaps through subgroup reporting requirements

There have been notable improvements, but also many remaining challenges.

– This talk will cover both recent improvements and remaining challenges in assessing progress and gaps in educational achievement

Perspective

It has been exciting to have a front-row seat:

– As a recovering test developer (ASVAB)

– As a technical advisor for several states

– As an evaluator of the impact of a graduation test

– Working for a company that does independent checks of test contractor work in many states

– As a participant in updating Standards for Educational and Psychological Testing

Improvement: (1) Public Discussion of Test Content Specifications

• Previously, test developers decided what to test in the privacy of their own offices, usually based on a review of textbooks

• Today, policy-makers in each state must first agree on grade- and subject-specific expectations for student achievement

– Tests are developed to assess the degree to which students meet these expectations

– To meet peer review, states and the developers must demonstrate alignment of test content to these public content standards

Agreement on What to Teach Has to be a Significant Step Forward

You've got to be very careful if you don't know where you're going, because you might not get there.

Improvement: (2) Discussion and Scrutiny of Test Validity Information

• The NCLB Peer Review Process requires detailed and timely information on:

– Test content and performance standards development

– Test score reliability

– The validity of test scores as indicators of mastery of targeted content (alignment)

• By contrast, until recently, the availability of technical documentation on NAEP assessments was 7 years in arrears

Improvement: (3) Reporting and Use of Assessment Results

• Greater visibility of results

– School, district, and state results generally on the web within 6 months

– Results for individual students reported back to teachers, parents, and the students themselves

• Standards-based reporting

– Percent of students meeting expectations

– Not just an abstract number or comparison to other students (percentiles)

– Supports policy review of whether overall student achievement is good enough

Improvement: (4) Attention to Achievement Gaps

• State and national assessment results are now reported separately for key demographic groups

• States, districts, and schools are accountable for eliminating gaps in percentages of students judged to be proficient

• NAEP has shown decreases in achievement gaps, particularly in mathematics

Progress in Reducing Achievement Gaps

[Figure: National Achievement Gap Trends (NAEP Grade 4 Results). Vertical axis: NAEP mean score difference (0 to 40). Horizontal axis: assessment year (2000, 2003, 2005, 2007). Series: White-Hispanic: Reading; White-Hispanic: Math; White-Black: Reading; White-Black: Math.]

Challenge (1): Controlling Cost and Quality (So many tests, so little time)

• Each state’s assessments must be “custom-made” to match its unique content standards

– High test development costs difficult for many states

– Threats to test quality as state and contractor personnel are stretched thin across many different tests

• Do we really need 49 different tests of 5th grade mathematics achievement?

– It would be 51, but RI, NH, and VT have agreed to common standards and a common assessment

• State differences in expectations for student achievement raise equity as well as cost and quality questions.

Percent Proficient on State Assessment is Linked to Where the Proficiency Cut is Set

Source: NCES 2007-482

[Figure: State Proficiency Cut Scores: Grade 4 Reading. Horizontal axis: state proficiency cut-offs mapped onto the NAEP scale (160 to 240). Vertical axis: percent proficient (2005), 0% to 100%. Reference lines: NAEP Basic, NAEP Proficient. Labeled states: MA, WY, TN, MS, SC.]

NAEP results: MS, 18% proficient; MA, 44% proficient
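
The link between the cut score and the percent proficient can be illustrated with a minimal sketch. It assumes, purely for illustration, that student scores follow a normal distribution; the mean and standard deviation below are hypothetical values and are not taken from NCES 2007-482 or from any state's data.

```python
from statistics import NormalDist

# Hypothetical score distribution on a NAEP-like scale.
# These two values are illustrative assumptions, not reported statistics.
ASSUMED_MEAN = 220.0
ASSUMED_SD = 35.0


def percent_proficient(cut_score: float,
                       mean: float = ASSUMED_MEAN,
                       sd: float = ASSUMED_SD) -> float:
    """Percent of students at or above cut_score under a normal model."""
    share_below = NormalDist(mu=mean, sigma=sd).cdf(cut_score)
    return 100.0 * (1.0 - share_below)


if __name__ == "__main__":
    # The same student population looks very different depending on the cut.
    for cut in range(160, 250, 10):
        print(f"cut {cut}: {percent_proficient(cut):5.1f}% proficient")
```

Under these assumptions, moving the cut score upward lowers the percent proficient even though nothing about the students has changed, which is the pattern the slide describes across states.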

Challenge: (2) Improving the Rationale for Content and Performance Standards

• Decisions on the content and level of achievement that define proficiency are not really criterion-referenced

– May involve the business community as well as teachers and curriculum experts, but they are rarely asked to provide a rationale, let alone any evidence, for proposed requirements

– Some states align standards across grades, but without any empirical evidence of the relationship of mastery at one grade to readiness for content at the next grade

• Lack of vertical alignment also complicates measuring growth

– Current NAEP efforts to define “readiness” for 12th grade students are moving slowly and are not on the radar of most states

• Framing student achievement expectations around readiness for college, work, and the rest of our lives

Rationale for Content and Performance Standards (Continued)

• Decisions on the content and level of achievement that define proficiency are not really normative either

– Limited consideration of what other states require

– Almost no consideration of what other countries require

• Without a rationale for prioritizing areas of knowledge and skill, there is a tendency to just throw everything in

– Resulting in mile-wide, inch-deep standards and curriculum

– Making it difficult to assess the entire domain with a modest-length test

• Are we measuring achievement gaps in the right things?

Challenge: (3) Measuring Skill as well as Knowledge

• Distinction between knowledge and skill is real, but not always clear

– Facts (e.g., definitions of mathematical terms) might be learned in a single session, but skills (e.g., problem solving) must be repeatedly practiced

• Knowledge is reasonably assessed through multiple-choice questions, but what about skill?

– Example: Would a multiple-choice test be acceptable for awarding medals in gymnastics?

Measuring Skill as well as Knowledge (Continued)

• General agreement that a writing sample is needed to assess writing skills

– Albeit generally limited to short, first-draft writing tasks

– Holistic scoring provides limited feedback on student skills

• Content standards for other subjects also involve skills

– Reading involves interpretation and evaluation skills

– Mathematics generally involves reasoning and problem-solving skills

– But assessment of these subjects is generally limited to multiple-choice questions

– And there is no clear measure of the difficulty of tasks examinees are asked to perform

Knowledge and Skill (Continued)

Examples of Grade 3 Math Content Standards:

1. Find the perimeter of a polygon with integer sides.

2. Identify attributes of quadrilaterals (e.g., parallel sides for the parallelogram, right angles for the rectangle, equal sides and right angles for the square).

3. Determine when and how to break a problem into simpler parts.

– Note: This last content standard is repeated for Grade 4, presumably with more complex problems.

– The skill required to master this standard may be harder to assess with multiple-choice questions.

Challenge: (4) Diagnosing Individual Student Deficiencies

• Testing alone will not reduce achievement gaps; better diagnostic information is needed.

– District-level results may be reliable for diagnosing relative performance in different major content areas

– But diagnosing individual student needs is difficult:

• Mastery of so many different objectives is being assessed (typically only 2 or 3 questions/objective)

• Questions are generally not built around good models of knowledge and skill deficiencies

– Information is also not timely for helping individual students

• For example, information on mastery of the 4th grade content is generally not received until the beginning of 5th grade

• The child’s teacher at that point did not teach the 4th grade content
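
Why 2 or 3 questions per objective give weak individual diagnoses can be sketched with a simple binomial model. This is an illustrative assumption rather than a method from the talk: each question is treated as an independent trial, and the success probabilities used in the example are hypothetical.

```python
from math import comb


def score_distribution(n_items: int, p_correct: float) -> list[float]:
    """Binomial probabilities of answering 0..n_items questions correctly."""
    return [
        comb(n_items, k) * p_correct**k * (1 - p_correct) ** (n_items - k)
        for k in range(n_items + 1)
    ]


def prob_at_least(n_items: int, p_correct: float, min_correct: int) -> float:
    """Chance of reaching min_correct or more right answers out of n_items."""
    return sum(score_distribution(n_items, p_correct)[min_correct:])


if __name__ == "__main__":
    # Hypothetical students: one with an 80% chance per question,
    # one with only a 50% chance per question.
    for p in (0.8, 0.5):
        chance = prob_at_least(n_items=3, p_correct=p, min_correct=2)
        print(f"p={p}: P(at least 2 of 3 correct) = {chance:.3f}")
```

Under this model, a student with only a 50% chance on each question still answers at least 2 of 3 correctly half the time, so a 2-of-3 rule on a single objective separates the two students poorly.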

Tension Between Assessing Overall Mastery and Diagnosing Specific Needs

• Assessments typically focus on how many questions a student can answer correctly

– The goal is assessing overall mastery of a content domain

• Diagnosis involves analyzing the questions the student cannot answer correctly

– And understanding why

– Cognitive laboratory work, which could help identify questions that are more diagnostic, is too costly for most states

Tension Between Overall Assessment and Diagnosis: Example

1. What is the perimeter of the figure shown?

[Figure: a square with all four sides labeled 3 feet]

A. 3 Feet

B. 6 Feet

C. 9 Feet

D. 12 Feet

2. On a sunny day, Carlos starts at one corner of the square and walks all the way around the outside. How far did he walk?

[Figure: a square with one side labeled 3 feet]

A. 3 Yards

B. 4 Yards

C. 9 Yards

D. 12 Yards

Tension Between Assessing Overall Mastery and Diagnosing Specific Needs

• Scores on the second question may have a higher correlation with overall mastery

– Covers more of the domain (beyond perimeters)

• Knowledge of what a square is

• Conversion from feet to yards

• May also involve ability to set aside irrelevant information

– Verbal skill required may be correlated with overall mathematics achievement (although less likely to be fair to English learners)

– Note that the second question never actually mentions perimeter

• But if the student answers incorrectly, it is more difficult to know why

– Definition of perimeter, of a square, or linguistic problems?

– Or some other misunderstanding
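
The arithmetic behind the two example questions can be sketched as follows. The mapping from wrong answers to possible errors is one plausible reading added here for illustration; it does not come from the slides, and a real diagnosis would need evidence about how students actually reason.

```python
FEET_PER_YARD = 3


def square_perimeter_feet(side_feet: float) -> float:
    """Perimeter of a square: four equal sides."""
    return 4 * side_feet


def feet_to_yards(feet: float) -> float:
    """Convert a length in feet to yards."""
    return feet / FEET_PER_YARD


side = 3  # side length in feet, as labeled in the example figures

# Question 1 asks for the perimeter in feet.
answer_q1_feet = square_perimeter_feet(side)

# Question 2 needs the same perimeter plus a unit conversion.
answer_q2_yards = feet_to_yards(square_perimeter_feet(side))

# Hypothetical errors that would produce each wrong option in question 2.
possible_errors_q2 = {
    side: "reported one side and ignored the unit",
    side * side: "computed area instead of perimeter",
    square_perimeter_feet(side): "found the perimeter but did not convert to yards",
}

if __name__ == "__main__":
    print(f"Question 1: {answer_q1_feet} feet")
    print(f"Question 2: {answer_q2_yards:g} yards")
    for value, reason in possible_errors_q2.items():
        print(f"  {value} yards could mean: {reason}")
```

The second question adds a step to the first, so a wrong answer can arise at more points, which is why it is harder to interpret diagnostically.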

Benchmark Classroom Assessments and Summative Assessments

• In many states, vendors are offering to fill the diagnostic void by providing “end-of-unit” classroom tests that are linked to (parts of) the state’s end-of-year assessment.

– Little scrutiny of the reliability, validity, and diagnostic value of resulting scores

– Potentially valuable if valid linkages can be established

• Timely – Opportunity to go back and correct misconceptions

• Focused – Smaller amount of content to cover for each test

• Still need questions based on better cognitive models of the target knowledge and skills and common misconceptions

Challenge: (5) Reducing Achievement Gaps for Students with Disabilities

• Some progress in identifying testing accommodations that allow students with disabilities to demonstrate what they know and can do

– Also application of principles of universal design to remove irrelevant barriers

• But many are not receiving instruction in the content targeted for their grade

– How can we encourage maximum participation in regular instruction while limiting gap comparisons to students who are able to do so?

• Similar problems for students who experience difficulty learning English

– Instruction in other subjects varies widely for these students

Summary

• Today’s state assessments are much improved:

– Explicit content targets and policy-based judgments for performance levels

– Greater attention to reliability and validity (alignment) concerns

– Public access to assessment results

– Attention to achievement gaps

Summary (Continued)

• But we face many continuing challenges, including:

– The cost and confusion of having so many different tests

– Improving the rationale for content and proficiency standards

– Understanding and measuring skills as well as knowledge

– Integrating classroom and state-wide assessments to support better diagnostics of individual student learning needs

– Reducing instruction gaps, particularly for students with disabilities and English learners