
Classroom Assessment Principles to Support Teaching and Learning

Appendices

FEBRUARY 2020


Appendix A

Reviewing and Using Feedback from Conference Participants

An initial draft of these classroom assessment principles was developed in preparation for the NCME Special Conference on Classroom Assessment held at the University of Colorado Boulder, September 18-19, 2019. That draft was then revised in response to feedback received from conference participants. This appendix highlights the feedback received from conference participants and explains how that feedback was used to revise the principles document. Detailed participant comments and suggestions are provided in Appendices B-D.

Following the 2019 NCME Special Conference on Classroom Assessment, all 288 participants were sent a survey requesting feedback on the classroom assessment principles and supportive statements captured in this document. The survey asked participants to respond to 5-point Likert items indicating whether they Strongly Agreed, Agreed, Disagreed, or Strongly Disagreed with each of the Classroom Assessment Principles and the “supportive statements” that were developed to indicate how various entities and organizations can support this classroom assessment vision. A total of 92 individuals, or 32 percent of the conference participants, responded to the survey. In addition to the Likert items, we requested optional open-ended responses from those who either disagreed with or wanted to amend the classroom principles or the supportive statements. We also encouraged respondents to submit other examples or ideas of how this classroom assessment vision can be supported.

Overall, the survey yielded very high levels of agreement. Of the 92 respondents, only nine individuals disagreed with one or more of the classroom assessment principles. When the two positive response categories (i.e., Strongly Agree and Agree) are combined for each item, approximately 95 percent of all respondents endorsed each principle. Additionally, not one respondent disagreed with Principles 2 and 7. For the supportive statements, more than 90 percent of all respondents, on average, agreed with the statements. Although the supportive statements drew a few more disagreements than the classroom assessment principles, on average only three respondents disagreed with each supportive statement.

All of the comments received for the classroom assessment principles are displayed in Appendix B, and disagreements with or suggested amendments to the supportive statements are located in Appendix C. Because we also solicited additional ideas on supportive statements, we provide all of the ideas and suggestions received in Appendix D.

In this section, we highlight key findings from examining responses to the classroom assessment principles, followed by key findings from responses to the supportive statements.


Addressing Responses to Classroom Assessment Principles

We used feedback from conference participants to revise a principle if suggested amendments or disagreements met one of the following criteria: 1) the suggested amendment helped clarify the principle and did not substantively change the content; or 2) the criticism or disagreement was shared by more than two respondents. Before adjusting the principles based on the open-ended comments received, we wanted to consider whether we could observe any patterns of disagreement in the Likert items that might indicate an alternative but common conception of classroom assessment. For example, if a large number of disagreements surfaced around a set of three principles, that finding would prompt us to examine the relationship among those principles and to determine whether the underlying concepts or ideas connecting them needed to be revisited or revised. To do this, we mapped the individual responses to the Likert items, as shown in Figure 1, to identify response patterns for all of the respondents with any disagreements.

Figure 1. Lowest overall scores for respondents

In Figure 1, Persons 1-13 have the lowest overall scores based on the sum of their Likert ratings on the 11 principles. Persons 3, 9, 10, and 12 have lower scores despite not having selected a negative response because they did not endorse the highest rating as frequently as the individuals positioned above them. A total of nine out of 92 respondents, or approximately 10 percent, disagreed with at least one classroom principle. Only two respondents disagreed with more than two principles: Persons 1 and 2 disagreed with three and four principles, respectively, with Principle 11 the only principle on which both disagreed. Of the nine respondents who disagreed with one or more principles, more than half (five) disagreed with Principle 4 and four disagreed with Principle 11. Aside from the number of disagreements found for Principles 4 and 11, we did not see any clear patterns in the Likert items pointing to alternative conceptions of classroom assessment.
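The ordering behind Figure 1 can be sketched computationally. The ratings, respondent names, and the simple 1-4 coding of the named response categories below are illustrative assumptions for three hypothetical respondents, not the actual survey data.

```python
# Illustrative sketch of the Figure 1 analysis: sum each respondent's
# Likert ratings across the 11 principles, list the principles they
# disagreed with, and order respondents from lowest to highest total.
# The ratings and respondent names are hypothetical, not survey data.
# Coding assumption: 1 = Strongly Disagree ... 4 = Strongly Agree.

ratings = {
    "Person A": [2, 4, 4, 2, 4, 4, 4, 4, 4, 4, 1],  # disagrees on 1, 4, 11
    "Person B": [4, 2, 4, 4, 2, 4, 2, 4, 4, 4, 2],  # disagrees on 2, 5, 7, 11
    "Person C": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # no disagreements
}

def total_score(likert):
    """Overall score: the sum of a respondent's ratings on all 11 principles."""
    return sum(likert)

def disagreements(likert, cutoff=2):
    """Principle numbers rated at or below the cutoff (Disagree or worse)."""
    return [i + 1 for i, rating in enumerate(likert) if rating <= cutoff]

# Order respondents from lowest to highest overall score, as in Figure 1.
for person in sorted(ratings, key=lambda p: total_score(ratings[p])):
    print(person, total_score(ratings[person]), disagreements(ratings[person]))
```

Shared disagreements, such as Principle 11 appearing in both hypothetical lists above, are the kind of pattern this mapping was intended to surface.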

Based on the open-ended responses received from respondents who either disagreed with or wished to amend a given classroom assessment principle, we found some consistencies in the input provided for Principles 1, 4, 7, and 11, which led us to reword and clarify those principles. As noted earlier, two of those principles (4 and 11) were also flagged by the Likert responses captured in Figure 1. We subsequently revised those two principles to better qualify what is intended by the use of “authentic” tasks (Principle 4) and to frame research-based recommendations for grading practices in a more affirmative way (Principle 11). In one unique case, a participant suggested that Principle 10 (previously Principle 5) be repositioned to its current location adjacent to Principle 11. Since this amendment did not change the content of the principle and was clearly connected to Principle 11, we reordered the principles to reflect this change.



In addition to rewording, reordering, and clarifying principles based on the feedback received, we saw two separate but recurrent points raised by participants with which we disagreed and which we therefore did not use to amend the principles. First, one substantive source of disagreement was not addressed precisely because it represents a fundamental difference of opinion with the authors regarding the nature of classroom assessment. For example, Principles 3, 5, and 9 (involving attention to students’ funds of knowledge, discourse-based instructional practices used to elicit and build on students’ ideas, and linguistic and graphical scaffolds) were acknowledged to be “laudable instructional practices…but have a limited nexus with classroom assessment practice.” These research-based principles, known to support deep learning, are essential to the vision of classroom assessment advanced in this document. Rather than requiring separate processes or instrumentation, the informal ways of connecting with home and community and inviting students to share their experiences are, in fact, assessment processes. Feedback and clarification of one’s own thinking can be embedded entirely within instructional activities designed to engage students in the thinking and reasoning practices specific to each discipline.

The second point that surfaced but was not addressed in this revised version of the principles pertained to questions raised about the research warrants supporting particular claims or statements. Because the principles document is intended for a non-research audience, we chose not to add research citations to buttress certain points. Many references are already provided in the articles by Shepard, Penuel, and Pellegrino. In addition, we added the reference for the authoritative 2018 National Academies of Sciences report, How People Learn II. HPL II includes a chapter on motivation, which describes the importance and benefits of intrinsic over extrinsic motivation and the research evidence indicating what teachers can do to support a learning orientation. HPL II also includes accessible summaries of research on funds of knowledge and asset-based pedagogies and highlights the significance of discourse practices, ways of reasoning, and forms of knowledge representation as critical for access to discipline-specific deep learning. We also added the reference for the 2017 special issue of the Journal of Educational Administration; in both the overview and the set of articles, there is consistent evidence that popular data-driven decision making approaches can work against equity goals because of accountability pressures. This occurs despite the good intentions of carrying out this type of work in professional learning communities (PLCs).

Addressing Disagreements or Amendments to Supportive Statements

The Likert items for the supportive statements also indicated strong overall agreement among the 52 of the 92 survey respondents who chose to respond to these statements. The supportive statements noted below are those that elicited more than three disagreements out of the 52 participants:

School and district leaders

• Develop or adopt district-level assessments that embody the full range of desired learning goals. (Six disagreed)

• Establish grading policies in support of grading practices aimed at establishing clear success criteria, while avoiding the use of grades as motivators. (Four disagreed)

• Develop new processes for school improvement and mandated teacher evaluations that better attend to sociocultural learning practices and commitments to equity. (Four disagreed)


State Department of Education

• Develop state-level assessments that embody authentic learning goals and support the development of local systems of balanced assessment. (Five disagreed)

Measurement and Content Experts

• Engage in collaborations to establish linkages between rich classroom level student work and the kinds of quantifications needed for district and state level assessments. (Four disagreed)

Teacher Preparation

• Adopt a shared vision for teacher candidate learning about classroom assessment that integrates it with other program commitments to diversity, equity, learning theory, and subject-specific instructional practices. (Four disagreed)

Given that the above six statements elicited a higher level of disagreement relative to the other statements, we looked to the comments to determine what, if any, common themes or issues could be found to help us revise or clarify the explanations provided under each statement. Similar to the approach we took with the classroom principles, we revised the explanations if the criticism or disagreement was shared by two or more respondents. As indicated by the comments received and presented in Appendix C, we found only two common areas of disagreement, for two of the six supportive statements: one located under school and district leaders and another under the state department of education. These two statements also had the highest number of disagreements from respondents. Since the comments underlying the other four statements did not appear to point to a common issue or concern, we did not revise the explanations provided for those statements.

Under the statements provided for school and district leaders, four respondents offered various criticisms of the statement pertaining to having districts develop or adopt assessments that embody the full range of learning goals. The criticisms pointed to concerns about having district tests replicate the “one size fits all” model for all classrooms subscribed to by commercial products or the state tests. We modified the information under this statement to directly address this concern and to emphasize the use of these assessments to provide programmatic insights rather than to compare and rank students, teachers, and schools.

For the state department of education, three respondents noted concerns with the statement about having the state department of education provide curricular models or sample units for districts. We modified the explanation of this supportive statement to first remind readers that a typical support function of the state department of education is to provide curricular and instructional resources for school districts that request assistance. We also modified the accompanying explanation to make clear that the penetration of these types of resources will likely be limited to those districts that actively seek out assistance from the state. However, we also note that these resources can provide helpful guidance to other districts seeking to build these curricular resources and assessment systems on their own. Finally, we modified the explanation under this statement to assure readers that we are not encouraging the state to take a top-down approach that imposes curricular resources and assessments onto districts.


Appendix B

Disagreements or Suggested Amendments to the Classroom Assessment Principles

All of the individual comments made in response to specific principles are reproduced exhaustively below. Many of the comments were idiosyncratic. However, we have grouped together at the start of each principle section those comments that spoke to a similar concern. Those comments that resulted in revisions reflected in the final version of the principles are shown with an asterisk.

Principle 1

1. I disagree with #1. Students are not mature enough yet to completely understand all the learning goals. They can understand some, but there are some that are too far ahead of where they are. For example, a 9th grader learning Algebra 1 and asking why do I have to do this does not yet have the answer to that question because Algebra includes the building blocks for higher level mathematics.*

2. These principles are framed as focusing on the enactment of assessment by teachers, presumably within their classrooms. But, as your addenda note, no classroom is really an island. Society in general, and education systems in particular, have a lot to say about what teachers should expect of their students in terms of what they should learn - what they should know and be able to do, how much and how well, by when. “Valued goals” suggests choices should be made about priorities among all the possible things that might be learned. And “equity focused” implies that, whatever the priorities, all students should be expected, and have the opportunity, to attain them. I doubt that most teachers work in contexts in which there is real clarity about what, and to what level of understanding, quality, or capability, their own students should be expected and supported to learn. And I suspect that their work contexts will vary rather widely in whether they are seriously asked and supported to take responsibility for trying to ensure that all of their students at least make “adequate progress” toward meeting whatever goals their system claims to value. Nevertheless, whatever the context, it makes pedagogical sense to ask teachers to find a way to make clear to their students what it is hoped they will learn and to find a way to help them understand, and preferably accept as being reasonable, what it means and looks like to understand and do those things well at a level of detail appropriate and hopefully meaningful for each student’s stage of progress. Somewhere here it probably would be helpful to recognize that there is bound to be an age-related shift from shared common (and in some sense pre-requisite) goals to increasingly diverse and differential goals based on such things as interest and specialization. I suppose with luck and good system design the “teachers” would differentiate in appropriate ways as well, and the goals within such classrooms can still be shared. 
The elaboration asserts that for productive learning novice learners need to understand the goals and why they are important. I’m sure this is true, but it begs the hard question of how a “novice” is supposed to understand the goals and the criteria for assessing them. That presumably is the heart of what one has to learn to become expert, but it would help to acknowledge that finding ways to communicate steps and stages of the goal criteria to students, before they can meet them, in ways that nevertheless motivate and help them learn is kind of the central issue. Isn’t that what people call the “learner’s paradox”? Teachers will need more than just the principle to enact it.*

3. For learning goals, perhaps it is worth also saying that the goals should be appropriate but challenging – I’m thinking of the report from the New Teacher Center which reported a shockingly low level of grade-appropriate learning goals.*

4. Include a version of the learning target cast in student-friendly terms shared with students from the beginning of the learning.*

Principle 2

1. This is an important step forward, but it should be clearer about the problem. It might be helpful to make an analytic distinction between the outcomes of learning a subject or skill (including ways of describing intermediate steps of learning that subject or skill as the learning of it may proceed if it is going well), and the processes by which that learning happens or is encouraged to happen. One can have “theories” or hypotheses about both, and while they are integrally related, they aren’t quite the same thing. They are different facets or foci entailed in fully understanding the domain of teaching (writ large) and learning. “Assessment” tends to focus only on the steps and outcomes, but instruction has to be concerned with the processes as well. It is a fatal mistake to accept that division of labor and embody it in experts on assessment and experts on curriculum. These principles are well-intentioned attempts to avoid such a mistake, but they won’t succeed without a clearer understanding of the distinction and why the division of labor is so tempting. The temptation rests on the idea that we have a “scientific” technology for “measuring” the outcomes, but our understanding of pedagogy and how to teach effectively is less developed and (even) more contested, so at best it seems proper to leave those choices to professional responsibility, creativity, and “local control.” Anyway, theory is probably too grand a word to use here. “Model” as you shift to it, is a better, more modest, alternative. What you have to do in a classroom is focus on a constrained moment or set of moments (given the range of students you have) in the learning of some particular subject or skill, while at the same time keeping in mind more general concerns about things like students’ confidence, self-worth, motivation, responsibility, other social skills, seeing the relationship between the particular subject, other subjects, and the rest of life, etc. 
At best what you have is a complex best guess, or to dignify it, “hypothesis,” about what might move the students’ understanding along in the specific subject, and perhaps how you might tell if that is actually happening, and what to try next if it is or isn’t. That hypothesis or set of hypotheses should be embodied in the curriculum materials, lesson plans, exercises, routines, and experiences that the system provides and/or that you have yourself developed. They may or may not fit into a larger set of hypotheses about where students should get to in that subject, and in all the other important subjects and personal characteristics and capabilities that are the core goals of schooling over the year or years you and the school system have responsibility for those students. The point of this principle is that it would be really good if all of these embodiments of hypotheses about, or models of, how students are likely to learn what you want them to learn, how to tell if they are in fact learning, and what to do next if they are, or aren’t, progressing appropriately, are designed to be coherent – consistent with and supportive of each other. And it would be good if they were based on some evidence and experience indicating that they actually work in the intended ways. It would also be important for the system to establish as a norm that it should have ways of evaluating how well the models or hypotheses were working for all of its students, and an expectation that it will continually or periodically revise them as needed. As I look at the additional principles below directed at school and system leaders and at the states, I have the feeling you are seriously underestimating the “two cultures” problem we have, particularly in the U.S., and how difficult it will be to get them to play well together.

2. Include that learning target has been competently and confidently mastered by the teacher charged with teaching and assessing it.

Principle 3

1. This is important, but hardly easy. It suggests a kind of learner’s paradox for teachers, since they are asked to learn how to appreciate knowledge and wisdom that they don’t already understand and connect it with knowledge and wisdom they presumably do.

2. Principle 3 promotes laudable instructional practices in general, but has a limited nexus with classroom assessment practice.

Principle 4

1. Authenticity is all very fine, but it is an empty word absent the requisite understanding of the relationship between the authentic or “familiar” content and the learning goal. Authenticity shouldn’t just be pasted on – a veneer over the same old problems – and it should deepen and reinforce the desired learning, not distract attention from it.*

2. I think that authenticity is fraught with problems because what deemed authentic by an expert may be inauthentic for some students. Authenticity comes from what students bring to class and belongs in the instruction. Informal assessments can include authenticity but formal assessments should ensure that students take away real disciplinary knowledge that will transfer.*

3. What is an authentic, real-world task for the spelling domain, grammar domain, math facts? Also, I want assessments to be a source of hope for future success while students are learning.*

4. I question the ability of researchers and educators to know how to do this effectively. Most are too far removed from real life outside of school.

Principle 5

1. In the elaboration of this principle the notion of “deepening conceptual understanding” is kind of snuck in here – there’s no certain reason to expect that to grow out of classroom discourse. This would need qualification and elaboration to be compelling. As it stands it is doctrinaire hand-waving.

2. The reliance on students’ talking seems misguided to me or insufficiently supported by empirical evidence.

3. Principle 5 promotes laudable instructional practices in general, but has a limited nexus with classroom assessment practice.

4. Don’t limit this one to math and science. Other domains are equally important in these terms. Also, this requires a nonjudgmental, confidence building classroom environment in which students become masters of the target and the vocabulary needed to talk about it.*


Principle 6

1. Rather than “If the whole point of FA is to ….,” maybe say “Given that the point of FA is to …”*

2. Sounds a bit too much like learning styles research.

3. In the elaboration, the point about “partially formed” seems different from ones about multiple modes and artifacts? This principle isn’t very well digested yet.

4. This requires teachers who are confident, competent masters of the target(s) in question and the progressions leading to each.

Principle 7

1. Or provide actionable descriptive feedback focused on attributes of student work over which they have control over the progress toward success.

2. In principle #7, assessments can only provide substantive insights into student thinking (or, at least, the thinking that will help with the particular learning in question at the time) when they engage students in high-quality tasks (i.e., that are a spot-on match with intended learning outcomes, are designed to elicit the intended level or type of thinking, and so on).

3. This also needs digestion. It isn’t clear how assessments themselves “provide insights” about how to improve – they are more likely to provide evidence of where students are, and that may suggest the need to improve – perhaps identifying a gap between a current understanding and where you would want it to go or grow – but beyond a motivating recognition that there is a gap or inadequacy how clear would the character of the next level be to the student? Would the insights about how to improve come from the teacher or from a teacher’s manual associated with or provided by the test or assessment provider?

4. The other principles reinforce building with student strengths and experiences, avoiding a deficit model. However the language in #7 talks about educators identifying their shortcomings and making visible to students how they are revising their teaching. To be aligned to the idea of building on and with strengths, shouldn’t we ask educators to reflect on their practice in the same way?

Principle 8

1. While I think this is important, the practical implications as understood by most assessment experts have problematic consequences, such as avoiding formal assessments all together. Just make formal assessments discreet and minimally consequential for students.

2. My thinking on this: help students see themselves as responsible for the success they see themselves experiencing…set students up for a personal sense of continuous success on step at a time. And thank you for including this one! These emotional dynamics are absolutely crucial.

Principle 9

1. I didn’t disagree but I think “recommended for English learners” is a really, really broad bucket since there are conflicting recommended scaffolds for a very diverse group, and we didn’t say anything about other minoritized student groups (e.g., students with learning disabilities or individual education plans). I felt like so many of these items captured the appropriate nuance for that issue, but this one felt very broad-brush for just one group.

2. This certainly seems like good instructional advice – not exactly clear what it says about assessment per se.

3. Allow students time and opportunity to learn with good support before holding them accountable for it.

4. Principle 9 promotes laudable instructional practices in general, but has a limited nexus with classroom assessment practice.

Principle 10

1. Principles 5 (currently 10) and 11 could either be combined or at least be listed closer together.*

2. In principle #10, “…acquaint students with the features of quality work…” you’re already talking about criteria. Add another sentence, maybe? Something about how the criteria have to be indicators of the intended learning, not about surface features of the work (e.g., grammar or neatness in certain assignments) or following directions (e.g., had 5 sources, had a cover page). That would be one way of emphasizing that the criteria need to be high-quality. Also in #10, to connect formative assessment and summative/grading, the criteria not only need to be high-quality but also packaged in scales or decision rules of some sort that lead to meaningful, useful reporting categories. So, for example, not only do the formative assessments lead to feedback on the criteria students and teachers are looking for in the work, but also when it comes to reporting what counts as “proficient” or “passing” or whatever scale is used is a valid reflection of the way those criteria have been applied during learning. A shorter way to say this might be that the grading scales need to be valid and reliable, but I wouldn’t do that in these principles because some reader’s minds will go to conceptions of reliable and valid that I don’t think work here. But still, there does need to be quality in the reporting.

3. This one is just weird. Who is doing the helping? What does it mean to “establish a relationship” and to understand the relationship as it exists now and construe it to be “productive”? To design the assessments and/or modify the feedback in some way so that they are consistent with each other? The message of the latter anticipates the former (as in, “this is going to be on the test, and this is the right answer and the right way to say it”). Or, “we are actually testing what we tried to teach you, using the same criteria for quality and correctness.” The latter seems like a good idea, and it would imply that the tasks used for summative assessment should be designed by the assessors to reflect the tasks used in instruction, rather than the other way around as the elaboration for this principle seems to suggest. This issue is central, I assume, to the reason these principles have been offered. It should not be described in this foggy, and I sense, ambivalent way. The question of “far transfer” is a complication, but if that is what you want to assess, certainly the instructional tasks should have prepared students for such leaps, and that would constitute and define the nature of “the relationship between them.”

Principle 11

1. Principle #11 (about grading practices) is the only negatively worded principle – something to avoid rather than something to do. I suggest taking some of the language from district principle #16 and expanding it to describe what classroom grading should do, rather than what it shouldn’t do.*

2. 11 would be more effectively presented with positive language that points to what should be done, not what should not be done. Also, these negative impacts should have a solid research basis if they are to be included.*

3. I put 11 as disagree, because I think this principle is loaded and biased and almost impossible. For example, taking letter grades and turning them into numeric values to create a GPA is demeaning, but that practice is embedded in our system and parents, communities and schools/colleges would revolt; therefore I don’t think it’s fair to throw back to the teacher that some of these practices are demeaning without operationalizing and delimiting those. Sure, teachers should grade papers and provide feedback and not do so in a demoralizing way, but what is demoralizing? Students will tell you grades are demoralizing, teachers will tell you standardized testing is demoralizing, but then saying we should abandon these practices is beyond the scope of teachers to be able to do. I would rather focus on the positive, or make this principle more specific.*

4. 11 is stated in the negative, and has limited nexus with classroom assessment. Would work okay if worded in terms of what it promotes.*

5. Missing: any reference to or demand for the collection of quality evidence as the result of classroom assessment.

6. The elaboration seems to confound two or more things: Controlling Behavior vs Evaluating Learning? Or motivating in some way ? Wouldn’t you be likely to hear an argument that comparisons have a value - not instead of, but in addition to helping them to see how to improve? I agree that comparisons are of doubtful value and are more likely to be harmful, but stronger evidence and more nuanced assertions will probably be needed in the face of the likely responses, if you really want to change practice.


Appendix C

Disagreements and Suggested Amendments to the Supportive Statements

In this section, we provide an exhaustive list of all disagreements and suggested amendments received for a given supportive statement. We present the comments by the organization or entity and the specific supportive statement addressed. Amendments were made to supportive statements based on comments flagged with an asterisk.

School and District Leaders

• Build collaborations between assessment and curriculum department staff to inform the design and implementation of coherent curricular activity systems in schools.

1. This is much to be hoped-for, but it seems very unlikely to work well at the local level without a lot of well-resourced design and development work from (where?) that could provide the models and tools necessary to instantiate such a “vision.” Still, I suppose some focused local efforts of this sort could be healthy and might create demand for more help from outside? I hope some experienced district leaders will be at the conference (Paul Goren at Evanston comes to mind) who can put flesh, and realism, on these bones.

• Develop or adopt district-level assessments that embody the full range of desired learning goals.

1. District assessments are not specialized to the multiple needs of the classrooms, schools, and resources available. They make a “one size fits all” single snapshot test similar to state standardized tests which are ineffective at determining what the students learn. Standardized tests do not take into account the differences between students, cultures, and roadblocks students have to overcome. Training needs to occur to educate teachers on what makes a valid assessment that fits into their circumstances.*

2. As a former principal of a language immersion school this was a perennial problem. District-mandated assessments could not be appropriately applied to my student population and our stated charter.*

3. This would be a nightmare and not everyone would agree on the “full range of desired learning goals.” This would waste teachers’ time who have to develop it and create resentments.*

4. The elaboration asserts that data collection at the district level does require formal instrumentation, in contrast to what is required at the classroom level. I’ll be interested to hear the discussion on this claim. I don’t think it should just be taken at face value. If it is, it will almost certainly move on to all the evils you go on to warn against.*

• Establish grading policies in support of grading practices aimed at establishing clear success criteria, while avoiding the use of grades as motivators.

1. This is just the beginning of a conversation, but it is likely to be damned from the start if you use the word “grading” without instant clarification to ward off the furies it will call up. It seems to be very hard to turn our minds away from “the curve” and ranking toward substantive “success” criteria - or to distinguish between “effort” and the outcome.

2. Until students are mature enough to be totally intrinsically motivated, this idea is not good. There are not very many students at my school who don’t need some part of their grade to be a motivator. This idea could only work in a fantasy world.

• Develop new processes for school improvement and mandated teacher evaluations that better attend to sociocultural learning practices and commitments to equity.

1. I don’t disagree with this principle all that much, but this has a limited nexus with classroom assessment, and represents large scale change in areas where most operational parameters are set by states rather than districts or schools.

State Department of Education

• Articulate a vision for learning and assessment that values classroom and other forms of local assessment.

1. State leaders often don’t get a say/choice about the state test. Being able to bridge this gap is really important.

2. The “local” part of this is fine as far as it goes, but in contrasting with state (and national) level assessment, and trying to suggest that there are things they may do well, you shouldn’t allow the word “comparable” to pass by without immediate qualification or at least an asterisk. Otherwise you’ve lost this war before the first battle even starts. And why would state-wide assessments be any better for evaluating the relative strengths and weaknesses of local curricula than they are for evaluating individual students?

3. When it comes to causal inferences, assessments alone can’t do the job. Their results have to be situated within a more general framework or design, one that never obtains, really, for large-scale assessment in the U.S.

• Develop state-level assessments that embody authentic learning goals and support the development of local systems of balanced assessment.

1. This is just a fantasy when it comes to either “authentic” or ambitious goals without some high level agreement on a common curriculum or at least a common curriculum “framework” that can define the terms used by instructional tasks and assessment tasks or items to reflect those goals.

• Provide model curriculum and assessment systems and/or sample curricular units that exemplify the integration of instruction and assessment in support of deep learning.

1. I disagree with the word “provide.” The state should help teachers themselves to create (with support from other experts, as needed) such models, not “provide” them. The expertise, professionalism, and creative energies of teachers should be recognized, not sidestepped.*

2. Except in large and/or well-funded state assessment departments, and depending on what one has to do to qualify for state contracts, this could be a stretch and result in less high-quality systems and resources than intended. In such contexts these functions are better left to districts.*

3. When specific curricular models are provided by the state, these can too easily be interpreted at the district level as cookie-cutter approaches that “work,” resulting in a lack of critical attention to local needs and superficial adoption of scripted or otherwise restrictive methods.*

4. As indicated in the comment on #19, this one is crucially important. It would be nice to go even further and suggest that some states might experiment with offering well developed curricula (and more) in some of the core subjects and try to insist that districts use them unless they can make the case that they have an equally promising, or better, alternative. Probably a recipe for one term governors and commissioners, but hey.

• Provide professional development resources for education leaders and teachers to support classroom-based assessments that are grounded in a research-based theory of learning.

1. Except in large and/or well-funded state assessment departments, and depending on what one has to do to qualify for state contracts, this could be a stretch and result in less high-quality systems and resources than intended. In such contexts these functions are better left to districts.

2. Assessments? Here it should be: assessment-processes or, preferably something like processes for gathering evidence on students’ learning progress and problems in real-time in their classrooms.

• Make sure that the state’s articulated vision for learning and assessment is consistent across state programs.

1. Doesn’t seem as big of a deal. It’s not that problematic to include this statement, but this is a low leverage strategy in many states.

Measurement and Content Experts

• Engage in collaborations to establish linkages between rich classroom level student work and the kinds of quantifications needed for district and state level assessments.

1. I do not fully understand what this collaboration would do, exactly. Learn to retrofit student responses from an authentic learning task into a static district or state assessment? It seems to be taking something dynamic and making it mundane instead of the other way around.

2. Rigorous quantifications are needed - really? Why? What can this possibly mean? What are these “quantities.” Given everything you seem to be trying to say in these principles shouldn’t you be talking about qualities and qualification? These representatives of the two cultures should instead try to reach a truce and then figure out how to ensure peace over the long run.

• Support and perhaps lead efforts to improve classroom assessment literacy that clearly recognize the need for varied assessment approaches necessary for effective classroom assessment.


1. I’ve been wondering if we should rebrand “assessment literacy” to “assessment expertise.” It sounds less pejorative.

2. A good description of the peace conference and its goals.

Teacher Preparation

• Adopt a shared vision for teacher candidate learning about classroom assessment that integrates it with other program commitments to diversity, equity, learning theory, and subject-specific instructional practices.

1. I think most teacher candidate training programs would claim they are already doing this. I’m not against the idea, but I don’t think everyone will agree on your definitions and operationalizations of these terms. Good luck!

2. The first thing they should teach is the difference between selection or sorting, on the one hand, and teaching, on the other.

3. Adopting the vision for assessment is just the first step and it’s not enough on its own. A teacher education program should have evidence in its syllabi that it is preparing teachers for high-quality assessment practice.

4. I need more detailed information about this statement to select my level of agreement.


Appendix D

Additional Supportive Statements and Ideas

We encouraged respondents to provide us with additional supportive statements and ideas for supporting the classroom assessment vision to add to the limited number of statements that we provided. We present all of the ideas and statements we received under each of the organizations or entities addressed.

School and District Leaders

1. These statements are all fantastic and our research has suggested that these are not happening as much as they should be. These can also be supported to happen at the state level, where test vendors often run the show to the chagrin of socioculturally-minded, equity focused Curriculum and Instruction Department of Education leaders.

2. Reinforce that there can be multiple ways of knowing and that they do not always need to lead to the “right answer.” The focus should be on the learning and the expression of understanding.

3. Deepen professional development that attends to the content learning for teaching around instruction. Discuss how to interpret and use data in implementing assessment data. Support ways to run professional learning communities to analyze and act on data.

4. Create opportunities for educators, administration, and district personnel to collaborate with each other in a safe environment to develop a shared vision and a plan for supports.

5. They should adopt approaches to accountability and improvement that emphasize qualitative and cooperative observations and evaluations of practice and de-emphasize or eliminate reliance on “measurement” of outcomes. This would be facilitated by the development and adoption of curricular materials and instructional routines that help define and make “visible” the evidence of students’ progress and problems during ongoing instruction to support teachers in enacting the practices of adaptive instruction.

6. Using observation and data to inform formative assessment as much as possible, to ensure students’ next steps are accurately identified.

7. Establish collaborative partnerships with community, state, national, and international partners to engage students in developing solutions to real world issues.

8. Overarching and most important priority: Demand that they provide teachers with resources and opportunity (time) to develop their assessment literacy.

State Department of Education

1. Build a state team to support efforts in assessment. Construct and utilize measures to understand how assessment is being used in classrooms across the state.

2. I love the point about creating consistent and non-competing programs. Teachers are overwhelmed by competing programs, many of which lack any valid evidence. Perhaps you need a standard on assessment of evidence on these programs.

3. Reduce the outcome consequences for state standardized test scores. Teachers should not be moved out of their school because their test scores are low. This type of policy discriminates against teachers who work with our highest at-risk students.

4. Two-way tailored professional development sessions that take into account a school’s learning culture, whilst ensuring the school understands the expectations of state leaders.

5. Provide a bank of vetted resources to support research-based assessment for all learners.

6. The state needs to own the fact that their assessments are going to loom large in our decisions. This isn’t inherently detrimental, but it is a reality that needs to be accepted. So if the state assessments only measure certain aspects of student success (as is often the case with the state-mandated multiple choice exams on isolated subject areas) then those domains are going to be valued higher than others. The state needs to reflect thoughtfully on how their accountability systems function. Along similar lines, how they label districts and schools will trickle down to how we label kids. So if they value proficiency on those isolated tests as opposed to growth and as opposed to a wide array of assessments that measure a number of skills, then that’s precisely what schools will focus on when working with kids.

7. Embedded within each of the five statements, establish and share a common vision of how to provide learning opportunities and assessments that are accessible to all students, including but not limited to those with disabilities and/or who are English language learners.

8. Can the vision for classroom assessment coexist with the current state accountability structures? It is essential that states establish a vision, clear direction and the valuation of assessment as a learning process rather than a system used to sort students, teachers, schools, etc.

9. Articulate a value for learning that emphasizes deep learning as opposed to learning aimed at the state assessment. Develop a state level assessment where all students can be successful, avoiding the constant moving target.

10. At the state level, flexibility around accountability must be provided. Districts and schools should be held accountable for student achievement, but students are unique and success for one might not look the same as success for another. There have to be ways to measure and document growth effectively.

11. Facilitate those partnerships between researchers and school and district leaders.

12. Overarching and most important priority: Demand that they provide teachers with resources and opportunity (time) to develop their assessment literacy.

Measurement and Content Experts

1. Establish research to practice partnerships that focus on closing the gap between theory and practice in order to assist practitioners with the why and how of classroom assessment.

2. Address the issue in teacher preparation programs.


3. Advocate for sufficient and sustained professional development both during the summer and during the school year.

4. Using new approaches to data collection that measure classroom development in several ways, rather than traditional grading practices.

5. As you discuss in your document, realize that topics such as equity and social-emotional health are not separate entities for someone else to worry about, but rather essential elements for how material should be presented, digested, and assessed.

6. Provide exemplars of high-quality instruction and classroom assessment that model the principles expounded by these principles, across grades and subject areas. Ideally, and increasingly over time, these will be research-validated.

7. Conduct and make available reviews of existing classroom assessment approaches, products, etc., to help educators and administrators make informed decisions about approaches that will create meaningful changes in their classrooms.

8. Specifically, I would like to see an organization such as the NCME develop and widely advertise learning modules for assessment literacy, classroom practice, and bridging the gaps between classroom-district-state assessments. These should be created for teacher and administrator audiences.

9. The biggest challenges here lie in finding local assessment experts who are good matches, and then finding the resources to support those partnerships. Most districts have difficulty finding ways to fund such on-going partnerships.

10. Engage school and district leaders / state leaders in our endeavors and in partnerships with us (e.g., via this conference; ncme mission funding, include these community partners on NCME committees).

Teacher Preparation Programs

1. Ensure that classroom assessment for diversity, equity, learning theory, and subject-specific instructional practices are taught and practiced throughout a teacher education program. Perhaps break this out into: 1) diversity/equity, 2) learning theory, and 3) subject-oriented.

2. Build teacher candidates’ understanding of establishing coherence within a system as it relates to curriculum, instruction and assessment.

3. Model research-based classroom assessment practices in our own classrooms. Integrate assessment literacy into other coursework.

4. Teacher educators and colleges of education need to work with school districts to formulate a means to bring inservice teachers up to speed on this shared vision.

5. Seek continual professional development.

6. Their vantage point is the most important for considering how relationships impact student learning, so they should embrace any opportunity to share that message (and other takeaways from the front line).

7. Stay aware of developing theories, products, etc. for assessment systems. Ensure that teacher candidate programs include data literacy elements.


8. I think teacher educators need to be able to demonstrate how learning occurs for each of their students. Deep content knowledge as well as pedagogical practices are needed to create this shift. We need to honor what teachers bring to the table in terms of their experiences and observations and learn from each other.

9. Learn from one another’s area of expertise; in other words, most curriculum specialists have knowledge of specific practices in their domain. Teachers of assessment can learn more about common classroom methods, while domain experts can improve their understanding of the pros and cons of the methods they present to candidates in their domain.

10. The point made in the document about needing to connect course offerings is critical. Teacher candidates must see the connections between the different courses they are taking, much like students need to see the connections between different content areas.

11. Programs can do a much better job integrating the learning of assessment with the pedagogical curriculum so teacher candidates do not see this as a foreign and negative practice.

12. I have been teaching classroom assessment to graduate students for years and they constantly marvel over not getting this as undergrads. The content area teachers claim to cover it but they never really seem to, and many of them instead express disdain for assessment.

13. Interesting that teachers/educators get only one prompt, here. They are on-the-ground implementers of things like formative assessment practices and thus, most of the previous prompts (for engage school and district leaders / state leaders) also apply to teachers/educators who should play integral roles in each (i.e., helping design assessment-guided instruction and associated professional development approaches around that linking).