
Theory and Research in Social Education, Fall 2002, Volume 30, Number 4, pp. 488-515
© College and University Faculty Assembly of National Council for the Social Studies

When Increasing Stakes Need Not Mean Increasing Standards: The Case of the New York State Global History and Geography Exam

S. G. Grant, University at Buffalo

Alison Derme-Insinna, North Tonawanda Schools

Jill M. Gradwell, University at Buffalo

Lynn Pullano, Catholic Diocese Schools

Ann Marie Lauricella, University at Buffalo

Kathryn Tzetzo, Williamsville Schools

Abstract

In New York, state-level policymakers have invested considerable political and economic capital in new tests as both a measure of accountability and as a vehicle for increased educational standards. In this study, we look at how 9th and 10th grade global history teachers are making sense of the first administration of a new 10th grade global history exam. Building on prior work, we question the relationship between new tests and higher standards. We argue that the teachers in our sample believe the new exam is a poor measure of learning, that they have made few changes in their teaching beyond adding on a layer of test preparation, and that they see a disconnect between the new tests and higher standards. We conclude that, while state policymakers may have raised the stakes, they have failed to raise educational standards.

With the recently enacted "No Child Left Behind" legislation, testing will play an increasingly large role in determining children's educations and their educational futures. Despite a curious lack of evidence (Camilli, Cizek, & Lugg, 2001; Stake & Rugg, 1991), faith in testing as a measure of learning only seems to be growing. Alternatives to state-level testing exist (Meier, 1995; Wiggins, 1998), but state and national policymakers seem convinced that standardized testing ratchets up the accountability of students and teachers as well as educational standards (Heubert & Hauser, 1999; Linn, 2000; Natriello & Pallas, 1998).

Policymakers demand that tests perform at least two functions. Traditionally, state-level tests serve as a judgment on the teaching and learning accomplished (Messick, 1988). Here, tests provide scores by which to decide how well students, and by extension, their teachers, have used their class time.¹ This function of standardized tests as arbiters of quality has long been questioned, for while tests can effectively measure some basic skills and knowledge, they generally fail to describe higher levels of understanding (Corbett & Wilson, 1991; Darling-Hammond, 1991; Shepard, 1991; Yeh, 2001).² More recently, state-level tests are expected to perform a second function: as a lever for higher educational standards. Here, the argument is that tests provide the necessary stick to the carrot that new curriculum standards dangle as a means of improving classroom instruction (Fuhrman, 2001; Heubert & Hauser, 2000; Porter, 1989; Smith & O'Day, 1991).

In New York, state-level policymakers have invested considerable political and economic capital in new tests as both a measure of accountability and as a vehicle for increased educational standards. Since the early 1990s, New York state teachers have been pelted with the message that they need to increase their expectations and their students' academic achievement (Grant, 1997). New curriculum standards written to guide that change were followed by new elementary, middle, and high school tests. Clearly of higher stakes for students and teachers than previous exams, the new tests are also being promoted as a way to leverage ambitious changes in teachers' practices (Grant, Derme-Insinna, Gradwell, Lauricella, Pullano, & Tzetzo, 2001).

In this study, we look at how 9th and 10th grade global history teachers are making sense of the first administration of a new 10th grade global history exam. Building on prior work (Grant, et al., 2001, 2002), we use the global history exam to question the relationship between new tests and higher standards. We argue that the teachers in our sample believe the new exam is a poor measure of learning, that they have made few changes in their teaching beyond adding on a layer of test preparation, and that they see a disconnect between the new tests and higher standards. These teachers identify the main issue in our study: that, while state policymakers may have raised the stakes, they have failed to raise educational standards.

The Study

The Global History Research Group at the University at Buffalo formed as a result of a larger project that has explored how teachers across grade levels and subject matters are making sense of changes in the New York state testing program. Building on an analysis of the focus group data collected (Grant, 2000), the lead author organized a team of practicing social studies teachers and doctoral students as research assistants to do individual interviews with a wider sample of Global History teachers than were interviewed in the focus group sessions. The Global History Research Group is in the second stage of a study of teachers' responses to the changes in the New York state Global History and Geography curriculum and the attendant Regents exam. This paper draws on individual teacher interview data collected before and after the administration of the first Global History exam.

The purposeful sample we constructed (see Table 1) included 13 teachers from a total of 11 schools with various years of teaching experience (novice, 1-3 years; experienced, 4-14 years; veteran, 15-32 years), teaching locations (rural, suburban, and urban), and grade assignments (Global I/9th grade, Global II/10th grade, or both).³ The teachers invited to participate in the study were from schools close to a large western New York research university where the research team works. Each of the teachers selected was known by research team members to be aware of the changes in the state curriculum and assessment policies and interested in ambitious teaching and learning.

Table 1: Description of Teacher-Participants

Teacher  Gender  Teaching Experience     Grade Assignment  School Site
1        M       veteran/30 years        9 and 10          urban
2        F       novice/1 year           9 and 10          urban
3        M       veteran/32 years        9 and 10          urban
4        F       experienced/8 years     9 and 10          suburban
5        M       experienced/11 years    9 and 10          suburban
6        M       novice/1 year           9 and 10          suburban
7        F       experienced/7 years     9 only            suburban
8        F       veteran/23 years        9 and 10          suburban
9        F       experienced/6 years     9 and 10          suburban
10       F       novice/1 year           9 and 10          suburban
11       M       veteran/23 years        9 and 10          rural
12       M       veteran/34 years        9 and 10          rural
13       M       novice/3 years          10 only

The first stage interview protocol was administered in spring 2000, before the new exam was given. The interview guide included questions related to the opportunities teachers had to learn about the new Global History curriculum and state tests, their interpretations of and responses to these changes, and the concerns they had. Following the administration of the June 2000 exam, we re-interviewed the teachers during the fall and winter of 2001. The second interview guide focused on the teachers' interpretations of and responses to the test in light of the test administration and the experiences of their students. (See the Appendix for the interview protocol.) The interviews took approximately 90 minutes each, and they were tape-recorded and transcribed.

We analyzed the data inductively, taking an interpretivist stance (Bogdan & Biklen, 1982; Connelly & Clandinin, 1990; LeCompte, Preissle, & Tesch, 1993). From that stance, we emphasize the importance of context and the multiple ways individuals construct meaning. We also analyzed the data using a constant comparative method (Bogdan & Biklen, 1982; Glaser, 1978). That method assumes that data collection and analysis are recursive, one informing the other throughout the course of the study.

We conducted our analysis of the data on two levels. In our initial coding, we created domains or categories of meaning that surfaced across the interview data (Spradley, 1980). Those domains highlighted the relationship between tests and standards, concerns about student test performance, changes in classroom instruction in response to the tests, students' responses to the tests, the test scoring process, changes in school passing rates, and general concerns about tests and testing. Our second level of analysis consisted of reviewing the domains in search of emerging patterns. The three patterns that surfaced most frequently highlighted the teachers' perceived negative relationship between tests and standards, the minimal changes in teachers' practices, and teachers' uncertainties and frustrations with the new exam.

Reflecting on these patterns in light of the twin functions of testing (as a judgment of teaching and learning, and as a lever for increasing educational standards), we synthesized three themes. One is that teachers question whether the new global exam is an adequate measure of student learning. The second theme highlights the weak influence of the new test on teachers' classroom practices. The final theme points to teachers' sense of a disconnect between the new test and higher educational standards.

The New New York State Global History and Geography Exam

As in other states, the New York State Education Department (NYSED) has developed a new state exam under the mantle of raising educational standards. The Global History and Geography exam is intended to parallel the new state 9th and 10th grade global curriculum contained in the Social Studies Resource Guide with Core Curriculum (New York State Education Department, 1999b). (See an on-line version of the Guide at: www.nysed.gov.)

The new global curriculum features a shift from the regional and cultural emphasis of Global Studies to a chronological approach exemplified in Global History and Geography. Rather than area studies of world regions, the curriculum now features historical theme units, each set within a distinct chronological period. Unit three, Global Interactions (1200-1650), for example, includes attention to early Japanese history, the Mongols, global trade, the rise and fall of African civilizations (e.g., Ghana, Mali, Axum), the Renaissance and Reformation, and the rise of nation-states in Europe. Across the 28 single-spaced pages of the global curriculum is a predictable list of world history items running from the ancient world until today. Content-specific questions, instructional suggestions aimed at teachers, and lists of suggested documents are included.

Teachers first saw the intended changes in the state assessment policy through the Regents Examination in Global History and Geography Test Sampler. Available to teachers over the state education department website (www.emsc.nysed.gov/ciai/pub.htmlttcato), the test "sampler" offered examples of test items, scoring rubrics for the open-ended items, and samples of students' work.

The sampler and the ensuing new exam represent several changes over the former Global Studies exam. One set of changes concerns the multiple-choice questions. Although it is difficult to discern any fundamental changes in the nature of the multiple-choice questions (Grant, 2001b), where they were once grouped by region (e.g., China, Africa), now they are arranged in chronological order. Two more multiple-choice questions (50 rather than 48) have been added to the new test. More significant are the changes evident in the essay section of the test. Previously students wrote three essays from seven thematic essay prompts. Now students write two essays, one of which is a thematic essay; students have no choice, however, on which topic to write. The second essay, a Document-Based Question (DBQ), is new to most students.⁴ A DBQ consists of two parts. The first is a series of up to eight source documents, such as quotations, political cartoons, and graphs, related to a single topic. Following each document are one or more questions that probe for the main idea. After completing these "constructed response" questions, students respond to an essay prompt, drawing upon the information from the documents and their prior knowledge. On the first Global History and Geography exam, the DBQ documents reflected a range of views about communism and capitalism. The essay prompt asked students to describe how the two economic systems attempt to meet the needs of the people and to evaluate how successful each system has been at meeting people's economic needs.⁵

Both essays, thematic and DBQ, are graded using a state-developed rubric from 0-5 points.⁶ In the past, a total score of 65 signified a passing grade. Sixty-five is still the passing score, but school districts may set the mark at 55 for up to two years as a transition period. One last change is worth noting: Where the ratio of points between the multiple-choice and essays on the Global Studies exam was 45% to 55%, that ratio is reversed on the new exams.⁷

Were it not for two other changes, the new Global History curriculum and exam might not have stirred much interest. New York state policymakers routinely offer new curriculum frameworks and revise the state tests. However, the new revisions stand out in two ways. One is that policymakers couch the need for new policies in the language of higher educational standards. The second difference is that policymakers have raised the stakes for test failure.

References to higher educational standards and more ambitious teaching and learning echo throughout the new curriculum and test materials. The authors of the 1996 draft curriculum⁸ state the case plainly: "New York State is engaged in a serious effort to raise standards for students...Classroom teachers...must bring reality to the teaching and learning process in order to assure that all of their students will perform at higher levels." Foremost in the "strategy for raising standards" is "setting clear, high expectations/standards for all students and developing an effective means of assessing student progress in meeting the standards" (New York State Education Department, 1996, p. 5).⁹ In the 1999 version of the guide, teachers are told that they must develop "rich, engaging, and meaningful social studies programs" (New York State Education Department, 1999b, p. 3). In the introduction to the test sampler, references to the essay tasks support the need for more substantive instruction (New York State Education Department, 1999a). The authors note that, on the thematic essay, "students are asked to compare and contrast events, analyze issues, or evaluate solutions to problems" (p. 1). Document-based questions, they maintain, "require students to identify and explore multiple perspectives on events or issues by examining, analyzing, and evaluating textual and visual primary and secondary documents" (p. 1). At the end of the introduction, the authors ask teachers to work through sample student essays and respond to a set of questions designed "to help teachers plan for instruction" (p. 2). Most questions are procedural, as in "to what extent did the students follow the guidelines included with each question type?" The last question, however, supports the language of more ambitious instruction evident in the Resource Guide: "What opportunities do K-12 students have to engage in a social studies instructional program that includes writing in the content area, using documents of all kinds, and engaging in activities requiring higher-order thinking skills?" (p. 2).

Interestingly, neither the curriculum guide nor the test sampler refers to the second big change: All students must now pass the presumably more ambitious Regents exam in Global History in order to graduate from high school. For non-special education students, there is no other path to graduation. This change, adopted by the Board of Regents before the curriculum and assessment changes, does away with what many argued was a two-tiered system. Under that system, students could opt to take the Regents exam or the much easier Regents Competency Test (RCT). Students passing Regents exams in their core academic subjects earned a higher-status Regents diploma, while students passing RCT exams received local diplomas. Arguing that the RCT exams and the teaching in non-Regents courses were substandard, the Board of Regents declared that all students would now be expected to take more challenging courses and more challenging Regents exams.

This policy change clearly ratchets up the stakes for students and teachers (Grant, 2000; Grant, et al., 2001, 2002), in that the message teachers typically receive is that tests are intended to drive change (Grant, 1997). For example, a New York State Education Department representative told one group of teachers and administrators that new tests will "help grow change in the system." A different NYSED representative told a different audience, "New assessments will represent a change in instruction...Kids won't perform well until [teachers'] instruction reflects this." At a third meeting, NYSED Commissioner Richard Mills added, "Instruction won't change until the tests change" (p. 271).

The new New York state curriculum and assessment policies direct teachers to make substantive changes in their classroom practices. In order to develop activities that require "higher-order thinking skills" and programs that are "rich, engaging, and meaningful," policymakers suggest that teachers must do more than tinker at the edges of their practices. Doing so, they seem to argue, is not only good teaching, but also is essential if students are to pass the new, more rigorous exams.


Linking higher standards with higher stakes makes policy sense. If teachers need to raise their teaching and learning sights and if state-level tests drive teachers' practices, then one means of encouraging more ambitious teaching and learning would be to combine new, more ambitious curriculum guides and tests with more pressure on teachers and students to perform. While the empirical connection between tests and teachers' thinking and practice is complex (Cimbricz, 2002; Clune, 2001; Grant, 2001a), state policymakers in New York and across the country hold out hope that higher stakes will provide the needed incentive to drive forward a system that seems stagnant (Heubert & Hauser, 1999; Natriello & Pallas, 1998; Smith, 1991).

Raising Stakes Need Not Mean Raising Standards

In this section, we describe and analyze the three themes that surfaced from our analysis of the interview data. The first is that the teachers see the new Global History and Geography test as a poor measure of student learning. The second theme is that, while the tests have inspired teachers to make some changes in their classroom practices, those changes occur more at the margins than at the center of their teaching. The final theme is that the teachers see a disconnect between the new test and the raising of standards, and this disconnect has caused teachers to become more frustrated than satisfied. While a case can be made that New York state policymakers have raised the stakes for teachers and students, the teachers in this sample suggest that policymakers have failed to promote higher-level learning among students, more ambitious instruction, and the raising of educational standards.

Tests as a Poor Measure of Learning

The teachers in this study wondered if the goals set out by the New York State Education Department for the new state assessment were really being met. According to the authors in the foreword of the 1996 draft curriculum, "Assessments are simultaneously ends and beginnings; they serve both as benchmarks to ascertain what and how well students are learning and as springboards for further teaching and learning" (New York State Education Department, 1996, p. x). Teachers questioned whether the state assessments truly matched the curriculum, whether the tests adequately reflected the knowledge and skills students attained from the entire two-year curriculum, and whether the scoring system was sufficiently rigorous.

Questioning the Mismatch Between the Test and Curriculum

One rationale for developing high-stakes tests is to promote higher levels of thinking and to measure student learning. One challenge New York state test makers faced was designing an exam that assesses students over the two-year curriculum beginning in ninth grade. The teachers in our study argued that the challenge was not met: They asserted that the test content strongly favored the tenth grade curriculum, and therefore did not sufficiently reflect student knowledge of Global I material. A veteran teacher said, "The test is tenth grade oriented...I'd say it's probably running about three-fifths or better of tenth grade." Another experienced teacher concurred:

[The state's] two-year quote-unquote cumulative ninth and tenth grade information was a joke. It was like four or five ninth grade questions on there, that, even if you didn't have ninth grade, you could have probably answered anyway. It was mostly tenth grade...everything was pretty much everything we did in tenth grade.

While most teachers commented on the tenth grade emphasis in the multiple-choice questions, some teachers also expressed their surprise about the modern history focus of the essays. A veteran teacher said she was puzzled by the topic of the document-based question (DBQ) essay:

The DBQ surprised me intensely because I really thought that it would be based on something that would be more encompassing of Global I and Global II, [for example,] religions, the impact of religions, [and] conflicts. But it wasn't; it was comparing communist and capitalist economies, which is strictly Global II. That surprised me.

This teacher also noted the modern history emphasis in the thematic essay:

...the thematic essay was based on human rights, and while that could technically cover Global I and/or II, probably 99 percent of the students are going to apply it to Global II again, because they're going to look at the Holocaust, they're going to look at apartheid. They're not going to look back to Global I. So I would have liked to have seen essays with a broader scope, to cover more of I and II.


To have both essays and the vast majority of multiple choice questions focused on modern events raised questions for these teachers, such as what purpose the ninth grade curriculum should serve in terms of students' learning. After noting the heavy attention to modern events, a veteran teacher questioned the overall value of the ninth grade curriculum for students as well as teachers:

I would say first of all, on the multiple choice it's heavily weighted toward tenth grade, which is understandable and that's probably what the students' best chance of recall is. [But] it makes a ninth grade teacher wonder really what they're doing.

This teacher and others revealed mixed feelings about the distribution of course content on the exam. On the one hand, they said this approach benefited students because recent material was likely most fresh in their memories. At the same time, ninth grade teachers perceived a diminished respect for their efforts and for the importance of the subject matter they teach.

Compounding the teachers' uncertainty about the disconnect between ninth and tenth grade material was the omission of a number of topics and themes they viewed as important to student understanding of global history. A veteran teacher wondered why world religions were not included on the exam:

[There was] very little on the major religions of the world, which was always like a big part of the whole curriculum. I mean, how do you talk about world history and not talk about world religions, to understand what is even going on in the world today?

An experienced teacher also noted the absence of a commonly taught era, World War II:

Sometimes they're missing some of the key things...I don't think there was a World War II question; I don't remember anything on World War II. There was nothing on any of the major conflicts which really shaped the world and had the most influence in the world.

There is no question that the curriculum is comprehensive and that all themes cannot be assessed on a single exam, but for many teachers, major topics were missing, and thus they wondered if the exam was a true reflection of the prescribed two-year curriculum.


Questioning the Scoring Process

The strong emphasis on tenth grade content caused concern among the teachers in our study, but so too did the scoring process. Teachers argued that two elements of the scoring process resulted in dramatically higher scores.

One of those elements was the score conversion chart teachers were instructed to use. Following that chart, students could pass the exam without amassing a single essay point while answering correctly as few as 72% of the objective questions (Grant, 2001b). Taking note of this oddity, an experienced teacher said, "You could do really well on Part One and not have to write the essays, and you could still pass the exam." Teachers also questioned a second element of the scoring process, the rubrics. A veteran teacher pointed out that the state's diminished expectations urged teachers to award maximum points for limited work:

The way we graded the essays was really so ridiculous. Anybody, even if they wrote a couple of historical words, we could give them a couple points. It didn't even have to make sense. The rubric was designed so that everyone got some sort of points. You could basically find the points so that everybody got a two or a three out of five.

A novice teacher concurred, calling the process "voodoo magic" or "divining...throwing sticks when coming up with a grade." The level of frustration with the scoring process was best expressed by a novice teacher who recalled that one student with poor attendance still passed the exam. She stated, "I had one girl that didn't come for 113 days, and she got a 57 (a score of 55 is passing in the district). Now, do I think she knew 57 percent of the material for social studies? No. But, I mean she got the grade." Clearly aggravated, these teachers seemed to indicate that their efforts toward ambitious and challenging instruction were being undercut by a scoring system that permitted almost any student to pass the exam, regardless of his or her level of engagement with the material.

Questioning Whether the Tests Measure Depth of Learning

For the teachers in this study, the new Global History and Geography test did not effectively measure depth of student knowledge and understanding. One veteran teacher best summed up the views of most teachers in the study about this issue:

I don't think the test in any way, shape, or form evaluates what I value about Global Studies, whatsoever. I think that because...I like to challenge the students to think and to look interpretively at different approaches and different ways...and then we give them this black-and-white, cut-and-dried test that says, 'This is the way to evaluate what you know.' And I don't think it does because I think they know so much and they're capable of learning and understanding so much. But do we allow them to learn? Do we allow them to explore? No, we don't.

Tests as a Weak Influence on Teachers' Instructional Practices

Tests do matter. How they matter and to what degree, however, is an issue of considerable debate (Cimbricz, 2002; Clune, 2001; Grant, 2001a; Miller, 1995). Despite a lack of empirical evidence, state policymakers appear convinced that new, high stakes tests will drive more ambitious curricular and instructional practices (Fuhrman, 2001; Shepard, 1991; Stake & Rugg, 1991).¹⁰ The teachers in our sample do attend to the new global test, but in what direction are their efforts moving? In this section, we argue that teachers have made few specific content or instructional changes in their practices based on their readings of the new Global History test, that the changes they have made generally lie at the surface of their practices, and that those changes are keyed more to raising test scores than to improving teaching and learning. The classroom changes that these teachers are making in response to the new exam, then, seem more conservative than ambitious.

The Limited Effects of Tests on Teachers' Practices

While the debate over high stakes testing generally concerns student test performance (Heubert & Hauser, 1999; Natriello & Pallas, 1998), New York state policymakers believe new tests will promote important changes in teachers' practices (Grant, 1997). The teachers we interviewed questioned this assumption. A veteran teacher said:

We teach them to a test. And I think that is a drastic, just very, very sad testimony to the education in New York State. We don't encourage learning. We encourage teaching to a test. So, as soon as these kids are done with this test, do they go on to question? Do they go on to desire to know more? No, they don't because the test is out of the way. They don't care. And I think that's a shame; I think that's such a waste.

For this teacher and others, the new test acted less as a catalyst for more powerful teaching and learning than as its opposite: pedantic teaching and rote learning. Moreover, she doubted the state's claim that "real teaching shifts continuously in response to the needs of students as they strive to understand the content and to demonstrate their understanding in a variety of assessment contexts" (New York State Education Department, 1996, p. iv).

The teachers we studied were making changes in their curriculum, instruction, and test preparation, but those changes seemed more additive than substantive, and designed more to raise test scores than to stimulate powerful learning.

Tinkering with content. As we reported in studies conducted prior to the new global test (Grant, et al., 2001, 2002), virtually all of the teachers said they followed the state curriculum change from the regional and cultural focus of Global Studies to the chronological, history-based focus of Global History and Geography. Several expressed doubts about this move, but they made the change nonetheless. Curriculum changes in response to the first global history exam are, by contrast, much less apparent. Some teachers in this study seemed simply to be tinkering with their content, while others were still wondering what the changes in the state exam really meant.

Teachers in this study worried about covering the thousands of people, places, and events listed in the curriculum, but we could detect no profound changes in the content they taught. As a veteran teacher noted, "The curriculum is basically still the same. They formatted it a little different, but the information is basically still the same in what you have to convey to the students." Teachers said they were tinkering with the amount of time they spent on each unit, but only three reported making any substantive changes. A novice teacher talked about reducing the amount of time she spent discussing the rule of Attila the Hun: "We only spent a day or two on it, but I can't spend the time to explore that because I really...I know it's not on the exam and I can't waste the time." Another novice teacher reported decreasing the attention to art and literature in her second year of teaching. She said that she had not eliminated such studies, but that they played a lesser role in her teaching:

I do very little exploration of art and literature, which my first year of teaching I did a lot of, and it's something I really enjoy. I shouldn't say that. I do very little art anymore. I still do some literature, poetry in particular. But, the art thing - it's not there on the exam and it's not in scope and sequence, and as much as I love teaching it, I just don't anymore.

In direct response to the thematic essay on human rights, an experienced teacher said, "I'm definitely spending some more time on human rights and attempting to not spend as much time on earlier units in the ninth grade, because there was one question on Rome and we spent, you know, three weeks on Rome. So, I've tried to eliminate some of that."

This experienced teacher was also the only one to assert that she was changing her curriculum in response to a second test phenomenon: the prevalence of 10th grade material covered on the exam. The new exam features 31 questions covering modern times, compared to 17 related to earlier periods (Grant, 2001b). This disparity seems odd in light of the fact that the ninth grade material is given exactly the same amount of attention in the curriculum guide as the tenth grade material (14 pages each). Every teacher we interviewed noticed the predominance of test questions on the modern era, but only the teacher described above said that she would make any explicit changes in her content to reflect that situation.

While other teachers noted the emphasis on 10th grade material, none would commit to making wholesale changes. In fact, several teachers seemed uncertain about what sense to make of the new exam. This point came through in a veteran teacher's musings about the relationship between the curriculum and the test:

This exam was all tenth grade and that was sort of surprising. If you're just gonna teach from the French Revolution on...I mean, if that's all you're gonna test on, that's all you're going to have to teach them...If you have been spending 10 weeks on prehistoric and early River Valley civilizations, why, if they're not on the test? Not that you should only learn what you're going to be tested on, but I mean I guess...you wouldn't put the emphasis on it that you do. You could probably sort of move a little faster through it. I mean, at this point you're afraid not to spend a considerable amount of time on the four earliest river valley civilizations, because it's in their (the state's) resource guide - and anything that's in the resource guide is game for them to ask - well you could sort of condense it and move a little faster to get to the part that they're gonna be tested on.

The tentative nature of this teacher's comments is noticeable. She is a well-regarded teacher who actively participates in both local and state-level professional development opportunities. She appeared to care deeply about the content and her students. And yet, she struggled to understand state curriculum and assessment policies that offered conflicting positions.


Adding-on test practice. As with their content decisions, teachers attributed no profound changes in their instruction to the new global test. Some teachers continued to make ambitious changes in their teaching (Grant, et al., 2002), but they attributed those changes to their own quest for improvement rather than to the test, which few saw as promoting higher standards. Teachers' instructional responses to the test, then, consisted almost entirely of adding on practice test questions and teaching test-taking strategies.

New York state teachers have long used Regents-like questions in their instruction. Introduction of the new Document Based Question meant that these teachers used more DBQ-style questions during their teaching units, but they seemed to do so more as an additive measure than as a wholesale change in the way they taught. A veteran teacher said, "The only real difference I could see in what we're doing is [that we are] working on these Document Based Questions." An experienced teacher concurred: "I have done a few more DBQs. I did some last year - I did quite a few. But this year I've tried to make sure that I do one with every unit." The one exception to this pattern was a veteran teacher who reported learning at a professional conference that the state exam would put more emphasis on the multiple-choice section than on the essays. Taking this advice seriously, he asserted that he gave substantially more attention to multiple-choice question practice than to DBQs:

I concentrated on the multiple-choice questions...I had been told to do it that way at a conference I was at, so I had to make a move in what I consider the right direction. And it did pay off. No question about it. It became very familiar. I'm not saying we avoided the essay, but we could be certain, almost certain what type of question you would have on the multiple choice.

Adding-on practice in multiple-choice questions was a strategic move on this teacher's part. He realized that students' performance on the essays counted for far less than their performance on the objective portion of the exam, and so he made a deliberate instructional change. Like the other teachers interviewed, however, he denied making dramatic changes in his teaching practice. His relatively traditional instruction based on lecture and recitation remained in place; the difference was the layering-on of test question practice.

Along with adding practice questions, several teachers talked about teaching test-taking strategies, especially essay writing. A novice teacher said: "We concentrated a lot on essay writing, because I felt that covered both the content they needed to review and the skills of essay writing that the Regents exam would be expecting." An experienced teacher noted the poor writing skills her freshmen brought to class and her efforts to help students organize their thoughts:

I think if they can get more organized in their writing format, and hopefully they'll have the content information there in the rubric form, then they should be able to do quite well. The freshmen are very weak there so you've got two years to try to get them shaped up.

A veteran teacher focused more on skills related to reading historical sources, pointing out features of documents, and questions to pose to oneself while reading:

So what I'm doing is giving them the actual document as written. And when they will ask, "Well I don't understand what this means," then we'll take it apart and say, "Look at the context. What is the major point that is being made there? If you take that one word out, what does the sentence say?" So it's almost going back to basic English structure, sentence structure. I've been finding myself putting more and more emphasis on the use of primary sources, on the use of documents, interpretive skills, and even in the classroom, trying to encourage more brainstorming of ideas - to give them a quote, and asking, "What do you see the author meaning by this? How does it relate to you?"

Pushing students to consider the context of a document and its author's intention promotes an entirely different level of understanding than does traditional textbook study. And the fact that this teacher's actions arose in the context of the state exam is noteworthy. But so too is the fact that she stood alone: She appeared to be the only one of the sample to make an ambitious change in her teaching that was attributable to the new test. Other teachers seemed to be doing ambitious things in their classrooms, but none attributed those actions to the exam.

Little change in test review. New York state teachers typically schedule a test review period just before the June Regents exam. Wondering if the new test would encourage teachers to allot more time for test preparation, we inquired into differences between their current and past practices. While the amount of test review time varied considerably, only one teacher said that he devoted more time to test review than he had in the past.

Sample teachers reported a range of 2-6 weeks for test review. Teachers said that they were using some new review materials (Prentice Hall has published a set of review books that include sample DBQs that several teachers use), but they were not adding significantly to the review period they had scheduled in the past. An experienced teacher explained: "We spent the same amount of time (four weeks). The only thing different was just working on some more documents."

The only teacher to report an increase in time for test review was the veteran who decided to focus on multiple-choice questions. This year he not only focused more directly on multiple-choice questions, he also began reviewing almost a month earlier than he had in the past:

Normally, we had review classes maybe starting in late May, but they started much earlier [this year]. And I think we probably went into more detail compared to other years. Other years we concentrated on the essay to a degree too. But this year we didn't do it nearly that much. Time-wise I guess we definitely spent more time this year.

The Minimal Influence of Tests on Teaching

Common in the testing literature is the claim that high stakes tests automatically suppress powerful teaching and learning (see, for example, Madaus, 1991). Critics suggest that curriculum becomes significantly narrowed, instruction becomes increasingly pedantic, test preparation dominates classroom time, and test score fever sets in (Corbett & Wilson, 1991; Smith, 1991). We saw some evidence of these tendencies in the teachers we studied. Overall, however, we detected no profound change in the teachers' practices. These teachers were not making ambitious changes in response to the test; the changes they were making tended to be superficial and/or additive. In this light, the new global exam adds up to far less than the powerful influence New York state policymakers expected. But it also adds up to something less severe than critics of high stakes testing would assert.

The slight narrowing of content, the addition of test question practice, and the teaching of test-taking strategies make sense given the new testing context. Given the persistent state-level talk about the need to raise standards, the need for a new test format, and the need for all students to take and pass the presumably more rigorous Regents exam, it is no surprise to see teachers making classroom changes designed to raise students' test scores. That they made neither the ambitious changes policymakers predicted nor the deadening changes test critics predicted is curious. Those who study educational reforms often cite risk, resistance, and bureaucracy as explanations for the uncertain and glacial pace of classroom change (Fullan, 1993; Sarason, 1990; Tyack & Cuban, 1995). We do not doubt the importance of such notions, but we propose that two context-specific factors may matter as much or more.

One reason these New York state teachers' practices may have changed little is that their fears subsided as they learned more about the new test. Teachers who participated in the state-sponsored practice scoring sessions universally reported feeling less anxiety when they realized that their expectations for student performance on the essays were higher than the state's (Grant, et al., 2001). Teachers continued to worry about the actual exam, but most told us that they felt confident that their students would do well.

A second reason for the slight incidence of changed practice is that teachers received very little in the way of professional development designed to promote more ambitious teaching and learning. Changes in the global curriculum and test began in the mid-1990s. While the professional development specialists during that time told teachers about the changes, they did little to challenge teachers' prevailing practices (Grant, 1997, 2000). The problems of weakly designed and enacted professional development are well noted (Darling-Hammond & McLaughlin, 1996; Novick, 1996; Smylie, 1995). Among those problems are the messages broadcast during professional development sessions that urge change, but fail to develop and support the means to enact those changes (Grant, 1997). With little beyond rhetoric, it makes sense that teachers would go slowly in making pedagogical changes.

The Disconnect Between High Stakes Tests and Raising Standards

While trying to make sense of the highly touted and long anticipated test, teachers time and again pointed to the disconnect between what they anticipated and what arrived in the form of the new global test. Overall, there was a belief that the test itself was easier than expected and that the scoring process contributed to that belief. If the new test was meant to institute higher educational standards, this connection was not evident to the teachers in our study.

Teachers' uneasiness with the new state assessments has shifted focus. In our earlier study (Grant, et al., 2001), teachers expressed deep concerns before the administering of the new exam about possible high student rates of failure because of the perceived emphasis on raising standards. This fear seems to have subsided considerably, given the surprising passing rates in schools across the state. Instead of declining, the statewide passing rate for the ostensibly more difficult Global History and Geography exam actually rose by close to 10 percentage points.

An Easier Test?

A strong consensus emerged among the participants that the test questions did not meet their expectations of a more challenging assessment. A veteran teacher remarked, "I felt that the actual test was somewhat of a cakewalk, that...I personally had anticipated something very difficult, something very complex." Several teachers said that the new exam reminded them of the Regents Competency Exams (RCT) taken in years past by students as an alternative to the more difficult Regents test. An experienced teacher observed that the multiple-choice questions, in particular, "were like RCT multiple choice...very generic."

Several teachers also argued that the DBQ was easier than they anticipated. Common across their observations was the sense that the documents were fairly simple to interpret and that the essay prompt, comparing views on capitalism and communism, was too shallow. An experienced teacher noted that the availability of documents helped students begin their writing, but that the particular prompt was not as challenging as it could have been. "[The documents] do get the juices flowing - I guess - for thought," he said, "...but of all the things to write on, I think that's pretty, that was really simplistic."

A few teachers characterized the test questions as "fair." Of the multiple-choice questions, one veteran teacher said, "I think it's a fair test....It covered the material we expected." Another experienced teacher echoed this sentiment about the DBQ, saying that she thought it also was fair. These teachers appreciated the fact that there were no "tricks" or unduly difficult questions on the exam. Each added an important caveat, however: Though fair, the test was also a lot easier than they anticipated. The teacher who called the DBQ "fair" also stated, "I didn't think you had to have much knowledge of capitalism or communism to answer the question. It was all there for you." The first teacher quoted above added, "We had a few students who should actually not pass with the effort they made throughout the year. But the exam was easy enough that they were able to get through it." An experienced teacher summed up this mixed sentiment: "I was happy with [the test], I thought it was fair...[but] I think it was maybe easier than some of the stuff we had seen ahead of time."

While a few teachers offered the view that the test was fair, more seemed annoyed and/or angry that the test did not meet their expectations. A veteran teacher, active in state social studies circles, reflected the feelings of several teachers in expressing her frustration with the exam:


I think teachers throughout the state were appalled when they saw the exam...I heard that many teachers wrote when they filled out the evaluation sheet...that it (the test) was almost like an insult.

Another veteran teacher recalled her surprise at the exam's simplicity as she distributed it to her students: "You pass them out, and then while they're working on it, you're standing there and going through it and thinking, 'You've got to be kidding me. I mean you (the state) made me nervous for four years prior to this?'"

An Easier Scoring Process

If the test questions seemed easy, so too did the scoring process. As noted earlier, teachers believed the scoring system of the new exam did not adequately reflect students' higher-level learning and failed to promote higher standards. An experienced teacher reflected the observations of several teachers in noting how the scoring rubric gave away too much:

The rubric they used for grading it: If the student restated the question, it's one point. If he restated the question and kind of hinted at any answer, it was minimally two points. So, it was almost impossible for [students] not to really do well. I think it was designed for them not to fail.

Teachers noted the difference between their expectations for student work and the grading standards expressed by the state. Ironically, in the teachers' estimation, the state seemed to promote lower standards than they did:

I feel that our grading standards here [at school] were much more rigid and much higher than what the state expected. And in fact if you look through the grading rubric from the state, even to get a four, they will still allow some incorrect information. Well, if you're giving a four out of a five, as far as I'm concerned, there should be no incorrect information in that.

A novice teacher was even more blunt in her assessment of the scoring process:

The way we graded the essays was really so ridiculous, anybody, even if they wrote a couple of historical words, we could give them a couple points. It didn't even have to make sense. The rubric was designed so that any, everyone got some sort of points. You could basically find the points so that everyone got a two or a three out of five.

The combination of easy test questions and a scoring process that seemed to unduly reward students suggested to several teachers that the exam was purposefully designed to ensure a high passing rate. A comparison of the state passing rate for the old Global Studies exam with the new Global History and Geography exam lends credence to that view. Whereas 60.9% of tenth graders passed the old Global Studies exam, 68.5% passed the new, presumably more difficult Global History exam. This increase in test score performance is more remarkable given that virtually all 10th graders took the new exam as schools moved to drop the RCT option. A veteran teacher expressed the prevailing sentiment that state officials deemed passing scores more important than a sound exam. Thinking aloud as a state policymaker might, she said, "We better be careful because we want the first year's results to be really, really good, so we can say, 'Look, we raised the standards - that the students rose up and met the challenge head-on. Therefore, we have proven that our efforts are successful.'" An experienced teacher was more blunt: "It's dumbing down the content so they (the state) get higher numbers...You can do anything you want with numbers, but it doesn't show that [the students] are gaining."

New Tests Do Not Equal Higher Standards

Skepticism about the state's efforts abounded, as teachers strove to make sense of the new global tests. Growing increasingly frustrated at the disconnect between the message being sent (that of raising the state standards) and the message received (that of an assessment fraught with scoring loopholes and less-than-ambitious questions), teachers questioned the relationship between higher standards and the new exams:

No, I don't think [the test] reflects higher standards. But they're [the state] going to be getting different numbers that they'll be happy with to say that everything is working. But the kids don't know as much or aren't doing as well as they could be doing.

We [the teacher's colleagues] sat there and chuckled and said, "Oh yes! We are raising standards, aren't we?" A billy goat can pass them, but that's okay.


I don't think [parents] have a clue as to what the test actually consisted of. And I think if parents actually sat down and went through this test and saw what was being done - it's almost laughable. I think they would be appalled...if someone asked me, "Do you feel the state is, in fact, raising the standards, and is this test a valid, accurate assessment of higher standards?," I could not lie. I would honestly say, "No, I don't feel it is."

Stake (1991) notes that teachers have long been suspicious of state claims that tests will raise student achievement, arguing that teachers "have essentially no confidence in testing as the basis of the reform of schooling in America" (p. 246). The teachers in this study agreed. Given the steady talk about the need to raise educational standards and the performance of students on the state test, none of the teachers expected the test questions to be less difficult than the old exam. While a couple of teachers termed the new exam "fair," virtually all talked about how "easy" it seemed. Expecting the new exam to challenge both them and their students, they said that it had done neither. Frustrated with the test questions and scoring process, and skeptical about the state's role in constructing the exam in this fashion, teachers discounted the connection between the new test and higher educational standards.

"A Sad, Sad Case": Teachers' Frustrations With the New Global Exam

The teachers in this study evidenced mixed feelings about the Global History and Geography exam and its relation to higher standards. On the one hand, they were pleased - and perhaps relieved - with high passing rates and the success of their students. On the other hand, most teachers questioned the validity of the test as a means to measure students' learning, as a lever to advocate more ambitious teaching, and as a vehicle to raise standards. An experienced teacher summed up the feelings of virtually all those we interviewed in her belief that the test was more frustrating than constructive:

I think it's a sad, sad case that after two years of studying in the great detail and depth that we do, that you have 50 questions and two simple, little essays that someone not even taking the course probably could watch the Discovery Channel a couple of weeks, and get through the essays. I'm not impressed with it at all.


Curious as to what all the fuss over higher standards has been about, teachers in the study seemed to wonder whether the state had not taken a step backwards.

Implications and Conclusions

Changes in state policy mean that state-level testing takes on added importance in New York state teachers' and students' lives just as it has in the lives of teachers and students across the U.S. (Fuhrman, 2001). The requirement that all students take and pass Regents-level exams in order to graduate fundamentally alters the educational landscape. Reducing high school graduation to a series of test scores can undercut the value of teachers' and students' regular school work; several teachers told us about students who failed their courses, but were eligible to graduate because they passed the state exam. Elevating the importance of test scores also increases the pressures on teachers and students, in that graduation stakes could not be higher: Either pass a test or fail to graduate. These implications are not unique to New York: A recent survey of social studies teachers (Burroughs, 2002) confirms that test pressures are increasing nationwide.

New York state policymakers justify the ratcheting up of graduation stakes largely on the assumption that higher standards will follow: that teachers will teach in more robust ways and students will learn in deeper and richer ways. This is indeed a powerful assumption, but what if it does not hold up? New York state students passed the first global exam with seemingly little difficulty, yet teachers' questions about the difficulty of the exam and the attendant scoring process undercut any real sense of accomplishment. Shepard (1991) argues that high stakes testing may mean that it is "possible to raise test scores without increasing learning" (p. 233). We ask, then, what if raising the stakes fails to raise standards?

Two possible implications surface from this study, one positive and one negative. A positive outcome may be that teachers become much less concerned about focusing so heavily on the test. None of the teachers in this study were yet willing to believe that the test could ultimately be as easy as this first administration, but if this proves to be the case, teachers may cut back on or abandon both some of their anxieties and their more explicit test preparation activity. History teachers are unlikely ever to believe that they have sufficient time to teach their content. However, reduced concern about the difficulty of the state exam could prove salutary, as they may feel that they can teach more intellectually engaging material.

Make no mistake, however: We argue that teachers who do make more ambitious pedagogical moves, overall, will be doing so in spite of rather than because of the state's efforts. Most of the teachers in this study did not tell us that they were doing more engaging work as a result of the new test; moreover, several suggested that they felt some pressure to scale back on their current efforts. The positive potential of an easier state test, then, will develop only if teachers seize the initiative (Grant, in press).

That positive potential may be complicated by a second implication: teachers' increasing skepticism of the state education department. As their frustrations grow, a scenario can develop in which it becomes more and more obvious that the state and the teachers distrust each other. On the state's distrust of teachers, McLaughlin (1991) observes, "Ironically, accountability schemes that rely on existing testing technology trust the system (the rules, regulations, and standardized procedures) more than they trust teachers to make appropriate, educationally sound choices" (p. 250). Percolating through this study is evidence of a corresponding mistrust. After enduring years of rhetoric that accuses them of failing to meet high standards, teachers are understandably aggravated by state efforts that undercut their instructional ambitions and look like a lowering of standards.

Billed as a means of holding students and teachers accountable and as a lever for pedagogical change, the new high stakes New York State Global History and Geography exam may do the former, but seems hard pressed to do the latter. The new test is different from the old in some ways, but those differences, teachers in this study assert, have made the test less challenging rather than more. Teachers whose tendency is toward more ambitious teaching and learning are likely to breathe more easily, but so too are those teachers who only modestly challenge their students, for it appears that the high stakes that the state seemingly has created may pose no particular difficulty. When teachers perceive that they hold higher expectations of their students than the state does, the relationship between high stakes and high standards may well dissolve.

Notes

Messick (1988) offers a helpful review of the many uses tests may serve. While our purpose in this paper is to focus on the use of tests as a lever for higher educational standards, it is worth noting that the empirical evidence suggests that teachers question the traditional function of tests as a means of assessing student understanding (Shepard, Flexner, Hiebert, Marion, Mayfield, & Weston, 1996).

The first year sample included 16 teachers from 13 schools. During the second year of the study, one teacher moved out of state and two others were reassigned such that they no longer taught global history; thus the second year sample was 13 teachers.

The reason for the qualifier "most" is that the DBQ has been a standard feature of Advanced Placement exams for many years.

See Grant (2001b) for an analysis of the first global history DBQ.

The constructed response questions are graded on a two-point scale: 1 for an acceptable answer; 0 for a missing or unacceptable response.

Or so the policymakers assert. A recent analysis suggests the essays are actually worth only about a quarter of the grade (Grant, 2001b).

State policymakers circulated a draft curriculum framework in 1996. The curriculum represented in the 1999 Social Studies Resource Guide with Core Curriculum is described as the final version.

The other two elements of the "strategy" are "building the capacity of schools/districts to enable all students to meet standards" and "making public the results of the assessment of student progress through school reports" (p. 5).

Much has been made of the Texas "miracle" in education, driven, some would argue, by an aggressive state testing program. Recent analyses of the Texas data, however, undercut much of the initial enthusiasm (Haney, 2000; Klein, Hamilton, McCaffrey, & Stecher, 2000).

Most teachers report covering early man through the French Revolution in grade 9 and continuing from there in grade 10.

Three additional questions span these time ranges and thus are difficult to categorize.

It also seems odd given that handouts provided at state-sponsored workshops and in the 1996 draft curriculum contain a chart of anticipated numbers of questions from each part of the curriculum. Analysis of these charts weights the tenth grade content (60%) slightly more heavily than that of ninth grade (40%).

The teacher's claims about the effectiveness of his change are undercut, however, by a review of students' test scores: While 84% of the students passed the exam in 2000, 81% had passed in 1999.

Previous to this gain, the biggest increase in state-wide passing rates had been around 5 percentage points.

There may be additional costs, as school districts are responsible for helping students who do not pass the exam.

ReferencesBogdan, R., & Biklen, S. ( 1982). Qiialitalin- reseanhfar education: An mtroduclion lo theory

and nielhods. Boston: Allyn and Bacon.Burroughs, S. (2002). Testy times for social studies.Socinl Educalion, 66(5). 315-319.Camilli. G., Ciïek, G., & Lugg. C. (2001). Psychometric theory and the validation of

performance standards: History and future perspectives. In G. Cizek (Ed.),Setting; performance standards: Conceplf, methoáf, and perspecUves (pp. 445-476).Mahwah, NJ: Lawrence Erlbaum Associates

Cimbricz, S. (2002). State testing and teachers' thinking and practice: A synthesis ofresearch. Ediicalional Policy Anatyiis Arciiivci. 10(2). Available: ht tp: / /epaa.asu.edu/epaa/vlOn2.htm1.

Clune, W. (2001). Toward a theory of standards-based reform. The case of nine N5Fstatewide systL-mic initiatives. In S. Fuhrman (Ed.), From the capítol h Ih,-classroom Standnrdi-baaeil reform in the ítatcí (pp. 13-38). Chicago, IL:University of Chicago Press.

Connelly, F. M., & Clandinin, D. J. (1990). Stories of experience and narrative inquiry. Educational Researcher, 19(4), 2-14.

Corbett, H. D., & Wilson, B. (1991). Testing, reform, and rebellion. Norwood, NJ: Ablex.

Darling-Hammond, L. (1991). The implications of testing policy for quality and equality. Phi Delta Kappan, 73(3), 220-225.

Darling-Hammond, L., & McLaughlin, M. (1996). Policies that support professional development in an era of reform. In M. McLaughlin & I. Oberman (Eds.), Teacher learning: New policies, new practices (pp. 202-219). New York: Teachers College Press.

Fullan, M. (1993). Change forces. New York: Falmer.

Fuhrman, S. (Ed.). (2001). From the capitol to the classroom: Standards-based reform in the states. Chicago, IL: University of Chicago Press.

Glaser, B. (1978). Theoretical sensitivity: Advances in the methodology of grounded theory. Mill Valley, CA: Sociology Press.

Grant, S. G. (1997). Opportunities lost: Teachers learning about the New York state social studies framework. Theory and Research in Social Education, 25(3), 259-287.

Grant, S. G. (2000). Teachers and tests: Exploring teachers' perceptions of changes in the New York state testing program. Educational Policy Analysis Archives, 8(14). Available: http://epaa.asu.edu/epaa/v8n14.html.

Grant, S. G. (2001a). An uncertain lever: The influence of state-level testing in New York State on teaching social studies. Teachers College Record, 103(3), 398-426.

Grant, S. G. (2001b). When an "A" isn't enough: Analyzing the New York state global history exam. Educational Policy Analysis Archives, 9(39). Available: http://epaa.asu.edu/epaa/v9n39.html.

Grant, S. G. (in press). History lessons: Teaching, learning, and testing in U.S. high school classrooms. Mahwah, NJ: Lawrence Erlbaum Associates.

Grant, S. G., Derme-Insinna, A., Gradwell, J. M., Lauricella, A., Pullano, L., & Tzetzo, K. (2001). Teachers, tests, and tensions: Teachers respond to the New York state global history exam. International Social Studies Forum, 1(2), 107-125.

Grant, S. G., Derme-Insinna, A., Gradwell, J. M., Lauricella, A., Pullano, L., & Tzetzo, K. (2002). Juggling two sets of books: A teacher responds to the new global history exam. Journal of Curriculum and Supervision, 17(3), 232-255.

Haney, W. (2000). The myth of the Texas miracle in education. Educational Policy Analysis Archives, 8(41). Available: http://epaa.asu.edu/epaa/v8n41.

Heubert, J., & Hauser, R. (1999). High stakes: Testing for tracking, promotion, and graduation. Washington, DC: National Academy Press.

Klein, S., Hamilton, L., McCaffrey, D., & Stecher, B. (2000). What do test scores in Texas tell us? Educational Policy Analysis Archives, 8(49), 1-15.

LeCompte, M., Preissle, J., & Tesch, R. (1993). Ethnography and qualitative design in educational research (2nd ed.). New York: Academic Press.

Linn, R. (2000). Assessments and accountability. Educational Researcher, 29(2), 4-16.

Madaus, G. (1991). The effects of important tests on students. Phi Delta Kappan, 73(3), 226-231.

McLaughlin, M. (1991). Test-based accountability as a reform strategy. Phi Delta Kappan, 73(3), 248-251.

Meier, D. (1996). The power of their ideas. New York: Beacon.

Messick, S. (1988). Assessment in the schools: Purposes and consequences. In P. Messick (Ed.), Contributing to educational change: Perspectives on research and practice (pp. 107-125). Berkeley, CA: McCutchan.

Miller, S. (1995). Teachers' responses to test-driven accountability pressures: "If I change, will my scores drop?" Reading Research and Instruction, 34(4), 332-351.

Natriello, G., & Pallas, A. (1998). The development and impact of high stakes testing. [Unpublished paper]. Available: http://www.columbia.edu/~gjn6/histake.html.

New York State Education Department. (1996). Social studies resource guide. Albany, NY: Author.

New York State Education Department. (1999a). Regents examination in global history and geography test sampler. Albany, NY: Author.

New York State Education Department. (1999b). Social studies resource guide with core curriculum. Albany, NY: Author.

Novick, R. (1996). Actual schools, possible practices: New developments in professional development. Education Policy Analysis Archives, 4(14). Available: http://epaa.asu.edu/epaa/v4n14.html.

Porter, A. (1989). External standards and good teaching: The pros and cons of telling teachers what to do. Educational Evaluation and Policy Analysis, 11(4), 343-356.

Sarason, S. (1990). The predictable failure of educational reform: Can we change course before it's too late? San Francisco: Jossey-Bass.

Shepard, L. (1991). Will national tests improve student learning? Phi Delta Kappan, 73(3), 232-238.


Shepard, L., Flexner, R., Hiebert, E., Marion, S., Mayfield, V., & Weston, T. (1996). Effects of introducing classroom assessments on student learning. Educational Measurement: Issues and Practice, 15, 7-18.

Smith, M., & O'Day, J. (1991). Systemic school reform. In S. Fuhrman & B. Malen (Eds.), The politics of curriculum and testing (pp. 233-267). New York: Falmer.

Smith, M. L. (1991). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8-11.

Smylie, M. (1995). Teacher learning in the workplace: Implications for school reform. In T. Guskey & M. Huberman (Eds.), Professional development in education: New paradigms and practices (pp. 92-113). New York: Teachers College Press.

Spradley, J. (1980). Participant observation. Fort Worth, TX: Harcourt Brace Jovanovich.

Stake, R., & Rugg, D. (1991). Impact on the classroom. In R. E. Stake (Ed.), Advances in program evaluation (Vol. 1, pp. xix-xxii). Greenwich, CT: JAI Press.

Tyack, D., & Cuban, L. (1995). Tinkering toward utopia. Cambridge: Harvard University Press.

Wiggins, G. (1998). Educative assessment. San Francisco: Jossey-Bass.

Yeh, S. (2001). Tests worth teaching to: Constructing state-mandated tests that emphasize critical thinking. Educational Researcher, 30(9), 12-17.

Appendix

Global History Research Project
Follow-up Interview
Fall, 2000

1. Please describe any efforts you made to prepare your students for the test.

Probes: Did you use any commercially produced materials (e.g., test question booklets) or materials other teachers have developed? Were you encouraged to do any particular kinds of test preparation by your administrators? How did your preparations this year compare with those in previous years? How did students respond this year, and how did those responses compare with previous years?

2. Please describe your general impressions of the global history test you administered to your students last June.

Probes: Did you read the test questions at the time? Later? How did the actual exam compare with your expectations, based on the test sampler and/or the state standards?

3. Please describe your impressions of each section of the test (take a copy with you): multiple-choice, thematic essay, short answer/documents, document-based question.

Probes: For each section, ask: What was your impression of the tasks represented? How did they compare with similar tasks on previous tests (exclude short answer and DBQ)? What were your first thoughts about how your students would handle these tasks?

4. Please describe your experience in scoring the test.

What portion(s) did you score and how was that decided? How long did the scoring take? What, if any, points of contention, uncertainty, or debate arose among the scorers? If any problems arose, how were they resolved?

5. Please describe how your students fared on the test.

Probes: What was the district passing score (i.e., 55 or 65)? What was the passing rate of your students? Of the district students? How did these rates compare with your expectations? What feedback have you received on these rates from peers, students, administrators, parents, community members? What recourse did students have who did not pass the test on the first administration?

6. Please describe your current thoughts about the test, especially in light of your teaching practice.

Probes: Based on your experiences and those of your students with the new test, what changes, if any, have you made/anticipate making in the content you teach, the instructional activities/materials you use, and/or the in-class assessments you design? What changes, if any, do you anticipate making in your preparations for the next test? What is your sense of the test as a vehicle for raising educational standards for your students?

Authors

S. G. GRANT is Associate Professor in the Department of Learning and Instruction at the University at Buffalo, New York 14260. JILL M. GRADWELL is a Doctoral Student in the Department of Learning and Instruction at the University at Buffalo, New York 14260. ANN MARIE LAURICELLA is Associate Director at the Teacher Education Institute at the University at Buffalo, New York 14260. ALISON DERME-INSINNA is a Teacher at North Tonawanda High School, North Tonawanda, New York 14120. LYNN PULLANO is a Teacher in the Catholic Diocese Schools, Buffalo, New York 14203. KATHRYN TZETZO is a Teacher in the Williamsville Schools, Williamsville, New York 14051.
