MEASURING WHAT MATTERS
Using Assessment and Accountability to Improve Student Learning

A STATEMENT BY THE RESEARCH AND POLICY COMMITTEE OF THE COMMITTEE FOR ECONOMIC DEVELOPMENT


Library of Congress Cataloging-in-Publication Data

Committee for Economic Development. Research and Policy Committee.
Measuring what matters : using assessment and accountability to improve student learning / a statement by the Research and Policy Committee of the Committee for Economic Development.
p. cm.
"December 15, 2000."
Includes bibliographical references.
ISBN 0-87186-139-9
1. Educational tests and measurements — United States. 2. Educational accountability — United States. 3. School improvement programs — United States. I. Title.
LB3051 .C632 2001
371.26'0973 — dc21
2001017186

First printing in bound-book form: 2001
Paperback: $15.00
Printed in the United States of America
Design: Rowe Design Group

COMMITTEE FOR ECONOMIC DEVELOPMENT
477 Madison Avenue, New York, N.Y. 10022
(212) 688-2063

2000 L Street, N.W., Suite 700, Washington, D.C. 20036
(202) 296-5860

www.ced.org


CONTENTS

RESPONSIBILITY FOR CED STATEMENTS ON NATIONAL POLICY iv

PURPOSE OF THIS STATEMENT viii

CHAPTER 1: INTRODUCTION AND SUMMARY 1

CHAPTER 2: MEASURING STUDENT ACHIEVEMENT 5
Issues in designing direct measures of student learning 5
Setting standards for learning 5
Designing assessments 7
Reporting scores 9
Challenges to measurement practices 10
Appropriate test use 10
Testing all students 12
Cost implications 13

Limitations of tests as measures of student learning 14

CHAPTER 3: USING ACHIEVEMENT MEASURES TO IMPROVE TEACHING AND LEARNING 15
Measurement that enhances instruction 15
Teacher capacity 17

CHAPTER 4: USING ACHIEVEMENT MEASURES TO HOLD STUDENTS AND EDUCATORS ACCOUNTABLE 19

Using performance measures to reward and sanction students 21
Using multiple measures 21
Availability of appropriate instruction and assistance 23
Motivating students to do their best 24
Using performance measures to reward and sanction educators 25
Accountability for individual teachers 25
Accountability for schools 28

Guidelines for good practice 30

CHAPTER 5: MONITORING EDUCATIONAL PROGRESS AND THE MEASUREMENT SYSTEM ITSELF 32

National and international assessments 32
Reporting to the public 33
Monitoring the measurement system 35

ENDNOTES 37

REFERENCES 39

MEMORANDA OF COMMENT, RESERVATION, OR DISSENT 42

OBJECTIVES OF THE COMMITTEE FOR ECONOMIC DEVELOPMENT 45


RESPONSIBILITY FOR CED STATEMENTS ON NATIONAL POLICY

The Committee for Economic Development is an independent research and policy organization of some 250 business leaders and educators. CED is nonprofit, nonpartisan, and nonpolitical. Its purpose is to propose policies that bring about steady economic growth at high employment and reasonably stable prices, increased productivity and living standards, greater and more equal opportunity for every citizen, and an improved quality of life for all.

All CED policy recommendations must have the approval of trustees on the Research and Policy Committee. This committee is directed under the bylaws, which emphasize that "all research is to be thoroughly objective in character, and the approach in each instance is to be from the standpoint of the general welfare and not from that of any special political or economic group." The committee is aided by a Research Advisory Board of leading social scientists and by a small permanent professional staff.

The Research and Policy Committee does not attempt to pass judgment on any pending specific legislative proposals; its purpose is to urge careful consideration of the objectives set forth in this statement and of the best means of accomplishing those objectives.

Each statement is preceded by extensive discussions, meetings, and exchange of memoranda. The research is undertaken by a subcommittee, assisted by advisors chosen for their competence in the field under study.

The full Research and Policy Committee participates in the drafting of recommendations. Likewise, the trustees on the drafting subcommittee vote to approve or disapprove a policy statement, and they share with the Research and Policy Committee the privilege of submitting individual comments for publication.

The recommendations presented herein are those of the trustee members of the Research and Policy Committee and the responsible subcommittee. They are not necessarily endorsed by other trustees or by non-trustee subcommittee members, advisors, contributors, staff members, or others associated with CED.


RESEARCH AND POLICY COMMITTEE

Co-Chairmen
PATRICK W. GROSS, Founder and Chairman, Executive Committee, American Management Systems, Inc.
BRUCE K. MACLAURY, President Emeritus, The Brookings Institution

Vice Chairmen
IAN ARNOF, Retired Chairman, Bank One, Louisiana, N.A.
ROY J. BOSTOCK, Chairman, B/com3 Group, Inc.
CLIFTON R. WHARTON, JR., Former Chairman and Chief Executive Officer, TIAA-CREF

REX D. ADAMS, Dean, The Fuqua School of Business, Duke University
ALAN BELZER, Retired President and Chief Operating Officer, AlliedSignal Inc.
PETER A. BENOLIEL, Chairman, Executive Committee, Quaker Chemical Corporation
FLETCHER L. BYROM, President and Chief Executive Officer, MICASU Corporation
DONALD R. CALDWELL, Chairman and Chief Executive Officer, Cross Atlantic Capital Partners
ROBERT B. CATELL, Chairman and Chief Executive Officer, KeySpan Corporation
JOHN B. CAVE, Principal, Avenir Group, Inc.
CAROLYN CHIN, Chief Executive Officer, Cebiz
A. W. CLAUSEN, Retired Chairman and Chief Executive Officer, BankAmerica Corporation
JOHN L. CLENDENIN, Retired Chairman, BellSouth Corporation
GEORGE H. CONRADES, Chairman and Chief Executive Officer, Akamai Technologies, Inc.
KATHLEEN B. COOPER, Chief Economist and Manager, Economics & Energy Division, Exxon Mobil Corporation
RONALD R. DAVENPORT, Chairman of the Board, Sheridan Broadcasting Corporation
JOHN DIEBOLD, Chairman, John Diebold Incorporated
FRANK P. DOYLE, Retired Executive Vice President, GE
T.J. DERMOT DUNPHY, Retired Chairman, Sealed Air Corporation
CHRISTOPHER D. EARL, Managing Director, Perseus Capital LLC
W. D. EBERLE, Chairman, Manchester Associates, Ltd.
EDMUND B. FITZGERALD, Managing Director, Woodmont Associates
HARRY L. FREEMAN, Chair, The Mark Twain Institute
RAYMOND V. GILMARTIN, Chairman, President and Chief Executive Officer, Merck & Co., Inc.
BARBARA B. GROGAN, President, Western Industrial Contractors
RICHARD W. HANSELMAN, Chairman, Health Net Inc.
RODERICK M. HILLS, President, Hills Enterprises, Ltd.
MATINA S. HORNER, Executive Vice President, TIAA-CREF
H.V. JONES, Office Managing Director, Korn/Ferry International, Inc.
EDWARD A. KANGAS, Chairman, Retired, Deloitte Touche Tohmatsu
JOSEPH E. KASPUTYS, Chairman, President and Chief Executive Officer, Thomson Financial Primark
CHARLES E.M. KOLB, President, Committee for Economic Development
ALLEN J. KROWE, Retired Vice Chairman, Texaco Inc.
CHARLES R. LEE, Chairman and Co-Chief Executive Officer, Verizon Communications
ALONZO L. MCDONALD, Chairman and Chief Executive Officer, Avenir Group, Inc.
NICHOLAS G. MOORE, Chairman, PricewaterhouseCoopers
JOHN D. ONG, Chairman Emeritus, The BFGoodrich Company
STEFFEN E. PALKO, Vice Chairman and President, Cross Timbers Oil Company
CAROL J. PARRY,* Retired Executive Vice President, The Chase Manhattan Corporation
VICTOR A. PELSON, Senior Advisor, Warburg Dillon Read LLC
PETER G. PETERSON, Chairman, The Blackstone Group
S. LAWRENCE PRENDERGAST, Executive Vice President of Finance, LaBranche & Co.
NED REGAN, President, Baruch College
JAMES Q. RIORDAN, Quentin Partners Inc.
MICHAEL I. ROTH, Chairman and Chief Executive Officer, The MONY Group Inc.
LANDON H. ROWLAND, Chairman, President and Chief Executive Officer, Stillwell Financial Inc.
GEORGE RUPP, President, Columbia University
HENRY B. SCHACHT, Director and Senior Advisor, E.M. Warburg, Pincus & Co., Inc. LLC
ROCCO C. SICILIANO, Beverly Hills, California
MATTHEW J. STOVER, President, edu.com
ALAIR A. TOWNSEND, Publisher, Crain's New York Business
ARNOLD R. WEBER, President Emeritus, Northwestern University
JOSH S. WESTON, Honorary Chairman, Automatic Data Processing, Inc.
DOLORES D. WHARTON, Chairman and Chief Executive Officer, The Fund for Corporate Initiatives, Inc.
MARTIN B. ZIMMERMAN, Vice President, Governmental Affairs, Ford Motor Company

*Voted to approve the policy statement but submitted memoranda of comment, reservation, or dissent. See page 42.


SUBCOMMITTEE ON EDUCATION POLICY

*Voted to approve the policy statement but submitted memoranda of comment, reservation, or dissent. See page 42.

†Disapproved publication of this statement. See page 42.

Co-Chairmen
ROY J. BOSTOCK, Chairman, B/com3 Group, Inc.
EDWARD B. RUST, JR., Chairman and Chief Executive Officer, State Farm Insurance Companies

Trustees
HENRY P. BECTON, President and General Manager, WGBH Educational Foundation
PETER A. BENOLIEL, Chairman, Executive Committee, Quaker Chemical Corporation
JON A. BOSCIA, President and Chief Executive Officer, Lincoln National Corporation
JOHN BRADEMAS, President Emeritus, New York University
WILLIAM E. BROCK, Chairman, Bridges Learning Systems, Inc.
THOMAS J. BUCKHOLTZ, Executive Vice President, Beyond Insight Corporation
JOHN B. CAVE, Principal, Avenir Group, Inc.
FERDINAND COLLOREDO-MANSFELD, Chairman and Chief Executive Officer, Cabot Industrial Trust
KATHLEEN FELDSTEIN, President, Economics Studies, Inc.
JOSEPH GANTZ, Partner, GG Capital, LLC
CAROL R. GOLDBERG, President, The Avcar Group, Ltd.
ROSEMARIE B. GRECO, Principal, Greco Ventures
BARBARA B. GROGAN, President, Western Industrial Contractors
JEROME H. GROSSMAN, Chairman and Chief Executive Officer, Lion Gate Management Corporation
JUDITH H. HAMILTON, President and Chief Executive Officer, Classroom Connect
WILLIAM A. HASELTINE, Chairman and Chief Executive Officer, Human Genome Sciences, Inc.
HEATHER R. HIGGINS, President, Randolph Foundation
DAVID KEARNS, Chairman Emeritus, New American Schools
COLETTE MAHONEY, President Emeritus, Marymount Manhattan College
ELLEN R. MARRAM, Partner, North Castle Partners
DIANA S. NATALICIO, President, The University of Texas at El Paso
JOHN D. ONG, Chairman Emeritus, The BFGoodrich Company
STEFFEN E. PALKO, Vice Chairman and President, Cross Timbers Oil Company
ARNOLD B. POLLARD, President and Chief Executive Officer, Chief Executive Group
S. LAWRENCE PRENDERGAST, Executive Vice President of Finance, LaBranche & Co.
HUGH B. PRICE, President and Chief Executive Officer, National Urban League
E. B. ROBINSON, JR., Chairman Emeritus, Deposit Guaranty Corporation
ROY ROMER, Former Governor of Colorado; Superintendent of the Los Angeles Unified School District
HOWARD M. ROSENKRANTZ, Chief Executive Officer, Grey Flannel Auctions
MICHAEL I. ROTH, Chairman and Chief Executive Officer, The MONY Group Inc.
NEIL L. RUDENSTINE, President, Harvard University
WILLIAM RUDER, President, William Ruder Incorporated
ALAN G. SPOON, Managing General Partner, Polaris Venture Partners
DONALD M. STEWART, President, The Chicago Community Trust
ROBERT C. WAGGONER, Chief Executive Officer, Burrelle's Information Services LLC
MICHAEL W. WICKHAM, Chairman and Chief Executive Officer, Roadway Express, Inc.
KURT YEAGER, President and Chief Executive Officer, Electric Power Research Institute

Ex-Officio Members
FRANK P. DOYLE, Retired Executive Vice President, GE
PATRICK W. GROSS, Founder & Chairman, Executive Committee, American Management Systems, Inc.
CHARLES E.M. KOLB, President, Committee for Economic Development
BRUCE K. MACLAURY, President Emeritus, The Brookings Institution
JOSH S. WESTON, Honorary Chairman, Automatic Data Processing, Inc.


SPECIAL GUESTS AND ADVISORS
MYLES A. CANE, Chairman Emeritus, Skidmore College
SAUL COOPERMAN, President, Citizens for Better Schools
SEYMOUR FLIEGEL, President, Center for Educational Innovation
HELEN F. LADD, Professor of Public Policy Studies and Economics, Sanford Institute of Public Policy, Duke University
CHRISTINA LAURIA, Vice President, Pricing Administration, Roadway Express, Inc.
LAWRENCE T. PISTELL, Summit, New Jersey
ANDREW C. PORTER, Professor of Educational Psychology, University of Wisconsin at Madison
DIANE RAVITCH, Research Professor, New York University; Nonresident Senior Fellow, The Brookings Institution

PROJECT DIRECTOR
JANET HANSEN, Vice President and Director of Education Studies, Committee for Economic Development

PROJECT ASSOCIATE
SETH TUROFF, Research Associate, Committee for Economic Development


PURPOSE OF THIS STATEMENT

Education has always been important to Americans, but for the last 20 years questions about the quality of the nation's schools have been debated with particular urgency. Policy makers, business leaders, parents, and many educators have argued that major changes are necessary to ensure that our children are prepared for the civic, economic, and social world that lies ahead of them. Key among these changes is Putting Learning First (as CED titled its last report on education): setting high academic standards and holding schools accountable for helping students reach these standards.

Schools cannot be re-oriented toward performance, however, unless they know what they are trying to accomplish and can measure progress toward these goals. As this statement argues, measuring student achievement is essential for effective school reform. CED Trustees, accustomed to managing complex organizations, are convinced that we cannot improve what we do not measure. Thus we strongly support efforts underway in virtually every state to specify academic standards and measure improvements in student learning. We welcome the spotlight these efforts shine on how well schools are serving all students, including students whose educational needs were too often neglected in the past.

At the same time, we recognize that assessment and accountability systems capable of driving school improvement are still in their formative stages. They are not perfect, and they sometimes have unintended consequences that rightly concern parents and educators. There is more to learn about how best to design and implement such systems. This report aims both to show why testing and accountability are indispensable and to explore their responsible use.

Measuring educational performance is not the same as improving it. The latter requires effort on a host of fronts, many of which are beyond the scope of this statement. As CED has noted in a series of reports dating back to the mid-1980s, the development and education of all children from the earliest stages of their lives must be a national priority. This requires better preparing children for school, providing a rich curriculum and ensuring that every classroom has a teacher with the preparation and materials to teach it, adopting school governance and management structures that highlight performance and encourage accountability, and giving school-age children the social and health supports that will help them learn in the classroom.

Educational measurement by itself won't improve America's schools. Without it, however, we cannot know how far we've come or how far we have to go to give our children the education they deserve.

ACKNOWLEDGMENTS

We would like to thank the dedicated group of CED Trustees, advisors, and guests who served on the subcommittee that prepared this report (see page vi). Special thanks go to the subcommittee co-chairs, Roy J. Bostock, Chairman of B/com3 Group, Inc., and Edward B. Rust, Jr., Chairman and CEO of State Farm Insurance Companies, who guided the project with skill and insight. We are grateful to B/com3 Group, Inc., the GE Fund, The George Gund Foundation, the Keyspan Foundation, the Smith Richardson Foundation, Inc., and State Farm Insurance Companies for their support of this project. We are also grateful to project director Janet S. Hansen, Vice President and Director of Education Studies at CED, and Van Doorn Ooms, CED's Senior Vice President and Director of Research, for their contributions, and to Seth Turoff for research assistance.

Patrick W. Gross
Bruce K. MacLaury
Co-Chairs
CED Research and Policy Committee


Chapter 1

INTRODUCTION AND SUMMARY

America cannot get the schools it needs without solid measures of academic achievement. The public—vitally interested in the quality of education available to its children—strongly supports testing. (See Figure 1.) At the same time, there is debate over the increased emphasis being placed on tests and test results. At one extreme, critics allege that tests are harming rather than helping students; some would be happy to eliminate tests. At the other extreme, test advocates sometimes seem too ready to rely uncritically on test scores in making important educational decisions. Our purpose in this policy statement is to show that tests are important and that there are responsible ways to use them.

Public scrutiny of testing is healthy and contributes to improved policies and practices. Amidst "the wonderful cacophony of a free people disagreeing,"1 however, we must not lose sight of a key fact: measuring student achievement is an essential element of effective school reform. As business leaders, we know that we can't improve what we don't measure. Tests are vital tools for managing and evaluating efforts to ensure that all children receive a high-quality education that prepares them for college, for the workplace, for participation in the nation's civic life, and for lifelong learning to keep up with the rapid pace of change in the 21st century. The debate over testing should not be about whether to rely on tests but about how best to improve and use them to enhance educational outcomes.

[Figure 1. The American Public Stands Firmly Behind Tests. Percent of people who feel there is not enough, too much, or just the right amount of standardized testing in their community, shown separately for elementary and middle school and for high school.* SOURCE: Belden Russonello & Stewart and Research/Strategy/Management (2000:34). *The remaining percentage of people either said they don't know or refused to respond to the question.]

In our 1994 policy statement Putting Learning First: Governing and Managing the Schools for High Achievement,2 CED argued that the primary mission of the public schools should be learning and achievement. We pointed out that there can be no lasting improvement in educational achievement until those who govern the system change the way schools are organized and managed to focus on this mission. Instead of emphasizing compliance with regulations, there should be greater reliance on creating effective incentives for principals, teachers, and students to raise academic achievement. Incentives (with consequences for both successful and unsuccessful performance) should be accompanied by greater flexibility for principals and teachers to choose how they will pursue improved learning. In short, education needs to be transformed into a performance-oriented enterprise, rather than one focused on inputs and rules.

Transforming schools to focus on performance requires the ability to measure what students know and how well schools are succeeding in meeting the goal of improving academic achievement. Schools should solidly ground all students in language and mathematical skills and provide them with a broad base of knowledge in subjects such as literature, science, foreign languages, history, social sciences, and the arts. Students should be able to use and apply this knowledge. Without standards that articulate expectations for students and measures of their performance on the standards, we have no way of gauging the success or failure of our educational system.

Reorienting schools toward performance is occurring through standards-based reform as well as through new governance arrangements like charter and contract schools that require performance data for purposes of public accountability and parental choice. These developments have caused measurement and test use to receive unprecedented attention. Many states and school districts have implemented or are working on new assessment and accountability systems relying on more challenging tests linked to academic standards.

There is still more to learn, though, about how to design and use assessment and test-based accountability systems most effectively to spur school improvement. Public discussion about tests and test use can be hampered by the fact that educational measurement is complex and sometimes highly technical. This policy statement aims to help the public participate in efforts to improve measurement by presenting key issues, identifying the beneficial purposes measurement can serve and the unintended consequences that can occur, and exploring how undesirable consequences might be avoided.

Our central recommendation is that tests should be used and improved now—rather than resisted until they are perfect—because they provide the best means of charting our progress toward the goal of improved academic achievement.*

*See memorandum by CAROL J. PARRY (page 43).

We recognize that tests are imperfect and incomplete measures of student learning. We also acknowledge that they do not measure all important aspects of schools' educational missions. In earlier CED reports,3 we have pointed out that both the regular and the "invisible" curriculum of schools should foster traits in addition to achievement that are needed for success in the adult world: good work habits, teamwork, perseverance, honesty, self-reliance, and consideration for others.

We also recognize that tests are not themselves a strategy for improving education. Instead, with thoughtful attention to their strengths and weaknesses, tests serve as vital tools for measuring the paramount outcome of education—academic achievement. They provide information that permits those responsible for governing and managing schools to reflect on the results of the daily interaction between teachers and students in classrooms—where the real work of learning occurs—and ask whether the results are satisfactory, what things need changing, and whether changes once made are having their intended effects.

Specifically, educational tests along with other measures of performance can contribute to raising educational achievement in three ways:


• They can help improve teaching and learning, both by guiding day-to-day instruction and by enabling researchers to evaluate the effects of different educational reforms, so that effective ones can be further implemented and ineffective ones discarded.

• They can provide a means of holding students and educators accountable for improving educational outcomes, relying on incentives rather than rules to spur achievement. They can contribute to equalizing educational opportunity, by focusing attention on children who have for too long been left behind by America's schools and by creating incentives for finding better ways to teach children for whom traditional approaches have failed. They can counter the tendency toward complacency about the performance of the nation's best students, by providing information on whether these students are standing still or improving and on how they measure up against the top students in other countries.

• They can provide a means for monitoring the progress of the educational system and for reporting to the public. They help parents and the public become better "consumers" of their educational systems by helping them determine how well students are learning and whether reforms are making a difference in improving the quality of America's schools.

Despite these benefits of educational measurement, new assessment and accountability systems have proven controversial. To some extent, this is inevitable. New assessment systems deliver painful information about performance to some teachers and students. Parents don't always welcome the news that their children's academic achievement is weak. Moreover, standardized testing, a key (but not the only) element of educational measurement, has long been a political lightning rod for a small but vocal anti-testing community. Its members argue that tests may be (and indeed in the past sometimes have been) used in ways having the effect of denying educational opportunities to some groups of children (minorities, students with disabilities, and children who are not native English speakers, for example).

While public support for tests remains strong, it could be undermined by unintended negative consequences of new assessment policies: for example, if test-based accountability systems lead to narrow test-based coaching rather than rich instruction, if they fail to create incentives for improving the performance of all students, or if they rely too heavily on tests to make important decisions about individual students and teachers. Parents and teachers could lose faith in education reform itself if new performance measures identify students who need additional help to succeed academically, but then policy makers fail to provide this assistance.

Measuring student achievement will, we believe, contribute importantly to improved learning if policy makers, educators, and the public keep the following in mind:

• Tests are a means, not an end, in school reform. Real educational improvement requires changing what goes on in classrooms. Policy makers must do more than just identify what students know and can do. They have to tackle the much tougher job of helping educators address inadequacies in student learning and overcome conditions that stand in the way of high academic achievement.

• Assessment and accountability systems are works in progress and must be continuously reviewed and improved. Educational standards on which assessment systems are based are not yet uniformly rigorous and substantive. There is more work to do in designing assessment instruments that can measure the rich array of knowledge and skills embedded in rigorous and substantive standards and that can accurately portray the performance of students with special educational needs, such as bilingual students and students with disabilities. As accountability requirements focus schools' attention on measurable student progress, they can have unintended consequences such as teachers emphasizing test preparation instead of the broader curriculum. We need to know more about how to design accountability systems and about the impact of specific design features. Standards, assessment, and accountability provisions need to be regularly reviewed, using independent evaluators to help identify problems and best practices and to monitor the intended and unintended consequences of policy changes.

• A performance-based educational system built on measuring student achievement can't be constructed on the cheap. Such a system requires good measurements and test administration procedures, information systems that make results available to educators in useful formats, training in how to use performance data to improve instructional practices, and assistance for students and schools whom tests show to be poor performers. Despite the attention to assessment and accountability in recent years, most states still have a long way to go in implementing standards-based testing systems in all grades and major subjects. Few have either the assessment or data systems that would allow individual students' performance to be tracked over time and their progress measured from year to year. Without greater investments in such systems policy makers will find that state testing systems are not as useful as they could be in helping teachers improve their instructional performance and in holding educators accountable for the performance of students in their charge.*

*See memorandum by CAROL J. PARRY (page 43).


Chapter 2

MEASURING STUDENT ACHIEVEMENT

Tests are not the only way of obtaining information on student and school performance. In the past, student performance was often judged mainly by indirect measures that served (or were thought to serve) as general indicators of academic accomplishment: high school graduation, for example, or enrollment in advanced courses and in college. Similarly, the performance of schools or districts was frequently indicated by indirect measures such as graduation and dropout rates and teacher and student attendance rates.

Transforming education to a performance-oriented system emphasizing academic achievement, however, has vastly increased the interest and attention being given to direct measures of student learning: i.e., tests. Tests, and the accountability systems based on them, are not new and indeed have figured prominently in earlier reform efforts. Never before, however, has the interest in them been so pervasive.4

Much of the current attention to testing derives from the standards-based systemic reform movement. Spurred by the first-ever Education Summit between the President and the nation's governors in 1989 which set results-oriented national education goals,5 standards-based reform has dominated education policy making at both the state and federal level for roughly a decade. The strategy underlying this approach to school improvement begins with defining standards for student performance that create high expectations for all students. Linking assessments to these standards, it is argued, will then provide a means for parents, students, educators, and the public to monitor performance against the standards. Holding schools accountable for meeting the standards should create incentives for schools to make instructional changes to boost student performance, while giving educators flexibility to decide for themselves what instructional and structural changes are needed (rather than imposing one-size-fits-all solutions from above).

As background for considering how tests can and should be used to foster educational improvement (as we do in Chapters 3, 4, and 5), it is helpful to identify basic issues in designing measures of student achievement and to outline challenges posed by the new emphasis being given to tests as a reform tool.

ISSUES IN DESIGNING DIRECT MEASURES OF STUDENT LEARNING

Setting standards for learning

Measuring student learning requires defining what students need to know and be able to do. The 1990s saw the first widespread efforts to develop content standards for academic achievement. CED's earlier hope6 that rigorous, substantive national standards and related assessments would be the foundation for state and local curriculum and instructional reform has not been realized. The tradition of local control is still strong in education. We underestimated the political potency of local control and the importance of involving many constituencies in a consultative approach to standard-setting. Thus, the standards that are driving reform are the product of state-by-state standard-setting activities involving subject matter experts, educators, and representatives of the public.

[Figure 2. Standards and Assessments at the State Level. Number of states (scale 0 to 50) in 1997–1998, 1999–2000, and 2000–2001 that had adopted standards in at least one subject; adopted standards in English, math, social studies, and science; had assessments aligned with standards in at least one subject; and had assessments aligned with standards in English, math, social studies, and science. SOURCE: Education Week (2000:64, 2001:94-95).]

The good news is that all states but one (Iowa) have or are developing content standards in some or all of four core academic areas (English, mathematics, science, and social studies or history). (See Figure 2.) An early tendency in many states to create "a dizzying array of fuzzy, nonacademic goals that are overly subjective and highly controversial"7 has given way to more focused attention on academic achievement.

This does not mean that standard-setting has become noncontroversial or the results uniform. Education is inherently controversial, because it is "as much concerned with central public values as it is with schools per se, and central values that Americans hold dear may conflict."8 Differences in values are often reflected in raucous debates over standards that can roil a state politically: witness the recent decision by the Kansas State Board of Education to omit mention of evolution and the "big bang" theory of the origin of the universe from the state's science standards and assessments, leaving local schools free to teach these topics if they wish. (That decision became a major issue in the 2000 election of state board members. As this statement goes to press, the newly-elected board appears ready to reverse the prior board's action.)

The results of state standard-setting processes vary widely. "Some standards are highly specific, spelling out in detail the content knowledge students should demonstrate, whereas others are more general—or vague, as critics contend. The degree to which standards are 'challenging' also varies, with some states demanding much more of their students than others."9

Standards, therefore, are not yet sufficiently rigorous and substantive. The National Research Council (NRC)10 has also pointed out that standards differ in the degree to which they guide policy and practice, some being so general that they make it difficult to make valid inferences about student performance or to provide clear guides for instruction. Some standards are so extensive that teachers cannot address them comprehensively, and test designers cannot include them all on assessments. Education researcher Paul Hill argues that many states initially used a "logrolling" process for setting standards, resulting in requirements that reflected the aspirations of advocates for particular academic specialties. Now many are seriously grappling with the question of "what is an externally valid standard (i.e., one that is closely linked to an important life outcome for students)."11

Setting standards on a state-by-state basis is an incremental and slow process. Despite the slow pace, we are optimistic that if policy makers and the public stay the course, rigorous and clear standards can be achieved. To insure that standards can effectively undergird measurement systems and instructional improvements, we recommend that states (and districts where applicable) periodically review their standards for clarity, focus, rigor, and validity, using independent outside reviewers to help benchmark against other states and against models of good practice. The results of outside reviews should be readily available to the public. In this way, the process of setting standards will not only improve assessment and instruction but also provide a continuing framework within which citizens can discuss and, if necessary, revise their goals for public schooling.

Designing assessments

Assessments that are intended to measure academic achievement should meet three key criteria: they should be valid, reliable, and fair measures of the student learning they seek to describe. (See box: Criteria for Evaluating Tests: Validity, Reliability, Equity.) Tests that are not valid, reliable, and fair will obviously be inaccurate indicators of the academic achievement of students and can lead to wrong decisions being made about students and schools.

Public discussion about the quality of tests is often hampered by a semantic confusion. Test critics often decry "standardized tests," alleging that such tests are invalid and unfair—that they are incapable of capturing critical differences in student achievement, reflect rote learning rather than measuring higher-order thinking skills, and lend themselves to being "gamed."


CRITERIA FOR EVALUATING TESTS: VALIDITY, RELIABILITY, EQUITY

Tests should be judged by three key criteria:

• Validity: the degree to which a test measures what it was intended to measure.

• Reliability: the consistency and dependability of assessment results.

• Equity: fairness, lack of bias toward any particular group of examinees.

SOURCE: Klein and Hamilton (1999:4).

What these critics are actually critiquing are multiple-choice or "fill in the bubble" tests, which they appear to equate with "standardized tests." In fact, large-scale standardized tests are tests that are administered to students from many schools under uniform conditions, as opposed to "teacher-made tests" administered in individual classrooms. "[E]ven a written examination, one that is scored by teachers or other human judges and not by machine, is considered standardized if all students respond to the same (or nearly the same) questions and take the examination under similar conditions."12 Thus, any test intended to measure academic achievement across multiple classrooms and schools will be "standardized" if the results are intended to be comparable.

The often unspoken assumption of standardized-test critics—that teacher-made tests are immune from the weaknesses attributed to standardized and multiple-choice tests—is not borne out by research.13 Moreover, for large-scale testing like that required by standards-based reform, the multiple-choice format has some advantages. Multiple-choice tests can be scored by machine more quickly and at lower cost than other forms of assessment (essays, portfolios) which must be graded by hand. They also represent the form of assessment with which experts in assessment and test development have the most experience and the greatest confidence in their ability to make tests that are valid, reliable, and fair.


Reliable scores for individual students can be achieved in shorter amounts of testing time when multiple-choice tests rather than other test formats are used.
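As a simple illustration of why reliability rises with the number of items that fit into the available testing time, the sketch below uses purely hypothetical item scores: it estimates reliability with the standard Cronbach's alpha coefficient and then applies the Spearman-Brown formula to project the reliability of a test three times as long. The data and the specific numbers are illustrative only.

```python
# Illustration only: made-up item scores (1 = correct, 0 = incorrect) for five students.

def cronbach_alpha(scores):
    """Estimate internal-consistency reliability from a students-by-items score matrix."""
    n_items = len(scores[0])

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

    item_variances = [variance([row[i] for row in scores]) for i in range(n_items)]
    total_variance = variance([sum(row) for row in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)

def spearman_brown(reliability, length_factor):
    """Projected reliability if the test is lengthened by the given factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

scores = [
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 0, 1],
]

alpha = cronbach_alpha(scores)
print(f"estimated reliability with 5 items:  {alpha:.2f}")                     # about 0.60 here
print(f"projected reliability with 15 items: {spearman_brown(alpha, 3):.2f}")  # about 0.82
```

Because each multiple-choice item takes little time to answer and score, a fixed testing period can include more such items, which is the practical source of the reliability advantage described above.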

Nevertheless, standards-based reform, by requiring measures that can capture the rich array of knowledge and skills embedded in new content standards and that can be used to hold students and educators accountable, has encouraged wider experimentation with new forms of standardized testing. States and districts are increasingly using new test formats, some considered more "authentic" or better aligned with ambitious new goals for education than fill-in-the-blank or multiple-choice questions. (See Figure 3 and box: There is More to Tests Than Multiple-Choice Questions.)

Issues of reliability, validity, fairness, and cost-efficiency still pertain, however; and tradeoffs among these criteria are sometimes necessary.

[Figure 3. Types of Assessment Instruments Used by States, 2000–2001. Number of states (scale 0 to 50) using each format: multiple choice, short answer, extended response (English), extended response (other subjects), and portfolio. SOURCE: Education Week (2001:94).]

THERE IS MORE TO TESTS THAN MULTIPLE-CHOICE QUESTIONS

Though people tend to equate standardized tests with multiple-choice questions, tests increasingly use other formats as well:

• Constructed response: students write their own answers, rather than choose among a handful of pre-selected responses. Constructed response, or open-ended, test items might include essays or mathematical problems in which the student is required to show the steps used in reaching a solution.

• Hands-on performance tasks: students are given practical tasks, involving instruments and equipment, and are assessed on their content and procedural knowledge as well as their ability to use that knowledge in reasoning and problem solving. They may be evaluated individually or as part of teams.

• Portfolios: students are assessed on the basis of a representative sample of the work they produce during the school year.

In some cases, tradeoffs can be addressed through policy. For example, if policy makers are willing to trade off cost efficiency in favor of comparatively expensive assessment methods like essay exams or open-ended response items (to allow for richer and more extensive content coverage) and to permit sufficient testing time to ensure reliability, test developers can produce tests including such items.

Sometimes, though, tradeoffs are unavoidable because of the technical limitations of test design. Kentucky and Vermont were leaders in exploring the use of portfolios of student work in an effort to align assessment with instruction and to evaluate students in a way that reflected the breadth of their academic work. Portfolios proved useful to teachers and schools but not suited to use in accountability systems.


Teachers reported that learning to score portfolios had a strong positive influence on instruction.14 Reliability was significantly lower than with traditional multiple-choice tests, however. Kentucky reintroduced multiple-choice items to its state assessment after outside experts found that the items were needed to ensure content coverage, score reliability, and stability of proficiency standards over time.15 Vermont, too, uses multiple-choice, short-answer, and extended response items as well as portfolios in its state test system.

An NRC committee reflected the view of many test experts when it found that "policy and public expectations of testing generally exceed the technical capacity of the tests themselves. One of the most common reasons for this gap is that policy makers, under constituent pressure to improve schools, often decide to use existing tests for purposes for which they were neither intended nor adequately validated."16 We recognize that there are other signs of strain in the assessment system as well, such as the failure of several testing contractors to provide timely and accurate score reports to states and districts.

These growing pains seem to us inevitable and remediable aspects of transforming a huge, complex enterprise into a performance-oriented endeavor. Like standard-setting itself, building an assessment system capable of meeting the new demands being placed on it will take time, sustained commitment, and money. Nevertheless, creating an assessment system capable of supporting schools that focus on outcomes and performance is an urgent priority. We urge business leaders, parents, and the public to provide vigorous support for efforts to build assessment systems that support performance-oriented education. Our recommendations about standard setting apply to assessment as well. States and districts should periodically review their assessments for reliability, validity, and fairness. They should call on independent outside reviewers to help identify and use the best available testing technologies and should keep the public informed of the results of these evaluations. Outside review should also help monitor the intended and unintended consequences of assessment systems, a point we will pursue further in our discussion of accountability.

Reporting scores

Test scores are the most visible aspect of measurement. They can convey information about how students and schools measure up against standards, about whether they are making progress toward meeting standards, and about how they compare to other students and schools. Too often, in our view, the press and the public overemphasize the comparisons. A complex story gets reduced to rankings and a preoccupation with whose scores are highest or lowest, when what matters more for school improvement efforts is what knowledge and skills students have and whether their achievement is improving over time. Transforming education to an outcome- and performance-based system demands that we look beyond the "horse race" and focus on the information measurement provides that can be used to improve teaching and learning.

Extracting this information from test scores requires understanding how assessment results are and should be reported.

Traditionally, the results of large-scale testing programs have been reported as norms: how well students perform relative to the performance of a national sample of students who took the same test. They result in familiar percentile or grade-referenced scores. A student may be described as performing at the 75th percentile (that is, better than 75 percent of the national sample). Some percentage of students may be described as scoring at or above a specified grade level, which reflects not what students in that grade are expected to know but how these students performed relative to the national sample enrolled in the same grade. Large-scale testing was born from a need to make selection decisions (to make college selection decisions, for example, and to identify students for gifted programs). Norm-referenced scores were designed to assist in selection decisions by spreading out applicants along a score distribution.


Norm-referenced scores are less good at answering today's most pressing question: what do students know? An NRC report cites "a commonly used analogy: norm-referenced scores tell you who is farther up the mountain; they do not tell you how far anyone has climbed."17 To extend the analogy, they also provide no information about what the climber needed to accomplish to reach the level reported (i.e., was the mountain high or low or particularly steep?). Today in education we want to know more than how students and schools compare. We also want to know what students know about things we believe are important. That is, we need criterion-referenced scores, where the criteria are educational standards.

The distinction between norm-referenced scores (which emphasize comparisons) and criterion-referenced scores (which emphasize performance against standards) is important to keep in mind as we pursue our discussion about using measurement for various purposes. Scores appropriate for one purpose (e.g., norm-referenced scores for informing parents about how their children are performing relative to others) may be quite inappropriate for another (e.g., for holding schools and teachers accountable for improving the performance of their students relative to state standards).
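A simple illustration of the distinction, using purely hypothetical scores and a hypothetical proficiency cut score: the same raw score can place a student above the middle of the comparison group in a norm-referenced report yet below the standard in a criterion-referenced one, because the two reports answer different questions.

```python
# Illustration only: both the comparison sample and the cut score are invented.
from bisect import bisect_left

norm_sample = sorted([12, 15, 18, 21, 22, 25, 27, 29, 31, 34])  # national comparison group
proficiency_cut = 28                                             # cut score set by the standard

def percentile_rank(score, sample):
    """Percent of the comparison sample scoring below this score (norm-referenced view)."""
    return 100 * bisect_left(sample, score) / len(sample)

def criterion_report(score, cut):
    """Performance against a fixed standard, independent of how others scored."""
    return "meets the standard" if score >= cut else "does not yet meet the standard"

student_score = 27
print(f"norm-referenced report:      {percentile_rank(student_score, norm_sample):.0f}th percentile")
print(f"criterion-referenced report: {criterion_report(student_score, proficiency_cut)}")
```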

Reporting test scores for groups of students and making inferences from them, especially for purposes of accountability, raise additional considerations. Scores that show average achievement of test takers at one specific time are less useful for gauging how well particular schools are performing than are scores that track changes over time for the same students (or students in the same grade cohort). Comparing scores or score changes across schools and districts without considering differences in the backgrounds of test takers can lead to inaccurate judgments about which schools and districts are really successful at raising student performance, since student backgrounds are known to influence academic achievement. We address these issues in more detail in Chapter 4.
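A simple illustration with invented numbers of the first of these points (it leaves aside the separate problem of differences in student backgrounds): a school's cross-sectional grade average can fall from one year to the next even while the cohort of students it actually taught made substantial gains, which is why scores that follow the same students over time are more informative for judging schools.

```python
# Illustration only: invented scale scores for a small school.
scores_1999 = {"grade 4": [210, 220, 230], "grade 5": [240, 250, 260]}
scores_2000 = {"grade 4": [200, 205, 215], "grade 5": [252, 262, 272]}

def mean(values):
    return sum(values) / len(values)

# Cross-sectional comparison: this year's grade 4 average is lower than last
# year's, suggesting decline, but it compares two different groups of children.
print("grade 4 average, 1999:", mean(scores_1999["grade 4"]))
print("grade 4 average, 2000:", mean(scores_2000["grade 4"]))

# Cohort comparison: the students who were in grade 4 in 1999 are in grade 5
# in 2000; their average gain is a more direct measure of the school's progress.
cohort_gain = mean(scores_2000["grade 5"]) - mean(scores_1999["grade 4"])
print("average gain for the 1999 grade 4 cohort:", cohort_gain)
```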

CHALLENGES TO MEASUREMENT PRACTICES

The emphasis currently being given to measurement as a tool of educational reform heightens the importance of using tests appropriately, including all students in assessment programs, and acknowledging that new demands on testing cannot be met on the cheap.

Appropriate test use

Individual tests are not one-size-fits-all. Professional guidelines about the circumstances in which use of a particular test is appropriate have existed since the mid 1950s and have been revised and expanded several times, most recently in 1999.18 Test experts argue, however, that the standards are insufficiently understood and adhered to by policy makers and test users, especially as they build new test-based accountability systems.19 We recommend that policy makers and test users respect professionally-developed testing standards when designing and implementing assessment and accountability systems. We will touch on a number of these standards here and in the Chapter 4 discussion of test use for accountability purposes.

A key concern of test experts is that policy makers and the public frequently do not understand that the validity and reliability of a test is determined by the use to which it will be put. Tests that are valid and reliable for one purpose (to influence classroom practice, for example) may not be so for another (to hold schools accountable for performance). The temptation is nevertheless strong to use tests for multiple purposes to save money and time. We recommend, in accordance with professional standards, that tests be used only for purposes for which they have been validated—that is, where the strengths and limitations of the testing program and the test itself have been evaluated for its intended use.

To be powerful instruments of reform, most tests should be tied to ("aligned with") the specific expectations that states and districts have for their students.


Alignment refers to how well an assessment measures the contents and skills laid out in a standard. "Alignment ensures that the tests match the learning goals embodied in the standards. At the same time, aligned assessments enable the public to determine student progress toward the standards."20 Judgments about the extent to which tests and standards are aligned go deeper than "yes" or "no" evaluations and examine both the content that test items are measuring and the depth of performance that a student is supposed to demonstrate.

Alignment is easier to achieve when states deliberately design tests to measure their standards, though this does not ensure alignment. Tests still may not measure all of the state's standards, particularly in states where the list of standards is extensive. The ability of tests to measure standards may be further limited by constraints imposed by policy makers, such as testing time and costs.

The barriers to alignment are more serious when states use so-called "off-the-shelf" commercial tests rather than developing their own. Such tests are designed to be used in many states. Given the variety of standards across the states, they are unlikely to line up with the standards in any one state. In a country that has no nationally-agreed upon benchmarks for learning, there is still room for norm-referenced tests that gauge comparative standing and for off-the-shelf and other tests that give general information about subject-matter knowledge.

Nevertheless, the purposes we identified for measurement in the first section of this report are best served by tests that are aligned to standards. (See Figure 4.) Otherwise, teachers get no guidance from test scores about how well their students have mastered the standards, so they do not know whether their instructional programs are working.

[Figure 4. Number of States Using Criterion-Referenced Assessments Aligned to State Standards, 2000–2001. Number of states (scale 0 to 50) with aligned criterion-referenced assessments in English, math, history, and science, reported separately for elementary school, middle school, and high school. SOURCE: Education Week (2001:95).]


Accountability systems that do not make clear what teachers are expected to teach and students expected to learn are unfair and may well be found illegal by the courts.

Using tests for "high stakes" purposes such as denying students promotion or graduation and assigning rewards and sanctions to schools raises the stakes for the tests themselves. When adverse consequences for individuals may result, tests become subject to the possibility of legal challenge on the grounds that the way they are used is discriminatory or otherwise inappropriate. (See Chapter 4.) Some civil rights groups have long opposed the use of tests, at least when tests are used by themselves to make important decisions about students or when students do not have equal access to high-quality instruction.21 Their wariness is understandable; history has given them plenty of cause for concern. Early in the 20th century, "[i]n their worst manifestations, the uses [of tests] were racist and xenophobic": prominent scientists used IQ test results to argue that blacks and immigrants from Southern and Eastern Europe were mentally inferior.22 Southern schools resisting desegregation used tests to resegregate black students into lower tracks. In 1994, publication of The Bell Curve23 sparked a "highly-charged, racialist debate" over the authors' claim that social and economic inequality among racial and ethnic groups can be explained by differences in intelligence as measured by tests. Despite detailed critiques of the book's methods and analysis, it provided a rationale for those desiring to limit education and social welfare policies aimed at reducing inequalities.24

If attention is given to using tests that are valid, reliable, and fair, however, we believe that measures of academic performance can be powerful instruments of educational opportunity because of the spotlight they shine on students whose needs have too often been neglected in the past. As the Citizens' Commission on Civil Rights has noted, "without an accurate means of measuring what students know and can do, responsible school authorities have no way of gauging whether students are reaching high standards. And without such an accurate gauge, schools and school districts cannot be held accountable for results. Accurate assessment tools, then, are the glue that holds the reform effort together."25

Testing all students

Test programs must include all students if they are to contribute to improving education for all. In the past, many testing programs exempted students with disabilities and students who are not yet fluent in English from testing or did not include the scores of such students in public reports on district and school performance. Testing these special needs students and accurately interpreting their scores raises a variety of technical challenges. Nevertheless, special needs students should be included in testing programs to the maximum extent possible, and district and school performance reports should always include information on any exceptions to this policy.

Special needs students represent a significant proportion of the school population. In 1996-97, almost 13 percent of elementary and secondary school students were served by federal programs for children with disabilities.26 Seven percent participated in bilingual education or English as a second language programs.27 Excluding these students from testing sends signals that such students don't matter or can't be expected to meet high standards and that schools aren't responsible for their academic progress. Since these students are not distributed uniformly across schools and districts, the validity of test score comparisons is compromised when nonstandard policies are used to exempt students from regular assessment programs and when schools and districts fail to provide data about the proportion of special needs students included in their scores.

Federal law now requires that all students be included in assessment and accountability mechanisms and that appropriate accommodations and modifications be made if necessary.28 For example, a braille version of a test might be provided for a blind student or the test read aloud to him or her, or a student with a reading disability might be given extra time to complete a test. A student not yet proficient in English might be given tests (in subjects other than English itself) in his or her native language.

States face major challenges in implementing the law. There is little research on the effects of testing accommodations on the validity and reliability of test score information for either students with disabilities or English-language learners. It is not clear whether different versions of tests in different languages in fact measure the same thing.29 Indeed, states report that the assessment of special populations is among the greatest challenges they face in developing assessment systems.30

Including all students in assessments is an important goal, but one which makes unprecedented demands on testing expertise. It is essential that users of test scores understand the complexity of the inclusion issue and utilize information about who is and isn't yet covered by testing programs in their interpretation of performance data. States and districts should develop clear guidelines for accommodations that permit special needs students to participate in testing. They should also describe the methods they use to screen students for accommodations and should report how often students receive accommodations or are still excluded altogether. Meanwhile, government and foundation funding agencies should support research aimed at addressing gaps in our knowledge about the validity and reliability of different accommodations and alternate assessment practices that can accurately gauge the academic achievement of those special needs students for whom standard testing approaches are unsuited.

Cost implications

Shifting to a performance-based education system and developing the measurement tools to assess performance requires new investment; it cannot be constructed on the cheap. We have already described two major new demands on testing systems: (1) creating assessment instruments that are aligned with standards and that can measure a range of skills and (2) including all children in assessment programs. Measuring academic achievement and using the results to improve performance demands other new investments as well: for example, enhanced test security measures for accountability tests; more frequent testing so that improvement and not just current status can be assessed; linked data systems to support public reporting; and (most critical of all) appropriate remediation for low-performing students and assistance for schools and teachers to improve their capacity to assist all students in meeting high academic standards.

There is already evidence that the inability or unwillingness to make the necessary investments affects how educational performance is measured and the results used. Few states have implemented standards-based testing systems that cover all grades and major subjects; the norm is to use standards-based tests at selected grades in selected subjects. Few have as yet developed student information systems that enable comparison with matched sets of schools or student-level data bases that allow students' performance to be tracked over time so performance changes can be identified. Many report that they are unable to provide the follow-up assistance to schools that test results suggest is needed. (See box: States Lack the Capacity to Serve All Schools in Need of Improvement.) The Maryland State Board of Education recently delayed the date at which its high school graduation tests will be used to deny diplomas after the governor and state legislature failed to supply full funding for an academic support program the board believed was needed to help students who are behind their peers prior to entering high school.31

We urge supporters of a truly performance-based educational enterprise to acknowledge the costs involved and to advocate the necessary funds to build and implement measurement systems that support and hold schools accountable for instructional improvement. Reallocation of dollars from lesser-priority activities and more efficient uses of resources may address part of the need but are highly unlikely to supply all of the investment required.


STATES LACK THE CAPACITY TO SERVE ALL SCHOOLS IN NEED OF IMPROVEMENT

A 1998 Department of Education survey examined states' capacity to follow up on information about low school performance:

• Only 9 states reported that they could provide support to at least half the schools in need of improvement in their states.

• 12 states reported that they served less than half of the schools in need of improvement.

• 24 states said they had more schools in need of support services than they had the resources to provide.

SOURCE: U.S. Department of Education (1999b:64).


LIMITATIONS OF TESTS AS MEASURES OF STUDENT LEARNING

Although Americans strongly support testing, they do have some concerns about relying on statewide tests in public schools. Two major concerns relate to the limits of tests in accurately measuring learning: that some children perform poorly on tests and that statewide tests cannot measure many important skills that children should learn.32 (See Figure 5.) Of course, the public is right to recognize these limitations. Tests are not perfect measures of individual students' knowledge, nor do they reflect the full range of a rich educational program. The issue for policy is not whether tests are perfect but whether they help in efforts to improve the performance of students and schools and whether they contribute positively to decisions (like promotion and graduation) that are going to be made whether tests exist or not. We strongly believe that they do.

Figure 5
Public Concerns About the Limits of Tests. Respondents indicated how strongly they agree or disagree (strongly agree, somewhat agree, somewhat disagree, strongly disagree) with two statements: "Some children perform poorly on tests even though they know the material" and "Statewide tests cannot measure many important skills children should learn."
NOTE: Percentages do not add to 100 due to rounding.
SOURCE: Belden Russonello & Stewart and Research/Strategy/Management (2000:44).


Chapter 3

USING ACHIEVEMENT MEASURES TO IMPROVE TEACHING AND LEARNING

All aspects of a performance-based educational system, including the measurement system itself, should ultimately be judged by their contribution to teaching and learning.

The growing use of tests for accountability purposes has led to charges that new assessment systems hurt educational quality. Critics argue that overemphasis on tests results in curricular "compression" (slighting subjects that are not tested) and in "teaching to the test" (for example, spending class time on worksheets and test-prep material that imitate test questions rather than on richer forms of instruction). Such unintended consequences do occur, and when they occur they are indeed worrisome. We do not think that curricular compression and teaching to the test are inevitable results of testing, and we believe that with thoughtful attention they can be avoided.

Moreover, the evidence is mounting that measurement systems designed to enhance instruction, when accompanied by efforts to give teachers the capacity to use the results, can be powerful levers for instructional improvement. Assessment is assisting educators in identifying and helping students who in earlier periods would have been left behind by schools that had no clear idea about how to measure progress toward their objectives, if indeed they had any. However, assessments should be accompanied by greater efforts to develop the teacher capacity necessary to make full use of new information about student learning and translate it into improved instruction.

MEASUREMENT THAT ENHANCES INSTRUCTION

Data can be powerful catalysts for instructional change. Creative superintendents and principals use the information provided by assessments to spur teachers to identify the strengths and weaknesses of individual students and make plans for addressing problems. Teachers are stimulated to work together on new curricular models that align lesson plans to content standards and that encourage curricula to flow from subject to subject and from grade to grade within schools.33 Good teachers use test results to help them learn how better to teach the curriculum, not to teach to the test itself.

State assessments are being used by many schools to motivate instructional improvement, but these tests are not necessarily the preferred form of measurement for shaping the everyday interactions between teachers and students. Teacher-made tests, diagnostic tests, and other forms of local assessment are better suited to provide some kinds of information crucial for enhancing teaching. Tests used for accountability and monitoring purposes are designed to provide a range of information about students' mastery of a given subject at a given grade level. They are usually administered once a year. They provide too little information too infrequently to allow teachers to adjust their instruction to reflect changing student needs during the course of the academic year.


Local assessments can compensate for these shortcomings. Because they can be given often and aren't used for accountability or to reflect content coverage comprehensively, they can go into specific topics in depth and cover subjects that may not be on state assessments. They can more easily make use of test formats (essays, open-ended questions, hands-on activities, portfolios) that give students a variety of ways to demonstrate their knowledge and that are better suited than multiple-choice items for classroom instruction.

Districts that understand the potential benefits of measurement can creatively blend local and state assessments to improve instruction. A National Research Council (NRC) report cites two examples of districts pursuing standards-based reform that have fashioned "a mosaic of assessment information that includes frequent assessment of individual student progress at the classroom level; portfolios and grade conferences on student work at the school level; performance assessments at the district level; and standards-referenced tests at the state level. All of these are compiled into reports that show important constituencies what they need to know about student performance."34 (See box: How Two Districts Blend Local and State Assessments to Improve Instruction.)

HOW TWO DISTRICTS BLEND LOCAL AND STATE ASSESSMENTS TO IMPROVE INSTRUCTION

Community District 2 in New York City began its reform effort by changing the curriculum, rather than the assessments. The district administers a citywide mathematics and reading test, and a state test as well. Each year, the district reviews the results, school by school, with principals and the board, setting specific goals for raising performance, especially among the lowest-performing students. In addition, schools also administer additional assessments that they found are aligned with the curriculum. In that way, the intensive staff development around curriculum, which the district has made its hallmark, and the professional development the district provided on the assessment, produce the same result: teachers with significantly enhanced knowledge and skills about how to teach students toward challenging standards.

Schools in Boston also use a multifaceted approach to assessment. The state of Massachusetts has developed its own test, and the district uses a commercial test. In addition, schools have developed parallel assessments. One elementary school, for example, begins each September by assessing every student, from early childhood to grade 5, using a variety of methods: observation for young children (through grade 2), running records, writing samples. They repeat the running records and writing samples every four to six weeks. They examine the data in January and again in June to determine the children's progress. In that way, every teacher can tell you how her students are doing at any point. Teachers can adjust their instructional practices accordingly, and principals have a clear picture of how each classroom is performing. The district and state tests, meanwhile, provide an estimate of each school's performance for policy makers.

SOURCE: National Research Council (1999d:49-50).

States, districts, and schools can work together to address concerns over inappropriate teaching to the test, cheating, and other test misuses that can distort instruction. Encouraging careful monitoring of test use by independent organizations (see Chapter 5) is one approach. Another is continued improvement of tests themselves, so that they become something worth teaching to. Meanwhile, researcher Richard Phelps argues that the state testing directors, who are technically proficient and independent of test publishers, can "deploy a number of relatively simple solutions" to combat test misuse: "not revealing the contents of tests beforehand; not using the same test twice; requiring that non-tested subjects also get taught (or testing them, too); and maintaining strict precautions against cheating during test administrations."35 Keeping test content secure and not using the same test twice make it futile to try to teach to specific test items. However, these steps drive up the cost of testing, requiring that publishers prepare more forms of each test, that districts purchase a new form for each test administration, and that policies be implemented to discourage cheating.36

We are not naïve in thinking either that such good test practices are followed everywhere or that some teachers and schools will not respond to the "horse race" mentality about test scores found in the media and among the public by focusing instruction in some instances too narrowly on tests. We (business leaders, our colleagues, and our fellow citizens) contribute to test misuse when we focus our own attention too narrowly or uncritically on test scores as the sole indicator of school and student performance. We can be part of the solution by being informed consumers of test results. We should pay attention to how tests are linked to standards and instruction, interpret test scores carefully, focus on the progress of schools and students rather than on winners and losers in the "horse race" game, and support building the capacity of school-level educators to use test data well for instructional improvement.

TEACHER CAPACITY

Assessments will only contribute to improved teaching and learning if educators know how to interpret and use test results and how to change their instructional practices to make them more effective. There is a growing realization that along with the standards, assessment, and accountability pieces of standards-based reform, more attention needs to be given to fostering teachers' capacity to change what they do in the classroom.37

Standards-based reform makes unprecedented demands on teachers. It calls on them to learn about rigorous new standards that require them to teach very different material, to understand and often participate in the development of new curricula to implement the standards, and to change the way they interact with students. New findings from research in the science of learning38 underscore the need for teachers to develop teaching practices that are adapted to the diverse learning needs of their increasingly heterogeneous students: practices that reflect the prior knowledge, as well as the skills, attitudes, and beliefs that individual learners bring to school and that are sensitive to the cultural and language practices of students and the effects of those practices on classroom learning. Finally, teachers are being given unprecedented amounts of data about the achievement of their students from district and state tests and are being asked to use information on student outcomes to improve their teaching.39

All of these new expectations assume that teachers know how to do these things. Standards-based reform further assumes that teachers would institute effective practices if they had the freedom and motivation to do so. An NRC study committee concluded, however, that these assumptions might be "overoptimistic" and that "knowledge about how to implement effective instructional strategies to help all students learn to challenging standards is also largely unknown."40 Teachers feel unprepared (1) to implement new curriculum and performance standards and teaching methods, (2) to address the special needs of students with disabilities or limited English proficiency and those from diverse cultural backgrounds, and (3) to integrate educational technology into their instruction.41

Part of the solution is better preparation of future teachers, especially more emphasis on subject-matter knowledge. Teachers already in schools, though, will have to be helped through professional development. We recommend that policy makers and educators recognize, as businesses do, that staff development and workforce retooling are important investments in the future. As we pointed out in our 1994 report, in education these activities have typically been less well funded than in the private sector and have been among the first to be cut when resources are lean.42

Investments in professional development, though, must result in change in the classroom. As much as possible, professional development activities should be designed to meet the needs of the personnel in individual schools. Too often now, professional development varies widely in quality and consists of short district-sponsored workshops, which are not linked to specific teacher needs or to school improvement plans.

Teachers also need more help to understand and use measurement as a tool of instructional improvement. The NRC found that teacher preparation programs provide little emphasis on measurement and that teachers feel inadequately prepared in assessment.43 Thus teachers may not know how to take full advantage of the data on student outcomes being provided by state and local assessment systems and by their own classroom tests. Assisted by business leaders, several states have developed programs that help teachers and principals learn how to use data to improve instruction. (See box: Helping Teachers Get and Use Performance Data.)

If capacity issues are ignored, test experts warn that a predictable pattern will develop: test scores, which generally start off at a low level, will "rise quickly for a couple of years, level off for a few more, and then gradually drop over time."44 In addition to helping current teachers improve their instructional practice, the necessary changes may involve other, costly reforms such as class size reductions, after-school and summer school programs for students having academic difficulties, universal pre-school for those who want it, and redesigned teacher training programs as well as policies that address health and other problems that interfere with some children's ability to learn. Assessments are no magic solution to the challenge of real educational improvement. They serve as an essential thermometer but cannot by themselves make teaching and learning better. Policy makers and educators must attend to the whole range of factors that affect teachers' ability to provide effective instruction and students' ability to achieve to high standards.

HELPING TEACHERS GET AND USE PERFORMANCE DATA

In Texas, Just for the Kids (JFTK) was founded in 1995 to provide community support for school improvement efforts. JFTK uses data from the state education agency to calculate (for each elementary school in the state) an "opportunity gap": a measure of how the school compares to the most successful schools serving equally or more disadvantaged student populations. JFTK offers regional training sessions where principals and community leaders are trained on how to present data and lead public discussions and provides training sessions for educators on how to use the data. Since October 1998, JFTK has trained more than 1000 campus leadership teams representing over 250 school districts throughout Texas in how to use the JFTK data to increase student achievement on their campus. JFTK plans to extend its data analysis to middle and high schools. Internet site: www.just4kids.org.

In Maryland, the Maryland Business Roundtable for Education, in collaboration with the Maryland State Department of Education, created a "decision support system" web site, which allows schools (and others) to access and analyze state test data, compare performance with other schools, and identify effective practices for improving student learning. The web site links state standards, data analysis, school improvement planning processes, and resources for instructional enhancement. Internet site: www.mdk12.org.

The Illinois Business Roundtable has collaborated with the Illinois State Board of Education and the North Central Regional Educational Laboratory to launch a similar effort in July 2000. Internet site: www.ilsi.isbe.net.

Wisconsin unveiled the Wisconsin Information Network for Successful Schools in September 2000. Internet site: www.dpi.state.wi.us.


Chapter 4

USING ACHIEVEMENT MEASURES TO HOLD STUDENTS AND EDUCATORS ACCOUNTABLE

A key link in the chain of standards-based reform is accountability, which provides a badly-needed incentive for students and educators to meet the standards. Without meaningful consequences for performance, the instructional utility of tests is unlikely to be realized. Accountability also provides the mechanism for freeing schools from requirements to follow rules and procedures and instead making them responsible for results in the form of improved student learning.

Students and educators can be held accountable for their performance in various ways. Publishing school-level test results and providing data on individual students to parents use the incentive of information (and perhaps public recognition for high achievement or public embarrassment over poor performance) to spur improvement. Higher stakes accompany accountability when tests—or tests and some combination of other measures—have more direct consequences; when, for example, they are used to deny low-performing students promotion to the next grade or a high school diploma or when schools and their staffs are given financial rewards for high or improving student achievement or threatened with state takeovers or "reconstitution" if student achievement is unacceptably low.

While it can reasonably be argued that "accountability for results is more talk than action" at this point, especially for the adults in the education system, there is no question but that accountability is "in vogue."45 More and more students now face or soon will face new requirements that they pass tests to be promoted from grade to grade or to graduate from high school. Principals and teachers face growing pressures to accept contracts tying continued employment or pay to the academic achievement of their students. Policy makers have "jumped aboard the standards-and-accountability train," even if many have been slower to impose real consequences on schools and the teachers and principals who work in them.46

Accountability systems in most states are too new for their effects on student achievement to be evaluated. Evidence from two of the pioneering states—North Carolina and Texas—is very encouraging about the positive effects of accountability systems accompanied by consequences for results that are implemented as part of a coordinated set of reform strategies. (See box: North Carolina and Texas Show Improvement Using a Coordinated Array of Reform Strategies.)

Accountability, however, exacerbates concerns about unintended negative consequences of measurement, which are less likely to occur when tests are used for purposes with more indirect consequences for individuals (such as improving instruction and monitoring the educational system). When applied unequally (e.g., to students only and not to educators, or vice versa), accountability raises concerns about fairness.47 Some critics argue that the negative consequences of testing (grade retention, denial of diplomas) are being felt much more by students, while rewards rather than sanctions are more often applied to teachers.


The more direct the consequences for individual students and schools, the more critical become questions about the technical quality and fairness of accountability models. How much margin for error is there in students' test scores and how should this so-called "measurement error" affect decisions about promotion and graduation that are based on test scores? Do test-based accountability formulas distinguish high and low performing schools accurately and consistently? Would different methods of calculating accountability ratings result in different "winners" and "losers" in accountability systems that reward the top performers and sanction the lowest?
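As a purely hypothetical illustration of the last question, the short sketch below ranks four invented schools two ways, once by current average score and once by one-year gain; two equally reasonable calculation methods produce different "winners." All names and numbers are made up for this sketch.

# Hypothetical illustration: two reasonable rating methods can crown different
# "winners." All school names and scores below are invented.

schools = {               # (last year's average score, this year's average score)
    "North": (212, 215),
    "South": (188, 201),
    "East":  (205, 206),
    "West":  (196, 203),
}

by_status = sorted(schools, key=lambda s: schools[s][1], reverse=True)
by_gain = sorted(schools, key=lambda s: schools[s][1] - schools[s][0], reverse=True)

print("Ranked by current average score:", by_status)  # North ranks first on status
print("Ranked by one-year gain:", by_gain)            # South ranks first on improvement

A system that rewards the top of one list and sanctions the bottom of the other would treat the same schools quite differently, which is why the choice of calculation method deserves scrutiny.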

NORTH CAROLINA AND TEXAS SHOW IMPROVEMENT USING A COORDINATED ARRAY OF REFORM STRATEGIES

The National Education Goals Panel, a bipartisan and intergovernmental body of federal and state officials, was founded after the 1989 Education Summit to track state and national progress toward meeting the national education goals.

In 1997, the panel reported that two states, North Carolina and Texas, stood out from other states in the extent to which they were achieving these objectives. The panel commissioned an outside review to confirm the findings and seek to identify the factors that could and could not account for these states' success.

The analysis confirmed that the gains in academic achievement in both states were significant and sustained. North Carolina and Texas posted the largest average gains in student scores on tests of the National Assessment of Educational Progress (NAEP) administered from 1990 to 1997. The results were mirrored in state assessments administered during the same period, and there was evidence that the scores of disadvantaged students improved more than those of advantaged students.

The study concluded that the most plausible explanation for the test score gains was to be found in the policy environment established in each state. Both states pursued a similar set of policies which were consistent with each other and sustained over time. The main elements included:

• State-wide standards by grade for clear teaching objectives

• Holding all students to the same standards

• State-wide assessments closely linked to academic standards

• Accountability systems with consequences for results

• Increasing local control and flexibility for administrators and teachers

• Computerized feedback systems and data for continuous improvement

• Shifting resources to schools with more disadvantaged students

SOURCE: Grissmer and Flanagan (1998).

More critical, too, becomes the political reasonableness of the results. Performance expectations set so high that most students or schools initially fail to reach them may cause the public to question the usefulness of assessments and accountability programs and weaken political support. Virginia was forced to address this issue after only 2 percent of its schools showed satisfactory performance on the first administration of the Standards of Learning exams, on which school accreditation will eventually be based. Setting high standards for student learning is not incompatible with a strategy of setting the initial standards at a lower level and then gradually ratcheting them up over time, as Texas is doing. Both strategies can get us where we need to go, if political support remains strong and thoughtful adaptations are made as experience is gained. In Texas, this has meant introducing a more rigorous set of standards and tests over time. In Virginia, it has meant keeping the Standards of Learning in place but delaying by several years the date when sanctions for poor-performing schools (loss of accreditation) will kick in. Both states have shown large increases in the number of students passing state standards-based exams since they were first introduced.

Designers of test-based accountability systems must address a series of issues that affect students and educators, respectively.

USING PERFORMANCE MEASURES TO REWARD AND SANCTION STUDENTS

Probably the best-known use of tests is to make decisions affecting students: "ending social promotion" by employing test scores to determine readiness for promotion to the next grade and developing "high school exit tests" on which to base the awarding or withholding of high school diplomas. (See box: States Using Tests for Promotion Decisions and box: States Using Tests for Graduation Decisions.) These are highly sensitive decisions, so it is important to consider how test scores should be used in making them, whether appropriate instruction and assistance are available for individuals who are denied promotion or graduation, and whether specific steps should be taken to motivate students to do their best on tests rather than just score well enough to get promoted or to graduate.

STATES USING TESTS FOR PROMOTION DECISIONS

States that have or will have promotion policies based, at least in part, on performance on state assessments:

Arkansas, California, Delaware, District of Columbia, Florida, Illinois, Louisiana, North Carolina, Ohio, South Carolina, Texas, Virginia, Wisconsin

TOTAL = 13

SOURCE: Glidden (1999).

STATES USING TESTS FOR GRADUATION DECISIONS

Students must pass tests covering 10th grade standards to graduate:

Alabama: Yes
Alaska: Class of 2002
Arizona: Class of 2002
California: Class of 2004
Florida: Class of 2003
Georgia: Yes
Louisiana: Class of 2003
Maryland: Class of 2007
Massachusetts: Class of 2003
Mississippi: Class of 2003
Nevada: Yes
New Jersey: Class of 2003
New Mexico: Yes
New York: Yes
North Carolina: Class of 2003
Ohio: Class of 2005
South Carolina: Class of 2005
Tennessee: Class of 2005
Texas: Class of 2005
Virginia: Class of 2004
Washington: Class of 2008

TOTAL = 21

SOURCE: Education Week (2001:79).


Using multiple measures

Even the best test is imprecise to some degree. Student scores are expected to vary across different versions of tests, reflecting both the specific sample of questions asked and transitory factors, such as the student's health on the day of a test. Moreover, no single test can fully reflect what a student knows and can do. For these reasons, a fundamental precept of good test practice is that "an educational decision that will have a major impact on a test taker should not be made solely or automatically on the basis of a single test score. Other relevant information about the student's knowledge and skills should also be taken into account."48
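One standard way to quantify this imprecision is the standard error of measurement from classical test theory; the formula is a general one, and the numbers that follow are an illustration rather than figures drawn from any particular state test:

\[
SEM = \sigma_X \sqrt{1 - \rho_{XX'}}, \qquad \text{score band} \approx X_{\text{obs}} \pm 2\, SEM,
\]

where \(\sigma_X\) is the standard deviation of observed scores and \(\rho_{XX'}\) is the test's reliability coefficient. On a hypothetical test with a standard deviation of 15 points and a reliability of .91, the standard error of measurement is 15 × √0.09 = 4.5 points, so a student with an observed score of 70 could plausibly have scored anywhere from about 61 to 79 on another occasion or another form of the test.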

Professional standards and virtually all test publishers warn against relying on tests alone to make important educational decisions.

Such warnings are not always heeded, however. Test scores offer an easy way to make decisions about large numbers of students in a short period of time, while case-by-case evaluations incorporating multiple measures like grades and teacher evaluations are more time-consuming and prone to special pleading by students and parents. To the extent that promotion or graduation is heavily influenced by judgmental considerations rather than a more "objective" criterion, the meaning of the underlying educational standard students are being asked to meet becomes harder to convey to the public.

We believe confusion over the underlying standard is especially problematic for decisions to require satisfactory performance on high school exit tests as a condition for receiving a diploma. One of the major complaints that colleges and employers have is that they no longer know what a high school diploma signifies in terms of what a graduate knows and can do. (See Figure 6.) Standards-based graduation exams hold promise for remedying this situation.

The issue, though, isn’t assimple as whether or not to denya diploma to a student who failsgraduation test requirements.

Some states are considering the use of differ-entiated diplomas, not only to indicate whohas and hasn’t passed the exams but also torecognize exceptionally strong performance,thus giving students an added incentive to dowell and not just “get by.”49 Differentiated di-plomas might also provide a way to signal that astudent has shown the persistence and disci-pline (traits employers value) to finish school,even if he or she hasn’t been able to pass thetests. While these are worthy goals, we fear thatthe tradeoff will be a proliferation of creden-tials, varying by state, that will be hard forcolleges and employers to interpret.

Figure 6
Uncertainty Over the Meaning of a High School Diploma. Percent of respondents saying that a diploma is no guarantee that the typical high school student has learned the basics: employers, 60 percent; professors, 65 percent; students, 21 percent; teachers, 25 percent; parents, 31 percent.
SOURCE: Public Agenda (2000b).

We acknowledge that there may need to be a transition period, as new standards and requirements come into being, but we urge that policy makers move toward diplomas that tell colleges and employers that their holders have at minimum passed any required high school exit tests. There are practices that can be implemented to avoid too heavy a reliance on a single test for awarding the diploma. One is to ensure that all students have multiple opportunities to take the tests. Another is to offer tests in a number of subjects and require that many, but not all, be passed in order to graduate.

We think that the arguments for using multiple measures in addition to tests are stronger for promotion and retention decisions. Here, the primary concern is not sending a signal to the outside world about the student's level of accomplishment, but trying to place the student in the most appropriate educational setting to support his or her learning. We urge that promotion decisions include advice from teachers who know the student and the educational opportunities available in the school and district. We note that Chicago changed its promotion rules in August 2000 to allow teachers to determine whether students should be promoted based on considerations of class grades, other test scores, attendance, and behavior in addition to scores on the Iowa Test of Basic Skills, which had previously been the only factor taken into account. The burden of adding additional considerations to promotion decisions could be minimized by bringing in multiple measures only for students whose scores put them within some specified distance from the cut-off score for promotion.*

*See memorandum by CAROL J. PARRY (page 43).
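A minimal sketch of such a rule, with a purely hypothetical cut score and band width: students at or above the cut score are promoted on the test alone, students well below it are referred for retention and intensive assistance, and only those just below the cut-off trigger a case-by-case review of grades, attendance, and teacher judgment.

# Hypothetical promotion rule: reserve multiple-measure review for students
# whose test scores fall just below the cut-off. All numbers are invented.

CUT_SCORE = 200     # promotion cut score on a hypothetical test scale
REVIEW_BAND = 10    # scores within this many points below the cut trigger review

def promotion_decision(test_score, passing_grades, good_attendance, teacher_recommends):
    """Return a promotion decision for one student."""
    if test_score >= CUT_SCORE:
        return "promote"                                  # met the standard outright
    if test_score < CUT_SCORE - REVIEW_BAND:
        return "retain pending intensive assistance"      # well below the standard
    # Borderline case: bring grades, attendance, and teacher judgment into the decision.
    supporting_evidence = sum([passing_grades, good_attendance, teacher_recommends])
    return "promote" if supporting_evidence >= 2 else "retain pending intensive assistance"

print(promotion_decision(195, passing_grades=True, good_attendance=True, teacher_recommends=False))

Only the borderline cases require the more time-consuming multiple-measure review, which is the point of the band.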

Availability of appropriate instruction and assistance

We urge policy makers to recognize that the use of tests for promotion or graduation decisions faces political hurdles and risks legal challenge unless students are provided with adequate academic preparation for the tests and with intensive instruction for those who fail.

Some educators and parents fear that holding children back academically will result in greater numbers of school dropouts. As test-based promotion and graduation policies spread, this is certainly a possible consequence to be monitored. However, the consequences of holding children back must be weighed against the consequences of promoting them into grades or colleges or workplaces where they are woefully unprepared to do the work. One crucial key to making new policies beneficial for students who fail promotion and graduation tests is to provide them with instruction that will help them catch up. One reason holding students back led to higher dropout rates in the past50 is that unsuccessful test-takers were usually recycled through the same educational experiences that had proven unsuccessful for them. It is encouraging to see that today some policy makers are addressing the need for innovative and intensive instructional investments to help students who have failed or who are at risk of failing "high-stakes" tests. Chicago offers one example of a multi-pronged approach to "ending social promotion," coupling tests with a variety of new instructional strategies. (See box: Chicago's Multi-Pronged Approach to "Ending Social Promotion.")

Such an approach will help protect test-based promotion and graduation policies against court challenges or sanctions imposed by the U.S. Department of Education's Office for Civil Rights (OCR). To avoid challenges and sanctions, assessment policies must conform to many rules about test use that are rooted in the U.S. Constitution, federal civil rights statutes, and judicial decisions, as well as in state law.

Because of the protections offered by these rules, the National Commission on Testing and Public Policy found that "the most common way to challenge important tests is through the courts."51 High-school graduation tests have been the subject of a number of lawsuits. Recently Louisiana's test-based promotion policy was unsuccessfully challenged by a parents' group in federal court; the OCR has received complaints from Louisiana and four other states over promotion policies.52

Promotion and graduation tests are most likely to be challenged on two grounds: that they are discriminatory (having disproportionate impact on poor and minority children) or that they violate students' due process rights (by failing to give them adequate notice about new test policies or by testing knowledge and skills they have not been taught). Tests appear most likely to withstand legal challenge when their use is judged to be "educationally necessary" (following professional test standards helps defendants meet their burden of proof here), when students are given sufficient advance notice of high-stakes test requirements, when tests are judged to be fair measures of what has been taught, and when there are educational supports in place for students who struggle with the exams.53

While policy makers and educators must recognize that tests can and will be challenged if they are used inappropriately, it is also important to note that the courts and OCR have generally not found reason to overturn new test-based promotion and graduation policies. In a widely-watched case, a federal court in Texas in January 2000 rejected a challenge to the state's use of the Texas Assessment of Academic Skills as a requirement for high school graduation.54 OCR has reached agreements with the four states against which claims had been filed about promotion tests, permitting the states to use the tests if they took steps to provide things like summer school and accelerated programs to struggling students.55 (The fifth complaint, recently filed against Louisiana, is pending as this report is being written.)

CHICAGO'S MULTI-PRONGED APPROACH TO "ENDING SOCIAL PROMOTION"

In the Chicago Public Schools (CPS), students in the third, sixth, and eighth grades must meet minimum test scores on the Iowa Tests of Basic Skills (ITBS) in reading and mathematics in order to be promoted to the next grade level. Chicago's initiative to "end social promotion" goes beyond testing to provide progressively targeted intervention, aimed at improving the achievement of students with the lowest skills.

In the year before promotion, the CPS policy aims to focus teacher attention on those students who are not mastering the material. In addition, students who are at risk of not meeting the minimum scores are given extended instructional time during the school year through Lighthouse, an after-school program where students engage in a centrally developed curriculum focused on reading and mathematics.

Should students fail to meet minimum test scores at the end of the school year, they are required to participate in a six-week Summer Bridge Program. This second major component of the CPS initiative offers smaller classes and a centrally mandated curriculum aligned with the format and content of the ITBS. A decision is made at the end of the summer about whether to promote or retain students who again fail to achieve minimum test scores in one or both subjects.

The third component of the initiative focuses on those students who are retained. Schools with high proportions of retained students are given extra teachers, both to reduce class size and to give extra support to retained students. Retained students are also required to participate in the Lighthouse after-school program.

SOURCES: Roderick et al. (1999); Roderick et al. (2000).

Motivating students to do their best

As Education Week recently asked, "what if the schools gave a test and nobody—or at least not many of the students taking it—cared?"56 Without consequences based on test results, older students in particular may not bother to try very hard on new state assessments. Since how students perform on these assessments has growing consequences for principals and teachers (see the next section), the question has implications for school as well as student accountability. Unmotivated student test takers may produce results that do not accurately reflect the quality of education at their schools, complicating efforts to hold educators accountable for results.

Policies linking graduation to passing state tests, of course, do provide students with some motivation to perform at least satisfactorily. If all that matters, however, is whether the student passes or not, students still have little reason to try to do their best, so that test scores will not accurately reflect their own and their school's performance.

We urge businesses and colleges to signal that they care about how students perform on standards-based tests, as well as in academic work more generally, by asking that test scores be included on high school transcripts and by using the results of these tests along with student grades and judgments about the rigor of courses taken in hiring and admission decisions. We encourage businesses to participate in the Making Academics Count project, through which companies pledge to use evidence of students' academic achievement in their entry-level hiring decisions. (See box: Making Academics Count.) We encourage states and colleges to consider how standards-based testing and college admissions requirements might be linked, as is currently being explored in Illinois. (See box: The Prairie State Achievement Examination.)

MAKING ACADEMICS COUNT

Since 1997, the National Alliance of Business, the Business Roundtable, and the U.S. Chamber of Commerce have led a business-sponsored nationwide effort to reinforce the connection between school performance and workplace success. The campaign, Making Academics Count, intends to raise student achievement by encouraging employers to ask for student records when hiring entry-level employees. By doing so, employers can send the message to students that "yes, school counts."

The initial goal of the campaign was to have at least 10,000 companies of all sizes asking for school records and other profiles of academic performance as part of the hiring process. The campaign has surpassed that goal. Its sponsors have vowed to increase their efforts and continue raising that number.

SOURCE: National Alliance of Business (2000).

USING PERFORMANCE MEASURES TO REWARD AND SANCTION EDUCATORS

Forty-five states and a number of large city school districts have established school accountability systems to report on or rate school performance. Test scores are the most frequently measured outcome, sometimes accompanied by attendance and/or dropout/graduation rates and other information.57 Most accountability systems rely on making information on student outcomes widely available to generate pressure for school improvement. A few offer financial rewards to educators in schools that are performing well; some apply sanctions to those that are performing poorly. (See box: Accountability Systems Featuring Financial Rewards.) Sanctions range from additional oversight by district and state education authorities to penalties such as "reconstitution": removing the existing principal and staff and essentially reconstructing the school from scratch. Authorities have been given the power to reconstitute schools in a number of states and districts and have exercised it in such places as San Francisco and Baltimore. Florida has taken another route to sanctioning failing schools: providing students with vouchers to attend nonpublic schools if they wish.

Accountability systems for educators can focus on individual teachers and principals or on the school as a whole. At present, most focus on schools.

Accountability for individual teachers

New performance-based accountability systems for educators generally avoid singling out individual teachers. The preference for group, or school-based, accountability reflects a history of unsuccessful attempts to implement individual incentives such as "merit pay" in education. While teacher unions (about four-fifths of American teachers are unionized) continue to oppose linking job performance to pay,58 some innovative experiments involving local union cooperation are underway around the country.

Merit pay, in the generic sense, is a pay system that links teacher pay to some measure of performance rather than to the prevailing single salary schedule. Typically, teachers are now paid according to a state or district schedule that gives higher pay to teachers with more experience and with advanced degrees.

THE PRAIRIE STATE ACHIEVEMENT EXAMINATION

Illinois is working with ACT to "wrap" standards-based test components around the ACT college entrance exam (and components of its Work Keys battery) for multi-score reporting. The Illinois State Board of Education (ISBE) designed the Prairie State Achievement Examination (PSAE) to contain a set of existing tests that measure the Illinois Learning Standards, including tests used for college admissions and for measuring workplace skills so students could receive the results of these tests as a bonus. To that end, the PSAE incorporates three separate test components: (1) ISBE-developed writing, science, and social science assessments; (2) the ACT Assessment, which includes English, mathematics, reading, and science reasoning tests; and (3) two Work Keys (WK) assessments, Reading for Information and Applied Mathematics. The components are combined as shown below:

Component Tests / PSAE Test Scores
ACT Reading + WK Reading = PSAE Reading
ACT English + ISBE Writing = PSAE Writing
ACT Mathematics + WK Mathematics = PSAE Mathematics
ACT Science Reasoning + ISBE Science = PSAE Science
ISBE Social Science = PSAE Social Science

PSAE scale scores and performance levels are reported for each of the five subjects. In addition, the PSAE generates an ACT Assessment Composite Score, ACT Subject Scores (4 subject scores, 7 subscores), and 2 Work Keys Scores. Illinois has worked hard to ensure that colleges will accept ACT Assessment scores achieved through PSAE state-required testing. The Illinois Student Assistance Commission has confirmed that it will recognize ACT Assessment scores achieved through PSAE testing for use in awarding state scholarships. Colleges and universities have indicated willingness to accept and use ACT Assessment scores reported from "state" testing (as opposed to the national ACT Assessment test).

SOURCES: Illinois (1999); Illinois State Board of Education (2000).

Districts have experimented with various forms of merit pay for individual teachers for a long time, but seldom have stuck with them for long. Merit pay programs have been unpopular with teachers and have also suffered from high costs and administrative burdens. Districts and states often failed to provide stable funding for the programs. Where merit pay programs did survive (generally in wealthier districts) they tended to evolve from true merit pay plans, where teachers are rewarded for better work, to plans in which teachers are rewarded for taking on more tasks.59 Instead of true pay for performance plans, some districts have recently adopted compensation plans that link pay not to students' academic achievement but to the acquisition of specific knowledge and skills. Such training, like that evidenced by advanced certification by the National Board for Professional Teaching Standards, is typically aligned with the requirements of standards-based reform and is thought to be effective in promoting student learning.60

Pay for knowledge and skills is an improve-ment on the single salary schedule and may bea necessary first step in many places towardpay for performance. It does not, however,

Page 36: MEASURING WHAT MATTERS · CHAPTER 1: INTRODUCTION AND SUMMARY 1 CHAPTER 2: MEASURING STUDENT ACHIEVEMENT 5 Issues in designing direct measures of student learning 5 Setting standards

27

Chapter 4: Using Achievement Measures to Hold Students and Educators Accountable

Some states and districts provide financialrewards to schools and teachers that are tieddirectly to student performance. For example:

North Carolina establishes student performancegoals for each school and gives $750 rewards toall the teachers at schools that meet their goals.Teachers at schools exceeding their goals by 10percent or more receive $1,500 each.

Dallas rewards its most effective schools on thebasis of a statistically-sophisticated School Effec-tiveness Index that adjusts test scores and otherschool performance measures to take explicitaccount of differences among students andamong schools. Adjusted index scores are com-pared across schools, with the top schools (thespecific number dependent on fund availability)receiving rewards. All professional staff inselected schools receive awards of $1,000, withsupport staff receiving $500 and the schoolreceiving $2,000 for its activity fund.

California enacted a law in July 1999 that pro-vides a one-time performance bonus award of upto $25,000 per teacher in underachievingschools that improve their Academic Perfor-mance Index (API) by at least 2 times theschool’s annual growth target. The CaliforniaCertificated Staff Performance Incentive Actprovides bonuses of $25,000 to 1,000 teachers inschools with the largest absolute API gain,$10,000 to 3,750 teachers in schools with thesecond-largest gain, and $5,000 to 7,500 teachersin schools with the third-largest gain. Bonusdistribution is determined by negotiation be-tween the local governing board and the teach-ers’ union.

Denver teachers agreed in late 1999 to a four-year pilot program that will investigate three

ACCOUNTABILITY SYSTEMS FEATURING FINANCIAL REWARDS

different approaches to linking individual teach-ers’ compensation to student performance.Schools can participate in the pilot if 85 percentof their teachers approve. Teachers are eligiblefor up to $1,500 in bonuses. Three different waysof measuring teacher performance will be tested:increases in student performance on standard-ized tests; increases in student performance onteacher-developed assessments; and increases inteachers’ skills and knowledge.

Lansdale, PA (Colonial School District) beganimplementing an achievement award program inschool year 1999-2000 that rewards both indi-vidual teachers and groups of teachers within aschool. Individual teachers are ranked on howtheir students’ academic achievement comparesto those of a comparison group of teachers.Group achievement awards are granted based onthe success of schools in meeting agreed-uponannual goals. Individual awards range from$1,389 to $2,778, while the maximum award anindividual teacher can receive under the groupaward program is $2,500.

Maryland began offering monetary bonuses in1996 to schools showing significant and sus-tained improvement on the state assessment.Schools that show two years of statistically signifi-cant progress on a School Performance Indexmay be eligible for the reward. School Improve-ment Teams are the recipients of the awards andmust use them on educational programs, but noton staff salary or bonuses. The amount of eachschool’s award depends on the number ofschools qualifying for the awards and the enroll-ment of each school receiving an award. In 2000,the average financial award was $49,000 perschool.

create accountability for individual teachers, which we believe should be the ultimate objective.

We encourage districts to undertake more aggressive experimentation with new pay and bonus plans that link individual teachers' compensation directly with their ability to foster student learning and that engage teachers cooperatively in designing and evaluating these new arrangements. Teacher engagement is important, for both substantive and political reasons. But we believe teachers must become more receptive to these plans than has historically been the case. Their concerns about possible negative effects are not unfounded, but neither are they unique to merit pay arrangements in educational settings. Moreover, pay for most workers outside education is based at least in part on merit considerations. Teachers cannot in our view continue to exempt themselves from accountability for outcomes and expect public understanding and support.*

*See memorandum by CAROL J. PARRY (page 44).

We acknowledge, though, that there is still much to learn about how to design performance-based compensation plans for individual teachers and about their effects. Merit pay plans have tended to come and go rather quickly and have not been carefully evaluated.61 While some states (such as Florida and Delaware) have mandated pay for performance, they have often left the details to be worked out through processes that have not yet taken place. Only a few districts have adopted or are experimenting with performance-based compensation systems. Linking teacher performance to student performance requires state and district systems that both measure student achievement and link students to the teachers who have taught them. As we've already seen, most schools cannot now make such linkages for teachers in many grade levels and subjects. Therefore, we believe that careful experimentation and evaluation, rather than a headlong rush to adopt new pay-for-performance programs for individual teachers, is the right approach at this stage.

Accountability for schools

The growing popularity of school-based accountability marks an important step on the road to a performance-based education system. Holding the educators in a school collectively responsible for students' academic achievement and its growth emphasizes outcomes, not inputs, and puts the locus of accountability at the school level for the first time. Historically, very little information was collected about individual schools; most of what we knew about education was based on district- or state-level data.

Moving accountability to the school level is important because it directly affects those who actually interact every day with students in classrooms. If used effectively, school-level accountability can motivate changes in teaching more directly than accountability diffused across many schools or districts. It should make it easier to see which schools are performing well and which are not.

Using tests to hold schools accountable, however, heightens the importance of the issues regarding the use of tests we have discussed in earlier chapters, especially issues of fairness that are politically and technically complex. How should accountability systems deal with the well-established fact that many factors besides school influence a student's academic performance at any given time: factors such as prior educational experiences, family background, and nonschool educational opportunities? Are there differences in the stability and reliability of different measures of school performance? How should the fact that student performance varies more within individual schools than it does among schools affect the design of accountability ratings and rankings?

Accountability systems today typically rely on average test scores or percentage pass rates to measure school performance. Many analysts are critical of this approach, because such "status" indicators reflect many factors affecting achievement other than those controlled by the school. Such indicators can therefore be misleading about what the school contributes to student achievement, failing to show how well the school actually educates its students. The path-breaking Texas accountability system partially addresses these concerns by basing its school ratings not just on the average test pass rate in each school, but also on the pass rates of selected subgroups. (See box: The Texas Approach to Accountability.) The state also issues a "comparative improvement" report that shows each school how much year-to-year improvement its students have achieved compared to 40 other schools serving similar populations.

More elaborate "value-added" accountability models based on improvements in student performance are sometimes used to avoid the problems caused by using average or status scores in school accountability programs.62 They report scores in terms of the growth in student achievement rather than the absolute level of achievement and attempt to identify the amount of improvement for which the school is responsible. Ideally, statistical procedures would be used to eliminate the effects of (1) nonschool factors that contribute to the growth in achievement and (2) differences among schools, such as financial resources, that are not under the school's control. Unfortunately, the information needed to make these adjustments is not typically available. States and districts instead have developed less complete models that use prior test scores in various ways to determine how much of the change in student performance should be attributed to the school's efforts.63 While not widely used, such value-added accountability systems have been tried in several states (North and South Carolina and Tennessee) and districts (Dallas and Minneapolis).
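To make the general idea concrete, the sketch below works through one of the simplest variants just described: predict each student's current score from his or her prior score, and treat a school's average "extra" gain as its value-added estimate. The data, column names, and the plain least-squares adjustment are illustrative assumptions only; this is not the model used by Dallas, Tennessee, or any other system mentioned in this chapter.

# Minimal illustrative sketch of a gain-based "value-added" school measure.
# Hypothetical data and column names; real systems add many more adjustments.
import numpy as np
import pandas as pd

# Each row is one student: school attended, last year's score, this year's score.
students = pd.DataFrame({
    "school":      ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "prior_score": [610, 595, 640, 520, 505, 550, 700, 685, 660],
    "score":       [630, 600, 655, 560, 540, 585, 705, 690, 670],
})

# Step 1: predict this year's score from last year's score (simple least-squares fit).
slope, intercept = np.polyfit(students["prior_score"], students["score"], deg=1)
students["expected"] = intercept + slope * students["prior_score"]

# Step 2: a student's residual is the gain beyond what prior achievement predicts.
students["residual"] = students["score"] - students["expected"]

# Step 3: a school's value-added estimate is its students' average residual gain.
value_added = students.groupby("school")["residual"].mean().sort_values(ascending=False)
print(value_added)

Because only growth relative to expectation counts, a school serving low-scoring students can rank well on such a measure even when its average score is low; the "status" measures criticized above would rank the same school very differently.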

An ideal scoring system designed to gauge educational improvement would measure the progress made by individual students over time. Often, however, states and districts do not have administrative systems that can link individual students and their test scores from year to year and must use other approaches. For example, they may look at the gains achieved by the same cohort of students over time, even though the population in the cohort is not exactly the same (because of student mobility, for example).

Different measures of school performance may give very different results. A recent study compared nine different measures of the performance of South Carolina schools, all based on the standardized test scores of their fifth graders.64 These measures included several variants of average scores and value-added scores, some adjusting for nonschool factors such as socioeconomic status and race and others not. There were substantial differences in rankings depending on the measure employed. Schools with more white and relatively affluent students tended to score highly using average scores, but this advantage almost disappeared in most cases when value-added scores were used.

THE TEXAS APPROACH TO ACCOUNTABILITY

The Texas Education Agency rates every district and every school within a district as either Exemplary, Recognized, Academically Acceptable/Acceptable, or Academically Unacceptable/Low-Performing, based on three indicators: (1) student scores on the Texas Assessment of Academic Skills (TAAS) tests in reading, writing, and mathematics, (2) student dropout rates, and (3) student attendance rates. To assign ratings, the agency breaks down each district and school into four student subgroups: African American, Hispanic, white, and economically disadvantaged. For a school or district to receive a specific rating, not only must the school/district as a whole meet the specified criteria, so too must each of the four student subgroups. For example, to receive a rating of Exemplary:

• At least 90% of all students must pass each TAAS subject area test, and at least 90% of each subgroup of students must also pass the tests.

• The dropout rate cannot exceed 1% for all students, or 1% for any student subgroup.

• There must be at least a 94% attendance rate for all students and for all student subgroups.

To be rated at a specific level, each student subgroup as well as the total student population must meet the minimum criteria for that level. A school or district can only be ranked as high as its lowest-performing subgroup of students would be ranked.

SOURCE: Texas Education Agency (1999).
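The logic of the Texas rule, in which the lowest-performing subgroup sets the ceiling for the whole school, can be summarized in a few lines. The sketch below encodes only the Exemplary thresholds quoted in the box and collapses the separate subject-area pass rates into a single figure; it illustrates the principle, not the agency's full rating procedure, and all of the numbers are hypothetical.

# Illustrative check of the "lowest subgroup sets the ceiling" rule described in the box.
# Simplified: one combined pass rate instead of separate TAAS subject-area tests.

EXEMPLARY = {"pass_rate": 0.90, "max_dropout_rate": 0.01, "attendance_rate": 0.94}

def meets_exemplary(group):
    """True if one group (the whole school or one subgroup) meets every threshold."""
    return (group["pass_rate"] >= EXEMPLARY["pass_rate"]
            and group["dropout_rate"] <= EXEMPLARY["max_dropout_rate"]
            and group["attendance_rate"] >= EXEMPLARY["attendance_rate"])

def rate_school(whole_school, subgroups):
    """Exemplary only if the total population and every subgroup qualify."""
    if meets_exemplary(whole_school) and all(meets_exemplary(g) for g in subgroups.values()):
        return "Exemplary"
    return "Below Exemplary"  # the real scale continues: Recognized, Acceptable, Low-Performing

whole_school = {"pass_rate": 0.93, "dropout_rate": 0.008, "attendance_rate": 0.95}
subgroups = {
    "African American":           {"pass_rate": 0.88, "dropout_rate": 0.009, "attendance_rate": 0.95},
    "Hispanic":                   {"pass_rate": 0.92, "dropout_rate": 0.007, "attendance_rate": 0.95},
    "White":                      {"pass_rate": 0.96, "dropout_rate": 0.006, "attendance_rate": 0.96},
    "Economically disadvantaged": {"pass_rate": 0.91, "dropout_rate": 0.010, "attendance_rate": 0.94},
}

# One subgroup at an 88 percent pass rate keeps the school below Exemplary
# even though 93 percent of all students passed.
print(rate_school(whole_school, subgroups))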


There are important tradeoffs to be considered in deciding how much information to include in accountability systems. Value-added measurements that track individual students and provide information on student progress that can be linked back to individual teachers and classrooms demand costly new commitments by districts and states. They require that tests be given more frequently than is now typically the case: every year or every other year instead of in just three or four grades. They also require new data systems that combine student test scores with information on student, family, and community characteristics and that link students and their teachers.65 To keep costs down, administrators may be tempted to re-use tests rather than develop new ones and to use inexpensive tests that are less effective than those envisioned by standards-based reformers.66

Because test-based accountability is intended to create incentives for schools to improve their performance, it is important to get the incentives right. Even the most sophisticated of the value-added accountability models currently in use cannot definitively identify the effects on achievement that are under the control of the educators in the school. These models are nevertheless useful because they provide important information to parents, citizens, and policy makers. Poor performance draws attention to schools where changes are required to make them more effective.

When test-based accountability is used as the basis for rewards and sanctions for specific schools and their personnel, however, the accuracy and reliability of these models become far more important. A model that incorrectly labels a school as "low-performing," for example, could lead high-quality principals and teachers to shun employment there in favor of "high-performing" schools that promise better rewards.

There is evidence urging caution in this regard. Recent research examined accountability models in North Carolina that rated schools serving students with high average scores as also more effective in improving annual performance than schools serving students with low scores.68 Detailed analysis indicated that correcting for measurement error (and, to a lesser extent, for a misspecified model) substantially changed school rankings. These corrections raised the effectiveness measure for many schools with low average test scores and reduced it for many schools with high average scores.

Another study used the North Carolina data from a five-year period to determine the stability of school rankings over time.69 It found that schools at both the top and the bottom of the annual relative rankings tended to be small schools, even though the average performance of small and large schools was similar. It appears that the annual rankings of schools over time are quite unstable and that the measured annual differences may not indicate real differences in school performance.70

These studies indicate that the design of accountability systems is critical to their effectiveness and credibility and underscore the need for improved understanding of the effects of specific design features.

Continuous improvement is essential to make accountability work. The higher the stakes attached to accountability, the more crucial it is that measurement systems be reliable, valid, fair, and comprehensible. We have much to learn about designing accountability systems that accomplish our goals. To this end, we urge accountability supporters to promote investment in research on and analysis of such systems.

GUIDELINES FOR GOOD PRACTICE

Robert Linn, one of the country's foremost experts on testing and assessment, has recently offered seven suggestions for "enhancing the validity, credibility, and positive impact of assessment and accountability systems while minimizing their negative effects."71 They capture well the implications of our discussion of using achievement measures to hold students and educators accountable. We commend the following suggestions to business leaders and other citizens as guidelines to good practice. We should insist that accountability systems:


1. Provide safeguards against the selective exclusion of students from assessments; e.g., by including all students in accountability calculations.72

2. Utilize new high-quality assessments each year that are statistically equated to those of previous years.

3. Not put all of the weight on a single test; instead, seek multiple indicators.

4. Place more emphasis on comparisons of school performance from year to year than from school to school. This allows for differences in starting points while maintaining an expectation of improvement for all.

5. Consider both value added and status in the system. Value added provides schools that start out far from the mark a reasonable chance to show improvement, while status guards against institutionalizing low expectations for those same students and schools.

6. Recognize, evaluate, and report the degree of uncertainty in the reported results.

7. Put in place a system for evaluating the effects of the system: positive and negative, intended and unintended.


Chapter 5

MONITORING EDUCATIONAL PROGRESS AND THE MEASUREMENT SYSTEM ITSELF

Besides helping improve instruction for individual students and providing a means for holding students and educators accountable, measurement of academic performance also provides information about the status of the education system. Shifting our focus from instructional improvement and accountability to monitoring raises three additional issues that should be part of the discussion of educational measurement: (1) the importance of continuing support for national and international assessments to provide some common metrics in our otherwise fragmented assessment system; (2) the need to expand and improve efforts to make information about the performance of the education system available to the public; and (3) the desirability, given the growing consequences of testing, of building institutions capable of monitoring the assessment system itself.

NATIONAL AND INTERNATIONAL ASSESSMENTS

Though we applaud the progress that has been made over the last decade by states and school districts in developing standards, assessments, and accountability systems, we also observe that this decentralized approach to standards-based reform does not provide data with which to compare performance across states, nor does it provide an overall picture of the health of the American educational system. Some have hoped that it would be possible to link different state and commercial tests to each other. However, the National Research Council studied this issue in response to a congressional mandate and concluded that creating linkages that would permit multiple commercial and state achievement tests to be compared to one another is not feasible.73

Thus, the growth in state and district testing does not diminish the importance of continuing support for the National Assessment of Educational Progress (NAEP) and for American participation in international assessments.

For 30 years NAEP has served as "the nation's report card," providing a continuing measure of student achievement in key subject areas. NAEP periodically tests a sample of students in three grades, one each in elementary, middle, and high school. For two decades, scores were reported on a national basis only. For the last ten years, scores have also been reported on a statewide basis for those states that choose to participate. NAEP thus serves not only as a national barometer of educational progress, but also as a check on states' progress as measured by their own assessments.

State participation in NAEP is voluntary. Forty-eight of the fifty states signed agreements to participate in the NAEP 2000 mathematics and science assessments, up from 37 participating states when the state assessment program began in 1990. With the growth in other testing programs, however, schools are apparently finding it increasingly difficult to allocate the time and resources to participate in NAEP. Eight of the states that originally committed to NAEP 2000 had to drop out because too few schools agreed to participate to meet statistical sampling requirements. Participation in state NAEP requires significant commitments of time, resources, and effort at both the district and local levels.

Over the next few months, the NAEP governing board will be considering recommendations74 from an Ad Hoc Committee on NAEP Participation to create incentives for schools to participate. Some would involve new costs (e.g., using paid contractors rather than school administrators to administer the versions of NAEP that result in state-specific scores, as is already done in the version of NAEP that results in national scores), and some would require congressional action (e.g., providing school-specific scores for schools that want them). We urge the NAEP board to adopt policies aimed at improving school participation and urge Congress to approve these changes, including budgetary increases if necessary.

International assessments such as the Third International Mathematics and Science Study (TIMSS), conducted in the mid-1990s and recently updated, supplement NAEP by providing additional benchmarks against which to measure the performance of American students. TIMSS taught the United States important and sometimes sobering lessons, such as that even the best of our high school seniors do not perform as well in math and science as their counterparts in other countries. The federal government has been a key supporter of international assessments, funding U.S. participation and providing essential financial assistance and design advice to the international coordinating agencies. Future international testing depends on this kind of support, along with the continued willingness of schools to participate. We urge Congress to provide sufficient funding to the National Center for Education Statistics to support international assessments of student achievement, and we encourage American schools to accept when invited to participate.

REPORTING TO THE PUBLIC

Though 45 states reported in 2000 that they planned to issue report cards on schools, only 32 percent of parents reported that such report cards were available.75 This discrepancy suggests that there is still much to do to get performance data into the public's hands and help parents and others learn how to use it.

Earlier in this report we talked about putting performance data into the hands of educators so that data become tools of instructional improvement. Data for parents and the public serve different needs: monitoring the performance of schools and (where school choice is permitted) providing information upon which parents can decide where to enroll their children.

In this report we cannot give adequate attention to the full range of issues, including how to select and format information to meet public demands as well as to reflect school performance accurately, that must be addressed in the design of school report cards.76 As potential consumers, however, we recommend that report cards include the following features:

• Common formats across districts. While states may require districts to prepare school report cards and may specify the information that must be included, they do not necessarily require that the information be reported in common formats. Different formats make it hard to pull some useful information from report cards, such as whether schools in different districts are equally successful in the rate at which their students are meeting academic standards. School report cards should include comparable information about the number of students who have been excluded from the testing program or whose scores have been omitted from accountability calculations.

• Contextual information. Most school report cards still present test score averages, which reflect what students know but not necessarily how well schools are performing.77 As we explained in Chapter 4, test scores are highly correlated with students' socioeconomic and ethnic status and reflect out-of-school as well as in-school experiences. It is therefore helpful in interpreting individual school results to have background information about the students enrolled. Other kinds of contextual information can also be useful: e.g., student mobility rates (in some urban schools, a majority of students move in or out in the course of a single year), school and class size, per-pupil spending, and the number of children who are English-language learners. While focus group research has suggested that the public rates demographic data low on its list of desirable information about schools,78 we support the inclusion of this information and urge the public to make use of the more nuanced picture of school performance it provides.

• Disaggregated data. Whether achievement data are reported as averages or as some measure of improvement, they will be more informative if they are disaggregated for particular groups of students and not just reported as school totals. Since the goal of standards-based reform is to have all children achieving at high levels, it is important to know whether schools are boosting the performance of all their students. Reporting results for different subjects and different grade levels by student groups (e.g., African American, Hispanic, white, and economically disadvantaged students, as is done in Texas) may be too cumbersome for summary report cards routinely distributed to all parents. The data should be readily available to anyone who wants them, however, and should be considered by the press and others who report on school performance. Since subgroup reporting reduces the number of students in each group, relative to the school as a whole, measurement error becomes more significant; so reports should include information on the margins of error associated with the data. (A rough illustration of how those margins grow as groups get smaller follows this list.)

• Explanations of measures. To make academic achievement data more meaningful to parents and others, numerical scores are often given labels that represent levels of accomplishment: e.g., basic, novice, proficient, and advanced. The labels applied to different tests may sound similar when in fact they might mean quite different things. The solution is to include clear explanations of these labels on report cards.

• "School success" and school environment data. The point of report cards is that they be used. In this regard, it is significant that the public may have different views about what information should be included in report cards than other groups for whom such reports are designed. For example, research in two cities (Greensboro, NC and Sacramento, CA) found that school board members differed from parents in their information priorities. Standardized test scores were far more important for school board members than for parents, who preferred school success and school environment data (graduation and promotion rates being examples of the former, school safety and parental involvement indicators being examples of the latter).79 While we believe that parents should pay attention to direct measures of academic achievement and should have ready access to this information, we also think it important to acknowledge other parental concerns and include them in school reports.

• Review and continuous improvement. Like standards and assessments, school "report cards" should periodically be reviewed and improved based on experience with their usefulness to the public and on any unexpected effects on parent or school behavior. For example, Maryland State Superintendent Nancy S. Grasmick reports that statewide groups met for months in 1990 to hammer out the details of a public reporting mechanism that would include test results. From an initial listing of 100 possible report card measures, 13 were chosen for inclusion, along with data to provide the context for viewing the accountability measure. A five-year review cycle was established. After the 1995 review, the indicator for student promotion rate was pulled from the report because reviewers found it had become a compelling incentive for social promotion.80
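The call above for margins of error on disaggregated results comes down to sample size: the statistical uncertainty in a reported pass rate grows as the number of tested students shrinks. The figures below are hypothetical, and the calculation uses the standard normal approximation for the margin of error of a proportion, a rough guide rather than the method any particular state uses.

# Rough 95 percent margins of error for a reported pass rate at different group sizes.
# Normal approximation for a proportion; illustrative numbers only.
import math

def margin_of_error(pass_rate, n_students, z=1.96):
    """Approximate 95 percent margin of error for a pass rate based on n_students."""
    return z * math.sqrt(pass_rate * (1 - pass_rate) / n_students)

pass_rate = 0.70  # hypothetical: 70 percent of the group met the standard
for n in (400, 100, 25):  # a whole school versus two smaller subgroups
    moe = margin_of_error(pass_rate, n)
    print(f"n = {n:3d}: 70% plus or minus {100 * moe:.1f} percentage points")

At 400 students the margin is about 4.5 percentage points; at 25 students it is nearly 18. A small subgroup's reported rate can therefore swing substantially from one year to the next through sampling variation alone, which is why report cards should say how precise each figure is.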

MONITORING THE MEASUREMENT SYSTEM

Independent organizations that stand outside the everyday fray of education politics and policy making can make important contributions to improving the design and use of assessment and accountability systems. They can monitor and report on the quality of educational standards and assessments, on the intended and unintended consequences of accountability systems, and on the meaning of test scores and test score changes over time. Such groups, which may choose to focus on national, state-wide, or district issues, can serve a tremendously useful function in keeping policy makers and educators honest about what education reform efforts are accomplishing.

We encourage support from foundations, businesses, and other funding sources for such organizations. We recommend that policy makers seek out the services of these groups and help create them where they do not exist. We urge the public to expect policy makers to subject assessment and accountability systems to independent evaluation. The following organizations currently provide the kind of services we envision; the state and local ones can serve as models for replication.

• The American Federation of Teachers, the Council for Basic Education, and the Thomas B. Fordham Foundation. These organizations provide independent evaluations of state standards. They use different criteria, so that their ratings vary. By making their criteria transparent, however, they foster informed discussion about the content of standards and whether they are measuring what the public believes to be important.

• Achieve, Inc. Founded by governors and business leaders after the 1996 Education Summit to promote standards, assessments, and accountability as means to school improvement, Achieve provides information on state standards and on state progress in implementing standards-based reform. At state request, Achieve conducts independent evaluations of state standards and tests. Achieve and 10 states have established the Mathematics Achievement Partnership to develop an internationally benchmarked 8th grade math assessment, which will give participating states a common metric for comparing the performance of their students.

• Consortium on Chicago School Research. The consortium conducts research activities designed to advance school improvement in Chicago's public schools and to assess the progress of school reform. It is an independent, university-based organization sponsored jointly by the city's foundations and the public school system central office. The central office provides test scores and access to the schools for consortium surveys, but the researchers conduct their analyses according to professional standards and make the ultimate decisions about what will be published and when.

• Maryland Assessment Research Center for Educational Success. A research arm of the University of Maryland Department of Measurement, Statistics, and Evaluation, the center conducts basic and applied research to enhance the quality of assessment practice and knowledge. The state Department of Education has contracted with the center to provide assessment support for the Maryland State Performance Assessment Program.

• Dallas Accountability Task Force. The Dallas Public School System has created its own accountability program operating on top of the Texas state accountability system. Dallas employs a statistically sophisticated value-added model for calculating school effectiveness and allocating performance-based awards. The district established an Accountability Task Force, including teachers, administrators, members of the school board, and community representatives, to oversee the design of the system and to review and revise it as needed. The task force helps make difficult technical decisions aimed at building a fair system and helps legitimate the resulting system (whose complex inner workings are "less than transparent"81 to outsiders) in the eyes of parents, teachers, and the public.

These and similar groups help spur the continuous improvement of assessment and accountability systems that we call for in this statement. They also serve as safeguards against the possibility that undesirable effects of these systems will go unrecognized or unaddressed. The information they put into the hands of the public better equips all of us to engage in our democratic society's constant debate over the goals of American education and how best to reach them.


ENDNOTES

1. This felicitous phrase was used by Judge Lynn N. Hughes, who cited "the buffeting of letters, press conferences, speeches, meetings, and the rest of the wonderful cacophony of a free people disagreeing" in rejecting a complaint that the Houston (TX) Independent School District board had denied segments of the community the right to speak before selecting a new school superintendent. Quoted in McAdams (2000:120).

2. Committee for Economic Development (1994).

3. Committee for Economic Development (1985, 1987).

4. Linn (2000).

5. The national education goals, as adopted at the 1989 Education Summit and later modified by Congress, essentially state that by the year 2000:

All children will start school ready to learn.

The high school graduation rate will increase to at least 90 percent.

All students will become competent in challenging subject matter.

Teachers will have the knowledge and skills that they need.

U.S. students will be first in the world in mathematics and science achievement.

Every adult American will be literate.

Schools will be safe, disciplined, and free of guns, drugs, and alcohol.

Schools will promote parental involvement and participation.

6. Expressed in Committee for Economic Development (1994).

7. Committee for Economic Development (1994:37).

8. National Research Council (1999c:22).

9. National Research Council (1999d:25).

10. National Research Council (1999d:25).

11. Hill (2000:3).

12. National Research Council (1999a:29).

13. Phelps (1999:25).

14. Two RAND studies, cited in National Research Council (1999d:81).

15. Klein and Hamilton (1999:13).

16. National Research Council (1999a:30).

17. National Research Council (1999d:66).

18. American Psychological Association (1999). The standards are developed jointly by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. They address professional and technical issues of test development and use in education, psychology, and employment.

19. National Research Council (1999a:250).

20. National Research Council (1999d:43-4).

21. National Research Council (1999a:45).

22. National Research Council (1999a:32).

23. Herrnstein and Murray (1994).

24. National Research Council (1999a:32).

25. Citizens’ Commission on Civil Rights (1998:10).

26. U.S. Department of Education (1999a:Table 53).

27. U.S. Department of Education (1999a:Table 59).

28. Improving America's Schools Act of 1994, Public Law 103-382; Individuals with Disabilities Education Act Amendments of 1997, Public Law 105-17.

29. National Research Council (1999d).

30. U.S. Department of Education (1999b:46).

31. Maryland State Department of Education (2000).

32. Belden Russonello & Stewart and Research/Strategy/Management (2000).

33. Duggan and Holmes (2000:4-5).

34. National Research Council (1999d:49).

35. Phelps (1999:24).

36. Koretz (1996:186).

37. See, for example, Goertz (2000:79) and National Research Council (1999d).

38. National Research Council (1999b).

39. Goertz (2000:68).

40. National Research Council (1999d:19-20).

41. National Center for Education Statistics (1999:51).

42. Committee for Economic Development (1994:14).

43. National Research Council (1999d:81). A new survey of teachers by Education Week provides additional evidence that teachers receive little training on understanding and using academic standards and test results. See Education Week (2001:45).

44. Hoff (2000).

45. Finn et al. (1999:46).

46. Finn et al. (1999:44).


47. Porter (2000).

48. National Research Council (1999a:3).

49. Olson (2000).

50. Research on the connection between grade retention and drop-out rates is summarized in National Research Council (1999a:128-132).

51. National Commission on Testing and Public Policy, From GATEKEEPER to GATEWAY: Transforming Testing in America, 1990, quoted in National Research Council (1999a:251).

52. Robelen (2000:14).

53. National Research Council (1999a:Chapter 3); Robelen (2000:14).

54. GI Forum v. Texas Education Agency, 87 F.Supp.2d 667, 142 Ed. Law Rep. 907 (D. TX 2000).

55. Robelen (2000:14). The Office for Civil Rights has just published a resource guide to help educators and policy makers develop and implement test-use policies consistent with legal requirements. See Office for Civil Rights (2000).

56. Keller (2000).

57. Education Week (2001:80-81).

58. For example, at their 2000 convention, National Education Association members rejected a plan to use job performance evaluations in paying wage bonuses. See Archer (2000).

59. Research on merit pay plans is summarized in National Research Council (1999c:177).

60. Odden and Kelley (1997).

61. Hanushek (1996:131).

62. Clotfelter and Ladd (1996); Meyer (1996, 2000).

63. Ladd and Walsh (forthcoming).

64. Clotfelter and Ladd (1996:37).

65. Meyer (2000:4).

66. Linn (2000:13).

67. Ladd and Walsh (forthcoming).

68. The South Carolina findings are based on the accountability system that was put in place in the mid-1980s and may not reflect the state's current system.

69. Kane (n.d.:1-2).

70. Other research does suggest that small schools are more effective than large ones. The study under discussion was not designed to investigate this point, but rather to explain why the highest and lowest ranked schools in North Carolina were virtually all small schools.

71. Linn (2000:15).

72. The one exception is that it is appropriate to exclude from a school-based accountability system students who have not spent some minimum number of days in the school.

73. National Research Council (1999e).

74. Ad Hoc Committee on NAEP Participation (2000).

75. See Education Week (2001) for state responses; Public Agenda (2000a) for parental responses.

76. Gormley and Weimer (1999).

77. Gormley and Weimer (1999).

78. A-Plus Communications (1999:5-6).

79. Jaeger, Richard, Barbara Gorney, Robert Johnson, Sarah Putnam, and Gary Williamson, A Consumer Report on School Report Cards. Greensboro, NC: Center for Educational Research and Evaluation, University of North Carolina at Greensboro (1994), cited in Gormley and Weimer (1999:107).

80. Grasmick (2000:54).

81. Mendro et al. (1999).


REFERENCES

Ad Hoc Committee on NAEP Participation
2000 Enhancing Participation in the National Assessment of Educational Progress. Washington, DC: National Assessment Governing Board, September.

A-Plus Communications
1999 Reporting Results: What the Public Wants to Know. A companion report to Education Week's Quality Counts '99.

American Psychological Association
1999 Standards for Educational and Psychological Testing. Washington, DC: American Psychological Association.

Archer, Jeff
2000 NEA Delegates Take Hard Line Against Pay for Performance. Education Week 19(42), July 12.

Belden Russonello & Stewart and Research/Strategy/Management
2000 Raising Education Standards and Assessing the Results. Results of a Public Opinion Survey Conducted for the Business Roundtable, August.

Citizens' Commission on Civil Rights
1998 Executive Summary. In Title I in Midstream: The Fight to Improve School for Poor Kids. Washington, DC: Citizens' Commission on Civil Rights.

Clotfelter, Charles T., and Helen F. Ladd
1996 Recognizing and Rewarding Success in School. In Helen F. Ladd, ed., Holding Schools Accountable: Performance-Based Reform in Education. Washington, DC: Brookings Institution Press.

Committee for Economic Development
1985 Investing in Our Children: Business and the Public Schools. New York, NY: Committee for Economic Development.

Committee for Economic Development
1987 Children in Need: Investment Strategies for the Educationally Disadvantaged. New York, NY: Committee for Economic Development.

Committee for Economic Development
1994 Putting Learning First: Governing and Managing the Schools for High Achievement. New York, NY: Committee for Economic Development.

Duggan, Terri, and Madelyn Holmes, eds.
2000 Closing the Gap: Report on the Wingspread Conference "Beyond the Standards Horse Race: Implementation, Assessment, and Accountability – The Keys to Improving Student Achievement." Washington, DC: Council for Basic Education.

Education Week
2000 Quality Counts 2000: Who Should Teach? Education Week 19(18), January 13.

Education Week
2001 Quality Counts: A Better Balance—Standards, Tests, and the Tools to Succeed. Education Week 20(17), January 11.

Finn, Chester E., Jr., Marci Kanstoroom, and Michael J. Petrilli
1999 The Quest for Better Teachers: Grading the States. Washington, DC: The Thomas B. Fordham Foundation.

Glidden, Heidi
1999 Making Standards Matter 1999. [Online]. Available: http://www.aft.org/edissues/standards99/index.htm. [October 25, 2000]. Washington, DC: American Federation of Teachers.

Goertz, Margaret E.
2000 Implementing Standards-Based Reform: Challenges for State Policy. In Terri Duggan and Madelyn Holmes, eds., Closing the Gap: Report on the Wingspread Conference "Beyond the Standards Horse Race: Implementation, Assessment, and Accountability – The Keys to Improving Student Achievement." Washington, DC: Council for Basic Education.

Gormley, William T., Jr., and David L. Weimer
1999 Organizational Report Cards. Cambridge, MA: Harvard University Press.

Grasmick, Nancy S.
2000 Looking Back at a Decade of Reform: The Maryland Standards Story. In Terri Duggan and Madelyn Holmes, eds., Closing the Gap: Report on the Wingspread Conference "Beyond the Standards Horse Race: Implementation, Assessment, and Accountability – The Keys to Improving Student Achievement." Washington, DC: Council for Basic Education.


Grissmer, David, and Ann Flanagan
1998 Exploring Rapid Achievement Gains in North Carolina and Texas. Washington, DC: National Education Goals Panel, November.

Hanushek, Eric A.
1996 Comments on Chapters Two, Three, and Four. In Helen F. Ladd, ed., Holding Schools Accountable: Performance-Based Reform in Education. Washington, DC: Brookings Institution Press.

Herrnstein, Richard J., and Charles Murray
1994 The Bell Curve: Intelligence and Class Structure in American Life. New York: Free Press.

Hill, Paul T.
2000 The Changing State Role. Basic Education 44(8), School Governance. Washington, DC: Council for Basic Education, June.

Hoff, David J.
2000 Testing's Ups and Downs Predictable. Education Week 19(20), January 26.

Illinois
1998 Illinois' Response to the 1999 Action Statement. [Online]. Available: http://www.achieve.org/achieve/achievestart.nsf/states/Illinois [October 25, 2000].

Illinois State Board of Education
2000 Illinois State Board of Education (ISBE) Prairie State Achievement Examination (PSAE) Questions and Answers. [Online]. Available: http://www.isbe.state.il.us/isatQandA.htm. [September 2000].

Kane, Thomas
n.d. Improving K-12 School Accountability Systems. Cambridge, MA: Harvard University, unpublished paper.

Keller, Bess
2000 Incentives for Test-Takers Run the Gamut. Education Week 19(34), May 3.

Klein, Stephen P., and Laura Hamilton
1999 Large-Scale Testing: Current Practices and New Directions. Santa Monica, CA: RAND Education.

Koretz, Daniel
1996 Using Student Assessments for Educational Accountability. In National Research Council, Improving America's Schools: The Role of Incentives. Washington, DC: National Academy Press.

Ladd, Helen F., and Randall P. Walsh
Forthcoming Implementing Value-Added Measures of School Effectiveness: Getting the Incentives Right. Economics of Education Review.

Linn, Robert L.
2000 Assessments and Accountability. Educational Researcher 29(2): 4-16, March.

Maryland State Department of Education
2000 State Board Moves Ahead With Assessments. News Release, May 24.

McAdams, Donald R.
2000 Fighting to Save Our Urban Schools…and Winning! Lessons from Houston. New York, NY: Teachers College Press.

Mendro, Bob, Ruben Olivarez, Maureen Peters, Tony Milanowski, Eileen Kellor, and Allan Odden
1999 A Case Study of the Dallas Public School-Based Performance Award Program. Madison, WI: University of Wisconsin, Wisconsin Center for Education Research, Consortium for Policy Research in Education.

Meyer, Robert H.
1996 Value-Added Indicators of School Performance. In National Research Council, Improving America's Schools: The Role of Incentives. Washington, DC: National Academy Press.

Meyer, Robert H.
2000 Value-Added Indicators: A Powerful Tool for Evaluating Science and Mathematics Programs and Policies. NISE Brief (National Institute for Science Education) 3(3), June.

National Alliance of Business
2000 Making Academics Count. [Online]. Available: http://www.bcer.org/macc/. [October 31, 2000].

National Center for Education Statistics
1999 Teacher Quality: A Report on the Preparation and Qualifications of Public School Teachers. Washington, DC: US Department of Education, Office of Educational Research and Improvement, NCES 1999-080.

National Research Council
1999a High Stakes: Testing for Tracking, Promotion, and Graduation. Committee on Appropriate Test Use, Jay P. Heubert and Robert M. Hauser, eds. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.

National Research Council
1999b How People Learn: Brain, Mind, Experience, and School. Committee on Developments in the Science of Learning, John D. Bransford, Ann L. Brown, and Rod R. Cocking, eds. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.

National Research Council
1999c Making Money Matter: Financing America's Schools. Committee on Education Finance, Helen F. Ladd and Janet S. Hansen, eds. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.

National Research Council
1999d Testing, Teaching, and Learning. Committee on Title I Testing and Assessment, Richard F. Elmore and Robert Rothman, eds. Board on Testing and Assessment, Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.

National Research Council
1999e Uncommon Measures: Equivalence and Linkage Among Educational Tests. Committee on Equivalency and Linkage of Educational Tests, Michael J. Feuer, Paul W. Holland, Bert F. Green, Meryl W. Bertenthal, and F. Cadell Hemphill, eds. Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.

Odden, Allan, and Carolyn Kelley
1997 Paying Teachers for What They Know and Do: New and Smarter Compensation Strategies to Improve Schools. Thousand Oaks, CA: Corwin Press.

Office for Civil Rights
2000 The Use of Tests As Part of High-Stakes Decisions for Students: A Resource Guide for Educators and Policy-Makers. [Online]. Available: http://www.ed.gov/offices/ocr/testing/. [December 2000].

Olson, Lynn
2000 States Ponder New Forms of Diploma. Education Week 19(41), June 21.

Phelps, Richard P.
1999 Why Testing Experts Hate Testing. Washington, DC: Thomas B. Fordham Foundation, January.

Porter, Andrew
2000 Doing High-Stakes Assessment Right. The School Administrator 57(11), December.

Public Agenda
2000a Reality Check 2000. Conducted in association with Education Week and funded by the Pew Charitable Trusts and the GE Fund. Education Week 19(23), February 16.

Public Agenda
2000b Understanding the Issue: Quick Takes - Education: People's Chief Concerns. [Online]. Available: http://www.publicagenda.org/issuesangles.cfm?issue_type=education. [October 25, 2000].

Robelen, Erik W.
2000 Parents Seek Civil Rights Probe of High-Stakes Tests in La. Education Week 20(06), October 11.

Roderick, Melissa, Anthony S. Bryk, Brian A. Jacob, John Q. Easton, and Elaine Allensworth
2000 Ending Social Promotion: Results From the First Two Years. Chicago, IL: Consortium on Chicago School Research, December.

Roderick, Melissa, Jenny Nagaoka, Jen Bacon, and John Q. Easton
2000 Update: Ending Social Promotion. Chicago, IL: Consortium on Chicago School Research, Charting Reform in Chicago Series: Data Brief, September.

Texas Education Agency
1999 1999 Accountability Manual: The 1999 Accountability Rating System for Texas Public Schools and School Districts and Blueprint for the 2000-2003 Accountability Systems. [Online]. Available: http://www.tea.state.tx.us/perfreport/account/99/manual/. [April, 1999].

U.S. Department of Education
1999a Digest of Education Statistics, 1998. Washington, DC: National Center for Education Statistics, U.S. Department of Education.

U.S. Department of Education
1999b Promising Results, Continuing Challenges: The Final Report of the National Assessment of Title I. Washington, DC: Office of the Under Secretary, Planning and Evaluation Service, U.S. Department of Education.


MEMORANDA OF COMMENT, RESERVATION, OR DISSENT

On the report as a whole, WILLIAM E. BROCK, with which ALAN BELZER and PATRICK W. GROSS have asked to be associated

This report is an outstanding contribution to our continuing national conversation on the subject of using assessment and accountability to improve learning.

I remain concerned, however, that these discussions have not adequately incorporated much of the modern research on learning.

Children come to schools today with a host of learning challenges. The factors range from genetic, to nutritional, to environmental, but they impose a huge burden on student and teacher alike.

It is long past time we used assessments as a diagnostic tool, allowing schools and parents to identify the cause of learning struggles and remedy them to the extent possible before the damage is done. This can be done and is being done today, but all too rarely.

On the report as a whole, ROBERT C. WAGGONER, with which JAMES Q. RIORDAN has asked to be associated

I have voted to disapprove this report because I am concerned that improved testing and assessment, and greater accountability in the existing public school system, do not address the essential failing of public K-12 education, which is systemic.

The public school system, as it is currently structured, is devoid of effective choice for all except those who can afford to opt out and pay tuition. If our biggest public policy failing is our neglect of poor children's schooling in inner cities, and if our K-12 schools in general do not measure up to the standard of international competition, we must broaden our view of solutions to encompass a redefinition of public education which includes all schools that educate the public at public expense, provided they are subject to public accountability and curricular standards.

Market-based mechanisms work in every other sphere and yet have not been tried in the delivery of our most expensive publicly-financed service, K-12 education. We have, arguably, the best university system in the world, one based completely on free choice, where there is no compulsion to attend a college or university based on residence, and yet we do not offer the same in K-12, where our failings are egregious and scandalous.

I believe CED should support a full multi-year test of a scholarship-type voucher system in several American cities, allowing parents to opt out of the present government-run schools and use stipends to attend private, parochial, or for-profit schools at their own option, provided that these schools adhere to state-approved curricular standards, do not practice discrimination, and offer open admission, with lotteries to determine entry if there is an excess of applicants over vacancies.

I believe a full test of choice will show its capacity to deliver superior K-12 education, probably at lower cost than at present. The opponents of choice are blocking even a test of the concept because they fear its success. America's children and the greater public are suffering the results.


Page 2, CAROL J. PARRY, with which FLETCHER L. BYROM has asked to be associated

The report may give the impression that testing is the most important reform or change. Understanding that the purpose of the report is to discuss testing, we have to be wary of the report being used by those that advocate testing above all other changes. I believe (as mentioned at several points in the report) that a number of changes are equally critical: enhancing the profession of teaching through appropriate compensation, on-going training and education, and changing the public perception of teachers so that they are well respected in our society; improving the physical school facilities that children attend so that they are clean, safe, and have appropriate space for students to learn; creating an environment in the public schools of discipline and respect so that teachers that want to teach and students that want to learn can do so without disruption from those that do not share these goals; involving parents in both the educational process and in governance of the schools their children attend; developing programs and mechanisms so that parents, teachers, and administrators work as a team to improve the schools and enhance a child's educational experience. I am sure there are many more.

I also question whether standardized multiple-choice tests can really test the ability to think and reason. The report recognizes that testing has its limitations. However, it also proposes that standardized tests (which at least now are primarily of the multiple-choice variety) be used to make decisions about school performance, teacher performance, and student performance and advancement. Over the years, I have been involved in many efforts to improve the public schools. The most exciting have not involved testing but creative teaching, teaching that goes way beyond things you can measure in a test: teaching children to be responsible members of society; teaching children to think on their feet; teaching children to solve problems that exist in their communities, not just in their textbooks; teaching children to appreciate art and music.

Page 4, CAROL J. PARRY, with which FLETCHER L. BYROM has asked to be associated

The report recognizes that creating the "perfect" test will be very difficult, expensive, and time consuming. If we are serious about creating excellent measurement and assessment tools that are fully comparable across states, then we need a national project, under the direction of the federal government, but with the input and cooperation of states to make this happen in a quality way. Unfortunately, this may offend those who believe in local control of schools.

Page 23, CAROL J. PARRY, with which FLETCHER L. BYROM has asked to be associated

In recommending that standardized tests be used to determine if students get promoted, we risk creating a self-fulfilling prophecy and a permanent underclass of students. When schools and teachers are evaluated based on test results, they are motivated not only to improve the quality of the schools so that children will get better scores, but also to weed out those that cannot make it on these tests so that overall school scores will rise. The report does recommend that individual improvement in student scores be part of the evaluation of the school and the teacher, but it also suggests that students that cannot make it to the next level may drop out, only exacerbating the problem we have today. The report does provide suggestions on how to assist these students, but these solutions take resources, specially trained teachers, and special teaching environments. It is naïve of us to think that these opportunities for weaker students will exist everywhere. It is important that the evaluation of a school or district should not only depend on the test scores but also on the school's ability to prevent dropouts.

Page 28, CAROL J. PARRY, with which FLETCHER L. BYROM has asked to be associated

If we tie teacher evaluations and compensation as well as funding for the school to test results, teachers will feel they must teach the test. The report suggests that changing the test every year, not giving out questions in advance, etc. can prevent this. But just as there are very successful tutoring programs for SATs (tests that are supposedly kept top secret), schools and teachers will develop ways to teach what they presume will be on the tests, based on the curriculum and established local or state standards.


OBJECTIVES OF THE COMMITTEE FOR ECONOMIC DEVELOPMENT

For more than 50 years, the Committee for Economic Development has been a respected influence on the formation of business and public policy. CED is devoted to these two objectives:

To develop, through objective research and informed discussion, findings and recommendations for private and public policy that will contribute to preserving and strengthening our free society, achieving steady economic growth at high employment and reasonably stable prices, increasing productivity and living standards, providing greater and more equal opportunity for every citizen, and improving the quality of life for all.

To bring about increasing understanding by present and future leaders in business, government, and education, and among concerned citizens, of the importance of these objectives and the ways in which they can be achieved.

CED’s work is supported by private voluntary contributions from business and industry, foundations, and individuals. It is independent, nonprofit, nonpartisan, and nonpolitical.

Through this business-academic partnership, CED endeavors to develop policy statements and other research materials that commend themselves as guides to public and business policy; that can be used as texts in college economics and political science courses and in management training courses; that will be considered and discussed by newspaper and magazine editors, columnists, and commentators; and that are distributed abroad to promote better understanding of the American economic system.

CED believes that by enabling business leaders to demonstrate constructively their concern for the general welfare, it is helping business to earn and maintain the national and community respect essential to the successful functioning of the free enterprise capitalist system.



Chairman
FRANK P. DOYLE, Retired Executive Vice President, GE

Vice Chairmen
ROY J. BOSTOCK, Chairman, B/com3 Group, Inc.
JOHN H. BRYAN, Chairman of the Board, Sara Lee Corporation
DONALD R. CALDWELL, Chairman and Chief Executive Officer, Cross Atlantic Capital Partners
RAYMOND V. GILMARTIN, Chairman, President and Chief Executive Officer, Merck & Co., Inc.

Treasurer
REGINA DOLAN, Executive Vice President, Chief Financial Officer and Chief Administrative Officer, PaineWebber Group, Inc.

ROGER G. ACKERMAN, Chairman and ChiefExecutive Officer

Corning IncorporatedREX D. ADAMS, DeanThe Fuqua School of BusinessDuke UniversityPAUL A. ALLAIRE, ChairmanXerox CorporationIAN ARNOF, Retired ChairmanBank One, Louisiana, N.A.JAMES S. BEARD, PresidentCaterpillar Financial Services Corp.HENRY P. BECTON, JR., President and

General ManagerWGBH Educational FoundationTHOMAS D. BELL, JR., Special Limited PartnerForstmann Little & Co.ALAN BELZER, Retired President and Chief

Operating OfficerAlliedSignal Inc.PETER A. BENOLIEL, Chairman, Executive CommitteeQuaker Chemical CorporationMELVYN E. BERGSTEIN, Chairman and Chief

Executive OfficerDiamond Cluster International, Inc.DEREK BOK, President EmeritusHarvard UniversityJON A. BOSCIA, President and Chief Executive OfficerLincoln National CorporationROY J. BOSTOCK, ChairmanB/com3 Group, Inc.

CED BOARD OF TRUSTEES

JOHN BRADEMAS, President EmeritusNew York UniversityWILLIAM E. BROCK, ChairmanBridges LearningSystems, Inc.STEPHEN L. BROWN, Chairman and Chief

Executive OfficerJohn Hancock Mutual Life Insurance CompanyJOHN H. BRYAN, Chairman of the BoardSara Lee CorporationTHOMAS J. BUCKHOLTZ, Executive Vice PresidentBeyond Insight CorporationMICHAEL BUNGEY, Chairman and Chief

Executive OfficerBates WorldwideW. VAN BUSSMANN, Corporate EconomistDaimlerChrysler CorporationFLETCHER L. BYROM, President and Chief

Executive OfficerMICASU CorporationDONALD R. CALDWELL, Chairman and Chief

Executive OfficerCross Atlantic Capital PartnersFRANK C. CARLUCCIWashington, D.C.MARSHALL N. CARTER, Retired Chairman and Chief

Executive OfficerState Street CorporationROBERT B. CATELL, Chairman and Chief

Executive OfficerKeySpan CorporationJOHN B. CAVE, PrincipalAvenir Group, Inc.JOHN S. CHALSTY, Senior AdvisorCredit Suisse First BostonRAYMOND G. CHAMBERS, Chairman of the BoardAmelior FoundationMARY ANN CHAMPLIN, Retired Senior Vice

PresidentAetna Inc.CLARENCE J. CHANDRAN, Chief Operating OfficerNortel Networks CorporationROBERT CHESS, ChairmanInhale Therapeutic Systems, Inc.CAROLYN CHIN, Chief Executive OfficerCebizJOHN L. CLENDENIN, Retired ChairmanBellSouth CorporationFERDINAND COLLOREDO-MANSFELD, Chairman

and Chief Executive OfficerCabot Industrial TrustGEORGE H. CONRADES, Chairman and Chief

Executive OfficerAkamai Technologies, Inc.KATHLEEN B. COOPER, Chief Economist and

Manager, Economics & Energy DivisionExxon Mobil CorporationJAMES P. CORCORAN, Executive Vice President,

Government & Industry RelationsAmerican General Corporation


GARY L. COUNTRYMAN, Chairman EmeritusLiberty Mutual Insurance CompanySTEPHEN A. CRANE, Chairman, President and Chief

Executive OfficerStirling Cooke Brown Holdings LimitedRONALD R. DAVENPORT, Chairman of the BoardSheridan Broadcasting CorporationJOHN T. DEE, Chairman and Chief

Executive OfficerVolume Services AmericaROBERT M. DEVLIN, Chairman and Chief

Executive OfficerAmerican General CorporationJOHN DIEBOLD, ChairmanJohn Diebold IncorporatedREGINA DOLAN, Executive Vice President, Chief

Financial Officer and Chief Administrative OfficerPaineWebber Group, Inc.IRWIN DORROS, PresidentDorros AssociatesFRANK P. DOYLE, Retired Executive Vice PresidentGEE. LINN DRAPER, JR., Chairman, President and Chief

Executive OfficerAmerican Electric Power CompanyT. J. DERMOT DUNPHY, Retired ChairmanSealed Air CorporationCHRISTOPHER D. EARL, Managing DirectorPerseus Capital, LLCW. D. EBERLE, ChairmanManchester Associates, Ltd.JAMES D. ERICSON, Chairman and Chief

Executive OfficerThe Northwestern Mutual Financial NetworkWILLIAM T. ESREY, Chairman and Chief

Executive OfficerSprintKATHLEEN FELDSTEIN, PresidentEconomics Studies, Inc.RONALD E. FERGUSON, Chairman, President and

Chief Executive OfficerGeneral RE CorporationE. JAMES FERLAND, Chairman, President and Chief

Executive OfficerPublic Service Enterprise Group Inc.EDMUND B. FITZGERALD, Managing DirectorWoodmont AssociatesHARRY L. FREEMAN, ChairThe Mark Twain InstituteMITCHELL S. FROMSTEIN, Chairman EmeritusManpower Inc.JOSEPH GANTZ, PartnerGG Capital, LLCE. GORDON GEE, ChancellorVanderbilt UniversityTHOMAS P. GERRITY, Dean EmeritusThe Wharton SchoolUniversity of PennsylvaniaRAYMOND V. GILMARTIN, Chairman, President and

Chief Executive OfficerMerck & Co., Inc.

FREDERICK W. GLUCK, Of CounselMcKinsey & Company, Inc.CAROL R. GOLDBERG, PresidentThe Avcar Group, Ltd.ALFRED G. GOLDSTEIN, President and Chief

Executive OfficerAG AssociatesJOSEPH T. GORMAN, Chairman and Chief

Executive OfficerTRW Inc.RICHARD A. GRASSO, Chairman and Chief

Executive OfficerNew York Stock Exchange, Inc.EARL G. GRAVES, SR., Publisher and Chief

Executive OfficerBlack Enterprise MagazineWILLIAM H. GRAY, III, President and Chief

Executive OfficerThe College FundROSEMARIE B. GRECO, PrincipalGreco VenturesGERALD GREENWALD, Chairman EmeritusUAL CorporationBARBARA B. GROGAN, PresidentWestern Industrial ContractorsPATRICK W. GROSS, Founder and Chairman,

Executive CommitteeAmerican Management Systems, Inc.JEROME GROSSMAN, Chairman and Chief

Executive OfficerLion Gate Management CorporationRONALD GRZYWINSKI, Chairman and Chief

Executive OfficerShorebank CorporationJUDITH H. HAMILTON, President and Chief

Executive OfficerClassroom ConnectWILLIAM A. HASELTINE, Chairman and Chief

Executive OfficerHuman Genome Sciences, Inc.WILLIAM F. HECHT, Chairman, President and Chief

Executive OfficerPPL CorporationJOSEPH D. HICKS, Retired President and Chief

Executive OfficerSiecor CorporationHEATHER HIGGINS, PresidentRandolph FoundationSTEVEN R. HILL, Senior Vice President,

Human ResourcesWeyerhaeuser CompanyRODERICK M. HILLS, PresidentHills Enterprises, Ltd.HAYNE HIPP, President and Chief Executive OfficerThe Liberty CorporationRONALD N. HOGE, Former President and Chief

Executive OfficerMagneTek, Inc.DEBORAH C. HOPKINS, Executive Vice President

and Chief Financial OfficerLucent Technologies Inc.


MATINA S. HORNER, Executive Vice PresidentTIAA-CREFPHILIP K. HOWARD, Vice ChairmanCovington & BurlingROBERT J. HURST, Vice ChairmanGoldman, Sachs & Co.ALICE STONE ILCHMANBronxville, New YorkWILLIAM C. JENNINGS, Chief Executive OfficerUS Interactive, Inc.JEFFREY A. JOERRES, President and Chief

Executive OfficerManpower Inc.JAMES A. JOHNSON, Chairman and Chief

Executive OfficerJohnson Capital PartnersROBERT M. JOHNSON, Chairman and Chief

Executive OfficerBowne & Co., Inc.H.V. JONES, Office Managing DirectorKorn/Ferry International, Inc.PRES KABACOFF, President and Co-ChairmanHistoric Restoration, Inc.JOSEPH J. KAMINSKI, Corporate Executive

Vice PresidentAir Products and Chemicals, Inc.EDWARD A. KANGAS, Chairman, RetiredDeloitte Touche TohmatsuJOSEPH E. KASPUTYS, Chairman, President and

Chief Executive OfficerThomson Financial PrimarkJAMES P. KELLY, Chairman and Chief

Executive OfficerUnited Parcel Service Inc.THOMAS J. KLUTZNICK, PresidentThomas J. Klutznick CompanyCHARLES F. KNIGHT, Chairman and Chief

Executive OfficerEmerson Electric Co.CHARLES E.M. KOLB, PresidentCommittee for Economic DevelopmentALLEN J. KROWE, Retired Vice ChairmanTexaco Inc.C. JOSEPH LABONTE, ChairmanThe Vantage GroupKURT M. LANDGRAF, President and Chief

Executive OfficerEducational Testing ServiceJ. TERRENCE LANNI, Chairman of the BoardMGM MirageCHARLES R. LEE, Chairman and Co-Chief

Executive OfficerVerizon CommunicationsROBERT H. LESSIN, ChairmanWit Soundview Group, Inc.WILLIAM W. LEWIS, DirectorMcKinsey Global InstituteMcKinsey & Company, Inc.IRA A. LIPMAN, Chairman of the Board and PresidentGuardsmark, Inc.

MICHAEL D. LOCKHART, Chairman and ChiefExecutive Officer

Armstrong Holdings, Inc.JOSEPH T. LYNAUGH, Retired President and Chief

Executive OfficerNYLCare Health Plans, Inc.BRUCE K. MACLAURY, President EmeritusThe Brookings InstitutionCOLETTE MAHONEY, RSHM, President EmeritusMarymount Manhattan CollegeELLEN R. MARRAM, PartnerNorth Castle PartnersALONZO L. MCDONALD, Chairman and Chief

Executive OfficerAvenir Group, Inc.EUGENE R. MCGRATH, Chairman, President and

Chief Executive OfficerConsolidated Edison Company of New York, Inc.DAVID E. MCKINNEY, PresidentThe Metropolitan Museum of ArtDEBORAH HICKS MIDANEK, PrincipalGlass & AssociatesHARVEY R. MILLER, Senior PartnerWeil, Gotshal & MangesNICHOLAS G. MOORE, ChairmanPricewaterhouseCoopersDIANA S. NATALICIO, PresidentThe University of Texas at El PasoMARILYN CARLSON NELSON, Chairman, President

and Chief Executive OfficerCarlson Companies, Inc.MATTHEW NIMETZ, PartnerGeneral Atlantic PartnersTHOMAS H. O’BRIEN, Chairman of the BoardPNC Financial Services Group, Inc.LEO J. O’DONOVAN, S.J., PresidentGeorgetown UniversityDEAN R. O’HARE, Chairman and Chief

Executive OfficerChubb CorporationJOHN D. ONG, Chairman EmeritusThe BFGoodrich CompanyROBERT J. O'TOOLE, Chairman and Chief

Executive OfficerA.O. Smith CorporationSTEFFEN E. PALKO, Vice Chairman and PresidentCross Timbers Oil CompanySANDRA PANEM, PartnerCross Atlantic Partners, Inc.CAROL J. PARRY, Retired Executive Vice PresidentThe Chase Manhattan CorporationVICTOR A. PELSON, Senior Advisor Warburg Dillon Read LLCDONALD K. PETERSON, President and Chief

Executive OfficerAvaya Inc.PETER G. PETERSON, ChairmanThe Blackstone GroupRAYMOND PLANK, Chairman and Chief

Executive OfficerApache Corporation


ARNOLD B. POLLARD, President and ChiefExecutive Officer

The Chief Executive GroupS. LAWRENCE PRENDERGAST, Executive Vice

President of FinanceLaBranche & Co.HUGH B. PRICE, President and Chief

Executive OfficerNational Urban LeagueNED REGAN, PresidentBaruch CollegeWILLIAM R. RHODES, Vice ChairmanCitigroup Inc.JAMES Q. RIORDANQuentin Partners Inc.E. B. ROBINSON, JR., Chairman EmeritusDeposit Guaranty CorporationROY ROMERFormer Governor of ColoradoSuperintendent, Los Angeles Unified School DistrictDANIEL ROSE, ChairmanRose Associates, Inc.HOWARD M. ROSENKRANTZ, Chief Executive OfficerGrey Flannel AuctionsMICHAEL I. ROTH, Chairman and Chief

Executive OfficerThe MONY Group Inc.LANDON H. ROWLAND, Chairman, President and

Chief Executive OfficerStillwell Financial Inc.NEIL L. RUDENSTINE, PresidentHarvard UniversityWILLIAM RUDER, PresidentWilliam Ruder IncorporatedGEORGE RUPP, PresidentColumbia UniversityGEORGE F. RUSSELL, JR., ChairmanFrank Russell CompanyEDWARD B. RUST, JR., Chairman and Chief

Executive OfficerState Farm Insurance CompaniesARTHUR F. RYAN, Chairman and Chief

Executive OfficerThe Prudential Insurance Company of AmericaMARGUERITE W. SALLEE, Chairman and Chief

Executive OfficerFrontline GroupSTEPHEN W. SANGER, Chairman and Chief

Executive OfficerGeneral Mills, Inc.HENRY B. SCHACHT, Director and Senior AdvisorE.M. Warburg, Pincus & Co. LLCMICHAEL M. SEARS, Senior Vice President and

Chief Financial OfficerThe Boeing CompanyWALTER H. SHORENSTEIN, Chairman of the BoardThe Shorenstein CompanyGEORGE P. SHULTZ, Distinguished FellowThe Hoover InstitutionStanford University

JOHN C. SICILIANOSan Marino, CaliforniaRUTH J. SIMMONS, PresidentSmith CollegeRONALD L. SKATES, Former President and Chief

Executive OfficerData General CorporationFREDERICK W. SMITH, Chairman, President and

Chief Executive OfficerFederal Express CorporationHUGO FREUND SONNENSCHEIN, President EmeritusThe University of ChicagoALAN G. SPOON, Managing General PartnerPolaris VenturesJOHN R. STAFFORD, Chairman, President and

Chief Executive OfficerAmerican Home Products CorporationSTEPHEN STAMAS, ChairmanThe American AssemblyJOHN L. STEFFENS, Vice ChairmanMerrill Lynch & Co., Inc.PAULA STERN, PresidentThe Stern Group, Inc.DONALD M. STEWART, President and Chief

Executive OfficerChicago Community TrustROGER W. STONE, Chairman and Chief

Executive OfficerBox USA Group, Inc.MATTHEW J. STOVER, Presidentedu.comRICHARD J. SWIFT, Chairman, President and Chief

Executive OfficerFoster Wheeler CorporationRICHARD F. SYRON, President and Chief

Executive OfficerThermo Electron CorporationHENRY TANG, ChairmanCommittee of 100FREDERICK W. TELLING, Vice President Corporate

Strategic Planning and Policy DivisionPfizer Inc.RICHARD L. THOMAS, Retired ChairmanFirst Chicago NBD CorporationJAMES A. THOMSON, President and Chief

Executive OfficerRANDCHANG-LIN TIEN, NEC Distinguished Professor of

EngineeringUniversity of California, BerkeleyTHOMAS J. TIERNEY, Director (former Chief Executive)Bain & CompanySTOKLEY P. TOWLES, PartnerBrown Brothers Harriman & Co.ALAIR A. TOWNSEND, PublisherCrain’s New York BusinessSTEPHEN JOEL TRACHTENBERG, PresidentGeorge Washington UniversityJAMES L. VINCENT, Chairman and Chief

Executive OfficerBiogen, Inc.


ROBERT C. WAGGONER, Chief Executive OfficerBurrelle’s Information Services LLCDONALD C. WAITE, III, DirectorMcKinsey & Company, Inc.HERMINE WARREN, PresidentHermine Warren Associates, Inc.ARNOLD R. WEBER, President EmeritusNorthwestern UniversityJOHN F. WELCH, JR., Chairman and Chief

Executive OfficerGEJOSH S. WESTON, Honorary ChairmanAutomatic Data Processing, Inc.CLIFTON R. WHARTON, JR., Former Chairman

and Chief Executive OfficerTIAA-CREFDOLORES D. WHARTON, Chairman and Chief

Executive OfficerThe Fund for Corporate Initiatives, Inc.MICHAEL W. WICKHAM, Chairman and Chief

Executive OfficerRoadway Express, Inc.HAROLD M. WILLIAMS, Of CounselSkadden Arps Slate Meagher & Flom LLPJ. KELLEY WILLIAMS, Chairman and Chief

Executive OfficerChemFirst Inc.LINDA SMITH WILSON, President EmeritaRadcliffe CollegeMARGARET S. WILSON, Chairman and Chief

Executive OfficerScarbroughsKURT E. YEAGER, President and Chief Executive

OfficerElectric Power Research InstituteMARTIN B. ZIMMERMAN, Vice President,

Governmental AffairsFord Motor Company


RAY C. ADAM, Retired ChairmanNL IndustriesO. KELLEY ANDERSONBoston, MassachusettsROBERT O. ANDERSON, Retired ChairmanHondo Oil & Gas CompanyROY L. ASHLos Angeles, CaliforniaSANFORD S. ATWOOD, President EmeritusEmory UniversityROBERT H. B. BALDWIN, Retired ChairmanMorgan Stanley Group Inc.GEORGE F. BENNETT, Chairman EmeritusState Street Investment TrustHAROLD H. BENNETTSalt Lake City, UtahJACK F. BENNETT, Retired Senior Vice PresidentExxon CorporationHOWARD W. BLAUVELTKeswick, VirginiaMARVIN BOWERDelray Beach, FloridaALAN S. BOYDLady Lake, FloridaANDREW F. BRIMMER, PresidentBrimmer & Company, Inc.HARRY G. BUBB, Chairman EmeritusPacific Mutual Life InsuranceTHEODORE A. BURTIS, Retired Chairman

of the BoardSun Company, Inc.PHILIP CALDWELL, Chairman (Retired)Ford Motor CompanyEVERETT N. CASECooperstown, New YorkHUGH M. CHAPMAN, Retired ChairmanNationsBank SouthE. H. CLARK, JR., Chairman and Chief

Executive OfficerThe Friendship GroupA.W. CLAUSEN, Retired Chairman and Chief

Executive OfficerBankAmerica CorporationDOUGLAS D. DANFORTHExecutive AssociatesJOHN H. DANIELS, Retired Chairman and

Chief Executive OfficerArcher-Daniels Midland Co.RALPH P. DAVIDSONWashington, D.C.ALFRED C. DECRANE, JR., Retired Chairman and

Chief Executive OfficerTexaco, Inc.ROBERT R. DOCKSON, Chairman EmeritusCalFed, Inc.LYLE EVERINGHAM, Retired ChairmanThe Kroger Co.THOMAS J. EYERMAN, Retired PartnerSkidmore, Owings & MerrillJOHN T. FEYPark City, Utah

CED HONORARY TRUSTEES

JOHN M. FOXSapphire, North CarolinaDON C. FRISBEE, Chairman EmeritusPacifiCorpRICHARD L. GELB, Chairman EmeritusBristol-Myers Squibb CompanyW. H. KROME GEORGE, Retired ChairmanALCOAWALTER B. GERKEN, Chairman, Executive CommitteePacific Life Insurance CompanyPAUL S. GEROTDelray Beach, FloridaLINCOLN GORDON, Guest ScholarThe Brookings InstitutionKATHARINE GRAHAM, Chairman of the

Executive CommitteeThe Washington Post CompanyJOHN D. GRAY, Chairman EmeritusHartmarx CorporationJOHN R. HALL, Retired ChairmanAshland Inc.RICHARD W. HANSELMAN, ChairmanHealth Net Inc.ROBERT A. HANSON, Retired ChairmanDeere & CompanyROBERT S. HATFIELD, Retired ChairmanThe Continental Group, Inc.ARTHUR HAUSPURG, Member, Board of TrusteesConsolidated Edison Company of New York, Inc.PHILIP M. HAWLEY, Retired Chairman of the BoardCarter Hawley Hale Stores, Inc.ROBERT C. HOLLAND, Senior FellowThe Wharton SchoolUniversity of PennsylvaniaLEON C. HOLT, JR., Retired Vice ChairmanAir Products and Chemicals, Inc.SOL HURWITZ, Retired PresidentCommittee for Economic DevelopmentGEORGE F. JAMESPonte Vedra Beach, FloridaDAVID KEARNS, Chairman EmeritusNew American SchoolsGEORGE M. KELLER, Chairman of the Board, RetiredChevron CorporationFRANKLIN A. LINDSAY, Retired ChairmanItek CorporationROY G. LUCKSSan Francisco, CaliforniaROBERT W. LUNDEEN, Retired ChairmanThe Dow Chemical CompanyIAN MACGREGOR, Retired ChairmanAMAX Inc.RICHARD B. MADDEN, Retired Chairman and

Chief Executive OfficerPotlatch CorporationFRANK L. MAGEEStahlstown, PennsylvaniaSTANLEY MARCUS, ConsultantStanley Marcus ConsultancyAUGUSTINE R. MARUSILake Wales, Florida


JAMES J. RENIERRenier & AssociatesIAN M. ROLLAND, Former Chairman and Chief

Executive OfficerLincoln National CorporationAXEL G. ROSIN, Retired ChairmanBook-of-the-Month Club, Inc.WILLIAM M. ROTHPrinceton, New JerseyRALPH S. SAUL, Former Chairman of the BoardCIGNA CompaniesGEORGE A. SCHAEFER, Retired Chairman of the BoardCaterpillar, Inc.ROBERT G. SCHWARTZNew York, New YorkMARK SHEPHERD, JR., Retired ChairmanTexas Instruments, Inc.ROCCO C. SICILIANOBeverly Hills, CaliforniaDAVIDSON SOMMERSWashington, D.C.ELMER B. STAATS, Former Controller

General of the United StatesFRANK STANTON, Former PresidentCBS, Inc.EDGAR B. STERN, JR., Chairman of the BoardRoyal Street CorporationALEXANDER L. STOTTFairfield, ConnecticutWAYNE E. THOMPSON, Past ChairmanMerritt Peralta Medical CenterCHARLES C. TILLINGHAST, JR.Providence, Rhode IslandHOWARD S. TURNER, Retired ChairmanTurner Construction CompanyL. S. TURNER, JR.Dallas, TexasTHOMAS A. VANDERSLICETAV AssociatesJAMES E. WEBBWashington, D.C.SIDNEY J. WEINBERG, JR., Senior PartnerThe Goldman Sachs Group, L.P.ROBERT C. WINTERS, Chairman EmeritusPrudential Insurance Company of AmericaARTHUR M. WOODChicago, IllinoisRICHARD D. WOOD, DirectorEli Lilly and CompanyWILLIAM S. WOODSIDE, Retired ChairmanLSG Sky ChefsCHARLES J. ZWICKCoral Gables, Florida

WILLIAM F. MAY, Chairman and ChiefExecutive Officer

Statue of Liberty-Ellis Island Foundation, Inc.OSCAR G. MAYER, Retired ChairmanOscar Mayer & Co.GEORGE C. MCGHEE, Former U.S. Ambassador

and Under Secretary of StateJOHN F. MCGILLICUDDY, Retired Chairman

and Chief Executive OfficerChemical Banking CorporationJAMES W. MCKEE, JR., Retired ChairmanCPC International, Inc.CHAMPNEY A. MCNAIR, Retired Vice ChairmanTrust Company of GeorgiaJ. W. MCSWINEY, Retired Chairman of the BoardThe Mead CorporationROBERT E. MERCER, Retired ChairmanThe Goodyear Tire & Rubber Co.RUBEN F. METTLER, Retired Chairman and

Chief Executive OfficerTRW Inc.LEE L. MORGAN, Former Chairman of the BoardCaterpillar, Inc.ROBERT R. NATHAN, ChairmanNathan Associates, Inc.J. WILSON NEWMAN, Retired ChairmanDun & Bradstreet CorporationJAMES J. O’CONNOR, Former Chairman and Chief

Executive OfficerUnicom CorporationLEIF H. OLSEN, PresidentLHO GroupNORMA PACE, PresidentPaper Analytics AssociatesCHARLES W. PARRY, Retired ChairmanALCOAWILLIAM R. PEARCE, DirectorAmerican Express Mutual FundsJOHN H. PERKINS, Former PresidentContinental Illinois National Bank and Trust CompanyRUDOLPH A. PETERSON, President and Chief

Executive Officer (Emeritus)BankAmerica CorporationDEAN P. PHYPERSNew Canaan, ConnecticutEDMUND T. PRATT, JR., Retired Chairman and

Chief Executive OfficerPfizer Inc.ROBERT M. PRICE, Retired Chairman and

Chief Executive OfficerControl Data CorporationR. STEWART RAUCH, Former ChairmanThe Philadelphia Savings Fund Society


CED PROFESSIONAL AND ADMINISTRATIVE STAFF

CHARLES E.M. KOLB, President

Communications/Government Relations
CLAUDIA P. FEUREY, Vice President for Communications and Corporate Affairs
MICHAEL J. PETRO, Vice President and Director of Business and Government Policy and Chief of Staff
CHRIS DREIBELBIS, Business and Government Policy Associate
JIM KAPSIS, Public Affairs Associate
VALERIE MENDELSOHN, Conference Manager and Secretary of the Research and Policy Committee
KIMBERLY MOORE, Public Affairs and Development Assistant
ROBIN SAMERS, Public Affairs Associate
HELENA ZYBLIKEWYCZ, Manager of Business and Public Affairs

Development
MARTHA HOULE, Vice President for Development and Secretary of the Board of Trustees
GLORIA Y. CALHOUN, Development Assistant
SANDRA L. MARTINEZ, Development Associate
RICHARD M. RODERO, Assistant Director of Development

Finance and Administration
CARY DAVIS, Director of Finance and Administration
KAREN CASTRO, Accounting Manager
PETER E. COX, Operations Manager
SHARON A. FOWKES, Executive Assistant to the President
ARLENE M. MURPHY, Executive Assistant to the President and Office Manager
AMANDA TURNER, Office Manager

Research
VAN DOORN OOMS, Senior Vice President and Director of Research
JANET HANSEN, Vice President and Director of Education Studies
ELLIOT SCHWARTZ, Vice President and Director of Economic Studies
TAREK ANANDAN, Research Associate
MICHAEL BERG, Research Associate
SETH TUROFF, Research Associate

Advisor on International Economic Policy
ISAIAH FRANK, William L. Clayton Professor of International Economics, The Johns Hopkins University

CED RESEARCH ADVISORY BOARD

Chairman
JOHN P. WHITE, Professor, Harvard University, John F. Kennedy School of Government

JAGDISH BHAGWATI, Arthur Lehman Professor of Economics, Columbia University
RALPH D. CHRISTY, J. Thomas Clark Professor, Department of Agricultural, Resource, and Managerial Economics, Cornell University
JOHN COGAN, Senior Fellow, The Hoover Institution, Stanford University
ALAIN C. ENTHOVEN, Marriner S. Eccles Professor of Public and Private Management, Stanford University, Graduate School of Business
RONALD F. FERGUSON, Professor, Harvard University, John F. Kennedy School of Government
BENJAMIN M. FRIEDMAN, William Joseph Maier Professor of Political Economy, Harvard University
ROBERT W. HAHN, Resident Scholar, American Enterprise Institute
HELEN F. LADD, Professor of Public Policy Studies and Economics, Sanford Institute of Public Policy, Duke University
LINDA YUEN-CHING LIM, Professor, University of Michigan Business School
ROBERT LITAN, Vice President, Director of Economic Studies, The Brookings Institution
PAUL ROMER, STANCO 25 Professor of Economics, Stanford University, Graduate School of Business
CECILIA E. ROUSE, Associate Professor of Economics and Public Affairs, Princeton University, Woodrow Wilson School
DAVID WESSEL, Assistant Bureau Chief/Columnist, The Wall Street Journal


*Statements issued in association with CED counterpart organizations in foreign countries.

STATEMENTS ON NATIONAL POLICY ISSUED BY THE COMMITTEE FOR ECONOMIC DEVELOPMENT

SELECTED PUBLICATIONS:

Improving Global Financial Stability (2000)
The Case for Permanent Normal Trade Relations with China (2000)
Welfare Reform and Beyond: Making Work Work (2000)
Breaking the Litigation Habit: Economic Incentives for Legal Reform (2000)
New Opportunities for Older Workers (1999)
Investing in the People's Business: A Business Proposal for Campaign Finance Reform (1999)
The Employer’s Role in Linking School and Work (1998)
Employer Roles in Linking School and Work: Lessons from Four Urban Communities (1998)
America’s Basic Research: Prosperity Through Discovery (1998)
Modernizing Government Regulation: The Need For Action (1998)
U.S. Economic Policy Toward The Asia-Pacific Region (1997)
Connecting Inner-City Youth To The World of Work (1997)
Fixing Social Security (1997)
Growth With Opportunity (1997)
American Workers and Economic Change (1996)
Connecting Students to a Changing World: A Technology Strategy for Improving Mathematics and Science Education (1995)
Cut Spending First: Tax Cuts Should Be Deferred to Ensure a Balanced Budget (1995)
Rebuilding Inner-City Communities: A New Approach to the Nation’s Urban Crisis (1995)
Who Will Pay For Your Retirement? The Looming Crisis (1995)
Putting Learning First: Governing and Managing the Schools for High Achievement (1994)
Prescription for Progress: The Uruguay Round in the New Global Economy (1994)
*From Promise to Progress: Towards a New Stage in U.S.-Japan Economic Relations (1994)
U.S. Trade Policy Beyond The Uruguay Round (1994)
In Our Best Interest: NAFTA and the New American Economy (1993)
What Price Clean Air? A Market Approach to Energy and Environmental Policy (1993)
Why Child Care Matters: Preparing Young Children For A More Productive America (1993)
Restoring Prosperity: Budget Choices for Economic Growth (1992)
The United States in the New Global Economy: A Rallier of Nations (1992)
The Economy and National Defense: Adjusting to Cutbacks in the Post-Cold War Era (1991)
Politics, Tax Cuts and the Peace Dividend (1991)
The Unfinished Agenda: A New Vision for Child Development and Education (1991)
Foreign Investment in the United States: What Does It Signal? (1990)
An America That Works: The Life-Cycle Approach to a Competitive Work Force (1990)
Breaking New Ground in U.S. Trade Policy (1990)


Battling America's Budget Deficits (1989)
*Strengthening U.S.-Japan Economic Relations (1989)
Who Should Be Liable? A Guide to Policy for Dealing with Risk (1989)
Investing in America's Future: Challenges and Opportunities for Public Sector Economic Policies (1988)
Children in Need: Investment Strategies for the Educationally Disadvantaged (1987)
Finance and Third World Economic Growth (1987)
Reforming Health Care: A Market Prescription (1987)
Work and Change: Labor Market Adjustment Policies in a Competitive World (1987)
Leadership for Dynamic State Economies (1986)
Investing in Our Children: Business and the Public Schools (1985)
Fighting Federal Deficits: The Time for Hard Choices (1985)
Strategy for U.S. Industrial Competitiveness (1984)
Productivity Policy: Key to the Nation's Economic Future (1983)
Energy Prices and Public Policy (1982)
Public-Private Partnership: An Opportunity for Urban Communities (1982)
Reforming Retirement Policies (1981)
Transnational Corporations and Developing Countries: New Policies for a Changing World Economy (1981)
Stimulating Technological Progress (1980)
Redefining Government's Role in the Market System (1979)
Jobs for the Hard-to-Employ: New Directions for a Public-Private Partnership (1978)


CED COUNTERPART ORGANIZATIONS

Close relations exist between the Committee for Economic Development and independent, nonpolitical research organizations in other countries. Such counterpart groups are composed of business executives and scholars and have objectives similar to those of CED, which they pursue by similarly objective methods. CED cooperates with these organizations on research and study projects of common interest to the various countries concerned. This program has resulted in a number of joint policy statements involving such international matters as energy, East-West trade, assistance to developing countries, and the reduction of nontariff barriers to trade.

CE: Circulo de Empresarios, Madrid, Spain
CEDA: Committee for Economic Development of Australia, Sydney, Australia
EVA: Centre for Finnish Business and Policy Studies, Helsinki, Finland
FAE: Forum de Administradores de Empresas, Lisbon, Portugal
FDE: Belgian Enterprise Foundation, Brussels, Belgium
IDEP: Institut de l’Entreprise, Paris, France
IW: Institut der Deutschen Wirtschaft, Cologne, Germany
Keizai Doyukai, Tokyo, Japan
SMO: Stichting Maatschappij en Onderneming, The Netherlands
SNS: Studieförbundet Naringsliv och Samhälle, Stockholm, Sweden