Adventures in Assessment

Spring 2004, Volume Sixteen

Editor: Carey Reid    Design: Marina Blanter


Adventures in Assessment is published by SABES, the System for Adult Basic Education Support. With funding from the Massachusetts Department of Education, Adult and Community Learning Services, SABES promotes high quality adult basic education services through training, support, and resources that improve the skills and knowledge of practitioners and strengthen programs. Our approach to staff development encourages participation, reflection, and innovation. SABES serves Massachusetts adult education staff and programs through five Regional Support Centers and a Central Resource Center.

SABES maintains past issues of Adventures in Assessment online at http://www.sabes.org/resources/adventures/, where readers can download any of the following:

• Vol. 15 (Spring 2003): Assessment in Motion

• Vol. 14 (Spring 2002): Examining Performance

• Vol. 13 (Spring 2001): Meeting the Accountability Challenge

• Vol. 12 (Winter 2000): Experiences with Standards-Based Reform

• Vol. 11 (Winter 1998): Aspects, Levels, and Perspectives

• Vol. 10 (Winter 1997): Time to Reflect

• Vol. 9 (Winter 1996): Rethinking Assessment

• Vol. 8 (Winter 1995): Meeting the Challenge of Authentic Assessment

• Vol. 7 (December 1994): The Partnership Project

• Vol. 6 (April 1994): Responding to the Dream Conference

• Vol. 5 (October 1993): The Tale of the Tools

• Vol. 4 (April 1993): Goal-setting, tape journals, math, and other topics

• Vol. 3 (November 1992): Looking Back, Starting Again: Tools and procedures used at the end of a cycle or term, including self, class, and group evaluation by both teachers and learners.

• Vol. 2 (May 1992): Ongoing: Tools for ongoing assessment as part of the learning process.

• Vol. 1 (November 1991): Getting Started: Start-up and intake activities.

Opinions expressed in Adventures in Assessment are those of the authors and not necessarily the opinions of SABES or its funders. Permission is granted to reproduce portions of this journal; we request, however, that appropriate credit be given to the authors and to Adventures in Assessment.

Adventures in Assessment is free to Massachusetts DOE-funded programs; out-of-state requests will be charged a nominal fee. Please email requests to [email protected].


Welcome

BY CAREY REID

In the Fall of 2003, I began planning out this issue with Marie Cora, the former Staff Development Specialist for Assessment here at World Education. Marie and I agreed that we wanted this issue to keep very close to adult basic education classrooms.

The articles that we sought from the fine writers who have since contributed reflect that intention. You’ll find articles about integrating student goal setting into classes, using data to improve programs, and using the Arlington Education & Employment Program (REEP) writing assessment to inform instruction. You’ll also find two similar articles that explain basic elements of standardized testing in laypersons’ language. One of these articles has been specifically designed for teachers to use with their students, both to educate them about the many standardized tests that pervade our culture and to help them prepare themselves more effectively for test taking.

We hope you find these articles useful.

My best,

P. Carey Reid
Staff Development Specialist
System for Adult Basic Education Support


Table of Contents

How to Do Your Best on Standardized Tests: Some Suggestions for Adult Learners
Ronald K. Hambleton and Stephen Jirka provide an article written for teachers to use with their adult learners. The authors’ intention is to empower learners by providing them with jargon-free explanations of the basics of standardized testing and several proven test-taking strategies.

Using the REEP Assessment for ESOL and ABE Classroom Instruction
Joanne Pinsonneault and Carey Reid are veteran teachers who have found good ways to use the REEP writing assessment for instructional purposes.

Integrating Goal Setting into Instructional Practice
The staff at The Center for New Americans in Western Massachusetts take the general goals that newly enrolled learners set at intake and use them in rich and constructive ways in the classroom.

A Basic Primer for Understanding Standardized Tests and Using Test Scores
April L. Zenisky, Lisa A. Keller, and Stephen G. Sireci have written a companion piece to the learner-friendly article that opens this issue. The authors’ intentions are to help practitioners understand key concepts underlying standardized testing and to interpret test scores effectively.

Using Data for Program Improvement
Luanne Teller and her staff at the Stoughton (Massachusetts) Adult Education Program have been digging deeper and deeper into attendance and other data and using them to strengthen their program.

ACLS, SABES, and UMass: Perfect Together!
Stephen Sireci offers an upbeat overview of the projects he and his colleagues at UMass Amherst are working on with MassDOE, SABES, and adult basic education practitioners.


How to Do Your Best on Standardized Tests: Some Suggestions for Adult Learners

BY RONALD K. HAMBLETON AND STEPHEN JIRKA

Ronald K. Hambleton and Stephen Jirka provide an article written for teachers to use with their adult learners. The authors’ intention is to empower learners by providing them with jargon-free explanations of the basics of standardized testing and several proven test-taking strategies.

PART ONE: WHAT ARE “STANDARDIZED TESTS”?

Educational tests, sometimes called “standardized tests,” seem to be everywhere. In Massachusetts, the Department of Education administers the Massachusetts Comprehensive Assessment System tests (better known as MCAS) in English Language Arts, Mathematics, and Science to learners in the public school system. The Educational Testing Service administers the Scholastic Assessment Test (the SAT) to learners who are considering going to college. The American Council on Education administers the Tests of General Educational Development (GED). You can’t get a high school diploma, go to college, join the military, get a professional license or certificate, or get a job without passing a test. You can’t even get a driver’s license without passing a test. With so many standardized tests around, adult learners would be wise to learn how to do their best on them, and to help their children do well on them, too.

Standardized tests are particular kinds of tests, different from the final examination a high school teacher might design for her math course, or the writing exercise an ESOL teacher might design to see how well his learners are doing. When talking about tests, “standardized” simply means that everyone who takes the test is given the same amount of time and sees the same or very similar test questions. “Standardized” also means scoring is done very carefully so that test scores do not depend upon who happens to be doing the scoring. Why are standardized tests so widely used? Because, by and large, they have been shown to be (1) an efficient way to collect information about what people know and can do; (2) objective, in the sense that test scores do not depend to any great extent on who happens to score the answers; (3) valid, in that they often provide relevant and useful data for making decisions about mastery of a body of knowledge and skills and potential for success; and (4) convenient and cost-effective, because they can be administered to many people at the same time.

Governments, the armed services, industry, universities and colleges, credentialing agencies, and many other groups use standardized tests because they are convinced by the evidence that such tests offer the best basis for making decisions about who has the necessary knowledge and skills for some particular purpose (like going on to college or being hired for a job). Human beings make tests and human beings administer them, and all human beings have biases. Bias can sometimes creep into standardized tests, but it can usually be spotted and the problem fixed; or, if the problem of bias cannot be solved, the test can be eliminated.


Some people believe that standardized tests are used too often and that there are better ways to measure ability and readiness. For purposes of discussion, let’s consider the task of determining whether adult learners have the same knowledge and skills as high school graduates. This is an important task in the United States, because high school diplomas are an entry to higher education, the military, and lots of jobs. Many adults who did not obtain a high school diploma during their teens later want to demonstrate that they have about the same level of knowledge and skills as high school graduates and thereby gain the same opportunities. Today we have the Tests of General Educational Development (the five tests that make up the GED), which are used around the country as a way for people to demonstrate they have knowledge and skills equivalent to those of high school graduates. One alternative to passing the GED would be for adult learners to return to high school and take regular school tests along with state graduation tests, but with a million persons desiring GED certificates each year, this would surely be impractical. The external diploma programs offered by many adult basic education programs are an excellent alternative, but they require a great deal of individual conferencing.

As a standardized test, then, the GED certainly has its place. It provides many thousands of adult learners in this country with a second chance. Teachers are familiar with the material covered by GED tests, so they can design test preparation instruction effectively. And the GED is widely accepted as a high school equivalent: community colleges, universities, the military, skilled trades, and employers who require a high school diploma welcome those who demonstrate proficiency through the successful passage of the GED tests. Clearly, the GED tests and others like them have an important role to play in this country.

We believe that some of the problems surrounding the standardized tests used in adult basic education programs, such as the GED and the TABE, are not with the tests themselves, but with learners’ test-taking anxiety and lack of test-taking skills. These two factors are interrelated; knowing more about standardized tests and how to take them can boost a learner’s self-confidence and reduce her test-taking anxiety. However, people are not born with test-taking skills, and sometimes learners from other countries have had very little exposure to American-style tests with multiple-choice items and separate answer sheets, or with the computer-administered tests that are becoming popular.

PART TWO: DOING YOUR BEST ON STANDARDIZED TESTS

At this point, we would like to offer six very practical suggestions to help adult learners perform to their best ability on standardized tests.

1. Get positive about taking tests!

Adult learners need to think positively about themselves, the learning they are doing, and the tests they will be taking to assess their learning. While standardized tests can be daunting, they also offer adult learners a way to move up, to provide a role model for their children, to get a better job, or to go to college. All too often, adults without a basic education see themselves as victims. A positive attitude can boost confidence and improve test performance.

Researchers have found that test performance is, in part, psychological. When learners receive positive messages about their ability to learn and to succeed academically, they are less likely to conform to stereotypes that they believe others have of them, and they perform significantly better on tests. So, adult learners and their teachers must be positive!

Adult learners need to see testing as an opportunity to demonstrate their ability, not evidence that they are victims of a system that cares little about them. Doctors often tell their patients to be positive, because research has shown that patients who remain positive live longer and avoid illnesses better than those who do not. The same is true for adult learners when taking tests: be positive and you’ll perform at a higher level.

2. Clear the brain for learning and testing!

Many adult learners lead stressful lives. Stress comes from family, from the job, from personal health concerns, from the times we live in, and so on. But if adult learners want to improve their lives and those of their family members, they need to find time to concentrate on learning. Adult learners need to have some quiet time each week to study, and regularity and consistency make learning easier. They must see this “learning time” as something they deserve. The study place should be quiet to allow for concentration (perhaps the local library on a Saturday morning, or a quiet place at home in the early morning or late at night if necessary) and should be dedicated to studying, with books, paper, and pen readily available. Learners need to stay organized because this time is precious, and they owe it to themselves to make the most of it.

Adult learners also need some quiet time right before taking a test: an hour or two to clear their heads of life’s stresses, away from family, away from the job; time to think about the challenges associated with the upcoming test. An adult learner who arrives late for a test, huffing and puffing, upset about a family- or job-related problem is not emotionally ready for the challenges of a test. If failure follows, the test is often blamed, but the real problem might be that the adult learner was not psychologically ready to perform to her capabilities. If prior test-taking experience resulted in failure, the adult learner should strive to put that behind her and focus on the present test and her efforts to perform well on it.

3. Prepare for the test “strategically”!

We were talking with a colleague the other day who told us about an adult learner who persisted in studying for one section of a GED test that he thought he was weak in. He had failed the test several times previously, yet this one section was only 10% of the test. This learner would have been much wiser to consider the content coverage of the test (which was information readily available to him) and to plan his study time accordingly.

There are two key strategies for preparing to take a standardized test. The first strategy is to become familiar with the format of the test: What sorts of questions are asked, how is information conveyed, and how are answers logged in? This knowledge will reduce the level of surprise and confusion that robs the test taker of time she could be using to answer questions. The second strategy is to research the content coverage of the test and then to apply the study time the learner has available to the content that will count the most.

With most standardized tests such as the TABE and GED, the format and content information is readily available. Let’s take the GED as an example. It is based on a high school curriculum and performance standards that are used throughout the country. The five tests are in a multiple-choice format (except for one essay), and have been developed by experts familiar with secondary and adult education. The Language Arts Test emphasizes organizing text and the mechanics of writing. The Mathematics Test includes computational problems and real-world problems and applications. The test will give you any formulas you will need to use. Calculators are used with one of the sections. Some math answers are multiple choice, but many are marked on little “bubble charts.” The Social Studies Test draws content from United States and World History, Government, Economics, and Geography. That test contains at least one excerpt from a major historical document, such as the Declaration of Independence. The Reading Test will have the adult learner read and interpret many different forms and varieties of literature, such as fiction, nonfiction, prose, poetry, and drama from different cultures and time periods, as well as use business-related documents. The Science Test has the test taker interpret and use scientific information in the form of text or graphics, and material from the life sciences or physical sciences. Adult learners might be asked to interpret experimental results or explain how results from a classic study apply to the everyday world. Even more detail on specific GED tests is readily available from the GED Testing Service and in bookstores.

Adult learners who want to study strategically can use information like that provided above to orient themselves to tests and focus their study time for maximum results. They can easily find out what content is covered by a particular test and how much importance will be given to various topics; for example, geometry makes up a small part of the GED math test. With this kind of information, learners can focus their study time on the most important topics, and when those topics have been mastered, they can move to the less important ones. In addition, knowing what the most important content areas are can help learners find the right study aids.
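To make this “strategic” allocation concrete, here is a minimal sketch in Python of splitting study hours in proportion to content weight. The topics and weights below are hypothetical illustrations, not official GED content specifications:

```python
# A hypothetical study planner: split available hours in proportion
# to how much each topic counts on the test. The weights below are
# made-up examples, NOT official GED content percentages.

def plan_study_time(total_hours, content_weights):
    """Allocate study hours proportionally to each topic's test weight."""
    total_weight = sum(content_weights.values())
    return {topic: round(total_hours * w / total_weight, 1)
            for topic, w in content_weights.items()}

# Example: 20 hours to prepare for a math test where (hypothetically)
# number operations count far more than geometry.
weights = {"number operations": 45, "algebra": 35, "geometry": 20}
print(plan_study_time(20, weights))
# {'number operations': 9.0, 'algebra': 7.0, 'geometry': 4.0}
```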

4. Become familiar with test-taking techniques!

Going into a test with a good knowledge of basic test-taking techniques will help a learner to do his best. Much has been written on good techniques; here is a sampling of the most often repeated advice:

• Listen carefully to directions. One of the most critical rules for adult learners is to listen carefully to the test directions: How much time is available? How will the test be scored? What advice, if any, is given about when to randomly guess on multiple-choice test questions? Does the test administrator have any special instructions? Knowing the available time allows adult learners to apportion their time so that they don’t need to rush to finish at the end. Knowing about scoring also helps with time use: if 50% of the score will be assigned to essays, then test takers should devote 50% of their test time to writing the essays. And as for whether to guess on multiple-choice test questions, the answer depends on how the test items are scored. If there is no penalty for wrong answers, learners would be smart to answer all questions, so when time is about to run out, they should randomly guess at any remaining answers prior to handing in their answer sheets. On the other hand, if there is a small penalty for wrong answers, learners should be encouraged to answer if they can eliminate at least one of the answer choices; otherwise, guessing has no particular advantage. (A short worked example of this arithmetic appears after this list.) Concerning special instructions, adult learners must remember to listen carefully: the instructions might include information about the most important questions on the test, whether or not calculators can be used, the desirable length of essay questions, and so on.

• Scan the test before starting to answer questions. Adult learners must remember to scan the test first to get an idea of length and difficulty. If the test is made up of multiple-choice questions, they should work on the questions in order and not spend too much time on any one question. Skipping around the test and doing a question here and there is not a good strategy, because valuable time is wasted and it might lead to errors in marking the answer sheet. If essay questions are part of the test, however, it makes sense to scan these questions and do the easier ones first.

• Understand a question before answering it. With multiple-choice questions, adult learners must read the questions carefully prior to answering. One of the most common mistakes is not answering the question that is actually being asked. Negative words in the “question stem” can be especially confusing. Sometimes words are highlighted in the question stem, and these too are important clues. When in doubt, adult learners should eliminate choices that they know to be wrong, and then choose an answer, at random if necessary, from the remaining choices. Their partial knowledge will be rewarded with such a test-taking strategy.

• Review the choices. Here are a few additional tips for multiple-choice questions: (1) Read the question stem, try to think of an answer, and then look for it among the available answer choices. If that doesn’t work, at least eliminate the choices that appear to be wrong prior to guessing an answer. (2) If the answer choices are numbers or dates, middle choices are often correct. Note also that longer and/or more general answers among the answer choices are more likely to be correct. (3) Sometimes test takers are given a choice among essay questions. Adult learners should be encouraged to watch for this option. Sadly, many test takers fail to heed directions such as, “Answer one of the three questions below,” and try to answer all three instead, thus scoring lower than they could have.

• Be flexible in approaching essay questions. With short-answer and essay questions, adult learners should be encouraged to try to write at least something, even if it’s just a few sentences. Often partial marks are assigned, so even a partial answer will generate some points. Before starting to write their essays, adult learners should try to prepare an outline. Paraphrasing the question itself is often a great way to start an essay. Clear writing, along with good grammar and spelling, is typically important in the way essays are scored. Adult learners should therefore remember to review their written answers for good sentence structure, grammar, and spelling.

• Review your work. It’s important to remember to review your answers and essays. We all tend to breathe a sigh of relief when the last question has been completed, but adult learners who leave a test with time still available are missing an opportunity to improve their scores. The test is not over until the time is up, or at least until every answer has been checked and the essays have been reviewed for grammar and spelling.

• Stay as calm as you can. Above all, adult learners should stay calm and simply do the best job they can with the time available. Staying calm will make you more efficient while you are answering.
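As promised above, here is the guessing arithmetic worked out in a short Python sketch. The penalty shown is an assumed formula-scoring convention (1/(k-1) points per wrong answer on a k-choice item), used only for illustration; any real test publishes its own scoring rules:

```python
# Expected score from guessing on one multiple-choice question.
# Assumption for illustration: a "formula scoring" penalty of
# 1/(k-1) points per wrong answer on a k-choice item, which makes
# blind guessing worth 0 on average. Real tests state their own rules.

def expected_guess_score(choices_left, penalty):
    """Expected points from guessing uniformly among the remaining choices."""
    p_right = 1 / choices_left
    return p_right * 1 + (1 - p_right) * (-penalty)

k = 5                  # a five-choice question
penalty = 1 / (k - 1)  # hypothetical wrong-answer penalty: 0.25 points

print(expected_guess_score(5, penalty))  # 0.0    -> blind guessing gains nothing
print(expected_guess_score(4, penalty))  # 0.0625 -> eliminating one choice gives an edge
print(expected_guess_score(5, 0))        # 0.2    -> with no penalty, always guess
```

The numbers mirror the advice in the first bullet: with no penalty, guessing always helps; with a penalty, it only helps once at least one wrong choice has been eliminated.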

5. Take a practice test, or, even better, take several practice tests!

No one learns to fly a plane, drive a car, swim, or play golf just by reading how-to books. Practice makes perfect, as the saying goes, and testing is no exception. There are lots of practice tests available for the GED; in fact, bookstores are full of books containing practice tests for most national standardized tests. However, adult learners need to take these tests under test-like conditions, and that means with the time limit that will be in place when the test counts. They need to be exposed to some of the natural anxiety that arises when seeing firsthand the test and test question formats. They need to practice their pacing, practice reading the questions and answering them carefully, practice making judgments about when and how to guess, and so on. Of course, these practice tests can be scored, so both weak and strong knowledge and skill areas can be identified. In a sense, every test, whether it is intended for practice or not, provides experience that can help one perform better on future tests. Adult learners can mull over their performance and how they might do better the next time: by being better rested, being more prepared on the content area, making improved use of available time, and so on.

6. Read, read, read!

Studies have shown that vocabulary is one of the most important factors in doing well on standardized tests. Every time a test taker encounters a word he doesn’t know, he is less likely to understand a reading passage or a question. It sounds overly simple, but the fact is that vocabulary development is critical to success in all subject areas. The best way to build vocabulary is by reading, reading, and then more reading. Reading shows words in context (that is, how they are really used in sentences to make meaning), and that’s the best way to learn them. Adult learners should read in their spare time, read on the bus to work, and read before going to bed...and should try to read for understanding.

Summary

In this article we have tried to give a good overview of standardized testing and provide practical suggestions for helping adult learners demonstrate their knowledge and skills on these tests. Our hope is that when learners are equipped with basic knowledge about these tests and proven test-taking approaches, they will be able to demonstrate what they are truly capable of.

Ronald K. Hambleton holds the title of Distinguished University Professor and is Chairperson of the Research and Evaluation Methods Program and Co-Director of the Center for Educational Assessment at the University of Massachusetts Amherst. Professor Hambleton has been teaching graduate-level courses at UMass since 1969. Stephen Jirka is a doctoral candidate in the Research and Evaluation Methods Program at UMass Amherst. His current research includes external validation of test scores and standard-setting methods.


Using the REEP Assessment for ESOL and ABE Classroom Instruction

BY JOANNE PINSONNEAULT AND CAREY REID

Joanne Pinsonneault and Carey Reid are veteran teachers who have found good ways to use the REEP writing assessment for instructional purposes.

Using the REEP for ESOL Instruction: Joanne Pinsonneault

The REEP is the mandatory assessment in Massachusetts for measuring adult ESOL learner gains in the area of writing. After administering the REEP twice to my class of mid-level ESOL learners, I discovered the test, with its warm-up activities, familiar essay prompts, and simple scoring rubric, to be very learner friendly. As long as I did not use official testing prompts in my classroom, there was no reason why I could not share useful aspects of it with my learners. After beginning where I usually do (by talking to my colleagues), I eventually wrote a lesson plan. My initial objectives were to familiarize the learners with the REEP Writing Assessment’s pre-writing activities, essay prompts, and scoring rubric as one way to help me determine the best course for teaching writing. I decided to walk them through the testing procedure, using my own pre-writing questions and writing prompts and discussing the content and purpose of each step as they actually completed it. I began with the rubric because I wanted them to understand how their writing was being judged.

Step 1: Helping learners to understand the REEP testing process

I gave each learner a copy of the REEP rubric. I then explained the essay-scoring process: Two teachers who do not know the learner read and score the essay by using the rubric. If the scores are within a one-point range, they are averaged. If the range is greater than one point, a third reader provides a score that must be within one point of either of the first readers’ scores. Those two scores are then averaged. I demonstrated how the averaging system works.
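For readers who like to see procedures spelled out, here is a minimal Python sketch of the averaging rules just described. The function name and error handling are our own illustration, not official REEP scoring code:

```python
# A sketch of the two-reader averaging rule described above.
# Function and variable names are illustrative, not official REEP code.

def reep_score(reader1, reader2, reader3=None):
    """Average two reader scores; use a third reader to resolve big gaps."""
    if abs(reader1 - reader2) <= 1:
        return (reader1 + reader2) / 2
    # Scores differ by more than one point: a third reader scores the
    # essay, and must land within one point of one of the first readers.
    if reader3 is None:
        raise ValueError("third reader needed: first two scores differ by > 1")
    if abs(reader3 - reader1) <= 1:
        return (reader3 + reader1) / 2
    if abs(reader3 - reader2) <= 1:
        return (reader3 + reader2) / 2
    raise ValueError("third reader must be within 1 point of a first reader")

print(reep_score(3, 4))     # 3.5 (within one point: simple average)
print(reep_score(2, 5, 4))  # 4.5 (third reader pairs with the 5)
```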

Next, I wrote the five scoring categories and my explanations of them on the board:

1. Content and Vocabulary (addresses task, answers questions, varies vocabulary)

2. Organization and Structure (paragraph formation, details, essay structure)

3. Structure (sentence structure, grammar)

4. Mechanics (punctuation, capitalization, spelling)

5. Voice (addresses audience, engages/persuades reader, embellishes if necessary to make interesting)

We discussed each category in general, and I described the expectations for each category at every level on the point scale.

Step 2: Administering a practice test

I asked the learners to recall the testing procedure: introduction, brainstorming, conversation, prompt reading, and writing. I explained standardization and the rules for administering this test. Now the learners were ready to take a practice test.


Step 3: Brainstorming

We discussed the purpose of brainstorming (to get ideas flowing) and, keeping within the five-minute time limit that the REEP allows, practiced brainstorm listing with a prompt that I created: What kinds of challenges do new immigrants encounter when they first come to America? Afterward, I asked the learners what they had learned from the process. At first, many had a hard time differentiating between the items on the board and the process that led to the discovery of those items. Then they began thinking about personalizing the list, discarding irrelevant information, and grabbing onto others’ ideas. I modeled how they could create their own lists in their minds by adopting or discarding other learners’ ideas.

Step 4: Conversation practice

I began to wonder if the conversational “warm-up” activity used as part of the official assessment could also be used for instructional purposes. Back in the classroom, I explained to my learners that the conversation piece is another pre-writing activity that can help them to generate ideas for writing. As an intervention, I pointed out that they should ask themselves, “What did I learn from my partner?” not “What did I learn about my partner?”

Keeping within the REEP’s ten-minute time limit for this activity, they practiced with the following questions keyed to the new prompt:

1. What is your name?

2. Where are you from?

3. When did you immigrate to America?

4. What was one challenge that you confronted when you first arrived in America?

5. How did you meet this challenge?

Considering the time limit, I suggested that they focus on the last two questions, because these will provide them with ideas for writing. After ten minutes, I asked: “What did you learn from your partner? Did she or he give you any good ideas for writing?” I reminded the learners that this exercise does not require that they tell “the truth” about their experience; they can create circumstances and embellish to make their essays more complex and interesting. Therefore, if they listen carefully as the teacher gathers comments from other learners, they might just get a few ideas to use in their essays.

Step 5: Understanding the prompt

To successfully address any writing task, the learners must decide who their audience is, what the topic is, and what questions have been asked. I divided them into small groups and asked them to consider three variations on the same theme (see Figure 1).

In their groups they discussed the differences among the prompts and answered the following questions: “Whom am I writing this for?”; “Whom/what am I writing about?”; and “How many questions do I have to answer?”

Afterwards, they returned to the whole group for a discussion of the differences among the prompts. I asked the learners how they could use the pre-writing activities to their advantage in responding to each prompt. All of a sudden it made sense to them! The comments were overwhelming. “My transition was so easy, but I could use what Mayra said about her life to help me answer the third prompt.” By referring back to the rubric, we could point to how these pre-writing activities might lead to higher levels of writing quality.

Step 6: Administering the prompt

Finally, the learners took the practice REEP, under much the same circumstances as an official program assessment. All in all, these first six steps took about three hours, or one class session. But the work had just begun!

Step 7: Scoring

I scored the essays myself and, using a simple form, highlighted the strengths of each and offered concrete suggestions for improvement based on skills that I knew the learner had been working on and had demonstrated some ability with. (See Figures 2 & 3.) For example, if the learner had previously demonstrated the ability to write complex sentences but didn’t do so throughout the essay, I suggested places where simple sentences could have easily and effectively been combined. During learner-teacher writing conferences, we discussed the scores and comments. I also gave the learners the opportunity to rewrite their essays, a standard option in our class.

Conclusions

I felt that I had achieved my primary objectives for the lesson: The learners gained increased awareness of the standards by which their writing is judged, and they had some initial ideas about how they can improve their writing. But how could I use this experience to inform my writing instruction? I’m still working on this, but I did spend some time looking at the scores of the class overall, and found a few surprises. Some of my work had been effective. For example, the majority of learners demonstrated good control over the verb tenses we had been studying! However, many learners were having difficulty with paragraph formation and needed a review of the punctuation rules. (Try again, Joanne!) I felt like I had a new place to begin, both with each learner individually and with the class in general.

I have also noted a few unexpected results, including reduced test anxiety and improved REEP scores. But best of all, my learners feel like they have more control over meeting their mandatory “improve writing skills” goal. Teaching from the REEP has been very empowering.

Using the REEP for ABE Instruction: Carey Reid

I have been co-teaching a pre-GED writing class at the Jamaica Plain (Boston) Adult Learning Program one evening a week with the class’s primary teacher, David Stearns. David and I have been experimenting with an authentic materials curriculum, using Boston Globe articles, Net-based research, and even research reports from Focus on Basics as starting points for activities that require our learners to summarize, analyze, and respond in writing to these sorts of challenging materials. We’ve been delighted at how willing our learners are to plunge into this stuff and make sense of it, even when a good understanding will most often require at least three readings.

When I learned that Joanne Pinsonneault was using the REEP in her mid-level ESOL class, it occurred to me that the learners in David’s class might be able to handle it. David was game to try it, and so were the learners. We decided to use the REEP materials as instruments for periodic formative assessment. Here’s how we did it.

Stage One: Unpacking the rubric

We gave each learner a copy of the scoring rubric and spent about 90 minutes going through it in a whole group discussion. David and I explained that the rubric taken as a whole represented most of the elements that make up writing. We explained further that the rubric as an assessment tool provided good indicators for levels of quality and sophistication. At that point, we started to just cruise through it, pointing out indicators at random and giving oral examples of how these indicators might align with a particular piece of writing (e.g., “If the writer provides a lot of supportive detail and can use complex sentences, then she’s writing at such-and-such a level.”). After this exploratory phase, we asked the learners if they were willing to look at some sample essays and try their hand at scoring them. I am not exaggerating when I report that the learners were very intrigued by the rubric and equally eager to try applying it to actual writing.

Stage Two: Scoring the anchor essays in small groups

David and I really kept our fingers crossed with the next stage. Could the learners apply the rubric to actual writing with some accuracy, or would they become frustrated by such a task? We reminded ourselves that one of the reasons we decided to use challenging materials and exercises in our class was our determination to put into actual practice ideas and theories we professed to believe in, e.g., that adults learn best by doing, that they know what they want, and that they have acquired tons of knowledge through their life experiences. We did not want to appear to be the harbingers of special knowledge (“We know all the grammar rules, but you don’t”), or superior in position (“The REEP is for teachers, but not for learners”), or unconsciously condescending by coddling or “protecting” our learners (“Concepts and materials from standardized testing are much too complicated for you to understand”). In fact, in our class-planning sessions, David and I would call each other on perceived instances of these kinds of presumptions.

For this stage, we decided to use the essays currently used in the training of Massachusetts teachers for REEP scoring. As in that training, we decided to use the six “anchor essays,” each of which represents the same score across all levels, for example, a “3” all the way across. Specifically, the six anchor essays represent scores from 1 through 6. We told the learners that they would be looking for that consistent score level for each essay, explaining that this would simplify the exercise a bit, adding that this was the same process used to train Massachusetts teachers to understand the REEP.

David and I then organized the class into three workgroups of five learners each. We gave each learner a packet of anchor essays and a blank scoring sheet and asked that they read a particular essay together, score it together using the rubric, and try to reach agreement on a score. For some time, we’d been using collaborative learning approaches, so the learners stepped right into this task. One by one, the workgroups read and scored a particular essay, and then when all three groups were finished David and I asked the facilitators to report the scores. Then we discussed the scores and the reasons behind them.

Frankly, David and I were surprised by the results of this exercise. In nearly every case, the workgroups gave the same score to a given essay as the REEP “experts” assigned to it. In those cases where their scores and the experts’ scores were different, the variance was only a single point. In addition, the three workgroups nearly always agreed on the same score, and when they varied it was only by a single point. In full class discussions, the reasons for giving scores were discussed, and where scores differed the workgroups defended their choices. Occasionally, an individual learner would disagree with the other members of her group; in the full class discussions, she would get a chance to defend her choice. (A great anecdote to share: One time a very shy learner revealed that she did not agree with the score the other members of her group had given. At that point, the other members asked her to speak up and defend her score. As it happened, her score matched that of the experts. Needless to say, the shy member’s standing in her group went up a notch as a result of that episode.)

Stage Three: Using the rubric to score learner essays

The class explorations of the REEP really intrigued the learners. When David and I said we hoped to apply the rubric to their own writing, they were completely supportive of the idea. At this point, David and I began to realize that because the REEP is not used as an official learning gains assessment in Massachusetts for non-ESOL learners, such as those in our class, we were free to adjust the materials and approaches as we felt we needed to. For example, we could alter the rubric if we wished, to make it more appropriate for our class. We were also free to design prompts that we felt suited our class better than those that are more appropriate for ESOL learners, which are based, for example, on personal letters or narratives. It struck us that it might be better for our learners to use prompts that resembled those used for the GED Writing Test.

The first prompt we developed basically centered on the question, What is the most important profession in modern society and why? To get the ball rolling, we started with an open discussion and some brainstorming, as Joanne Pinsonneault had done. Then we asked learners to write original essays for 30-45 minutes in response to a prompt based on the brainstorm question. (See Figure 5.)

David and I then read over these essays and, again as Joanne had done, scored them and noted those scores on a single-learner score sheet. We decided to list for each learner only a few Notable Strengths and one Area for Improvement. Our idea was to provide the learners with a Next Step for improving the draft, based on ideas about process writing we’d been applying all along. At the next class, David or I sat down with each learner and went over the scoring and the notes we’d made together. (We’ve provided an example of an essay and score sheet as Figures 4 & 5.)

Stage Four: Using the essays as “authentic materials”

At this point, everything we were doing with the REEP and with our overall authentic materials approach came together in a nice neat package: David and I decided to use the learners’ essays as source materials, as the stuff that future lessons could be based on. For example, we gleaned fifteen sentences with errors from the various essays, listed them on a single sheet, and then asked the learners to discuss and edit them together in workgroups. The learners were much more motivated to edit samples of their own work than materials from workbooks.

Next steps: Using the rubric more widely

David and I are just beginning to experiment with using the rubric as a self-assessment and peer-assessment tool. For example, the class is now knee-deep in a several-week project in which we’re applying the NCSALL research on learner persistence to the learners’ own lives. We began by reading and discussing together the research findings (Focus on Basics, Volume 4, Issue A). The learners are now writing essays composed of an initial summary of the research, a commentary on how the findings apply or do not apply to their own lives, and a final section on ways in which our own writing class might be altered to better support their persistence(!). Now that most of the learners have a first draft, we will be asking them to use the rubric to self-assess the piece and, perhaps in conference with one of us, plan out the second draft.

David and I are also just beginning to try to integrate the rubric with a peer-editing process that the class is developing together. We’ve come up with rules and guidelines around constructive criticism; now we’re wondering if the rubric would be a good tool for promoting objective and constructive critiquing. We shall see!

A final note...

David, Joanne, and I have presented these ideas in conferences and workshops over the past year. A common concern among participants is that the REEP scoring rubric might be set too high for new readers and beginning ESOL learners. Gradually, however, we’ve all come to realize that teachers are free to adapt these materials for their own learners. They can simplify the indicators, make them more positive sounding, reduce the number of levels, and so forth.

We would like to offer one caveat, however. Using the REEP rubric has, in our opinion, substantiated the claim among many theorists and experienced practitioners that adult learners know a lot more about writing than might be readily discernible. The learners often attach, constructivist-fashion, their working knowledge of writing elements to rubric terminology: in a sense, they are enabled to give voice to what they already know. So, before simplifying or adapting the rubric, you might first want to find out if you really need to.

Joanne Pinsonneault is an ESOL teacher with the UMass Dartmouth Workers Education Program in New Bedford, Massachusetts. Carey Reid is Staff Development Specialist for Licensure and Assessment with the System for Adult Basic Education Support in Massachusetts.


FIGURE 1
Joanne’s Three Practice Prompts

[In an actual writing situation, each would be printed on a separate piece of paper with lines provided for the learner’s writing.]

REEP Practice: Immigrating to America

Question Number 1

Your friend writes, “I am moving to America and I need your help. What kinds of challenges and problems can I expect to encounter when I first arrive? Where can I go to get help? Do you have any advice or suggestions for me? Thank you for your help!”

Instructions: Write a letter to your friend. Be sure to answer his/her questions. You have 30 minutes. You may NOT use a dictionary or talk to other learners during the test. Do not write on the back of the paper. You may use extra paper. Check your letter before you hand it in.

Question Number 2

Think about how difficult it is to immigrate to a new country. What challenges confront an immigrant family after moving to America? What resources are available to immigrants once they arrive here? Who can immigrants turn to when they need assistance?

Instructions: Write an essay about the challenges confronting new immigrants to America. Be sure to answer the questions. You have 30 minutes. You may NOT use a dictionary or talk to other learners during the test. Do not write on the back of the paper. You may use extra paper. Check your essay before you hand it in.

Question Number 3

When did you immigrate to America? Where did you come from? What challenges did your family confront after moving to America? What resources did you use to help you confront these challenges? How long did it take for your family to adjust to life in America?

Instructions: Write an essay about the challenges you confronted when you immigrated to America. Be sure to answer the questions. You have 30 minutes. You may NOT use a dictionary or talk to other learners during the test. Do not write on the back of the paper. You may use extra paper. Check your essay before you hand it in.


FIGURE 2
An Example of a Learner’s Response to a Practice Prompt



FIGURE 3
Joanne’s Scoring and Commentary Sheet for the Learner’s Response


FIGURE 4
An Example of a Learner’s Response to One of David and Carey’s Prompts


FIGURE 5
David and Carey’s Score and Commentary Sheet for that Learner’s Response


The staff at The Center

for New Americans in

Western Massachusetts

take the general goals

that newly enrolled

learners set at intake

and use them in rich

and constructive ways

in the classroom.

Integrating Goal Setting into Instructional Practice

BY THE STAFF AT THE CENTER FOR NEW AMERICANS

Our program offers ESOL classes from the beginning to advanced levels to adult immigrants and refugees in Franklin and Hampshire counties in Western Massachusetts. In this article we'd like to share our approach to learner goal setting in the hope that other programs will find it useful.

Goal setting at intake

During the intake interview, Massachusetts adult education programs ask learners about their reasons for wanting to study. Among other things, this activity supports a Department of Education reporting requirement. At this stage, students usually have only a broad idea of what they want to accomplish. Examples of goals frequently mentioned at intake interviews are: "to speak English," "get a job," and "learn about the U.S." However, if students are to feel successful and motivated in class, and to persist in adult basic education classes in the face of the many obstacles in their lives, we believe it is necessary to make these general goals clearer and more specific.

At The Center for New Americans, we have developed a successful classroom-based approach that helps students explore the goals they reported at the intake interview so that they become more specific, measurable, achievable, and realistic. Teachers then link these goals directly to classroom activities. Connecting learners' goals and class instruction helps teachers meet students' needs and allows students to experience success in meeting their goals, which contributes to their increased motivation and persistence.

Goal-setting activities in the classroom

The general goal that a student reports at intake is recorded on the Student Goals Form, which is passed on to the teacher. During the first two weeks of a new class session or tutorial, the teacher presents activities that help the student break the general goal into smaller steps, or mini-goals. In accomplishing the mini-goals, students experience success and personal satisfaction. Mini-goals also help students to realize the amount of time needed to achieve their larger goal. These activities are the heart of a successful class and must be viewed as part of instructional time, not separate from the curriculum. In addition, they are crucial to the development of our learners as co-negotiators of the curriculum.

By the end of the second week, students are able to outline several smaller, more specific steps toward their larger goal. In class, each student thinks about what he or she wants to study the following week and reports to the group. These student requests become the basis of the curriculum for the week ahead.


The teacher plans lessons, activities, and materials that respond to these requests. The teacher might include other elements that are needed based on learning assessments. At the end of each week, students reflect on their learning in their logs, and again make requests for the following week—in essence, setting new mini-goals. The reflection time allows students to self-assess what they have learned and how well they have learned it. Both reflection and planning take time and are considered part of instruction as well. The teacher always responds to the new requests the following week, and assesses past lessons by observation, evaluation of performance tasks, quizzes, and other formal or informal modalities. This process progresses in a spiral, with the goals directly informing instruction, followed by assessment by both students and teachers and the setting of new mini-goals as the class continues. (See Figure 1.)

How does it really work?

Let's think about a beginning ESOL class of ten learners. The primary goals set by these students at intake were to communicate more effectively in English, get a job, learn about U.S. culture, and become a U.S. citizen. A goal-setting activity might begin by using pictures to teach the names of several places in town, including town offices and schools and other places used by the learners. Students could also draw pictures, and a list could be put up on the wall in the classroom. Once these places were identified and could be recognized, the teacher could ask the students if they needed or wanted to use English in these different places. The teacher could ask each student to prioritize which three or four he or she wants to focus on during the class cycle. A calendar is useful here to emphasize the finite amount of time available for a given topic. Students would then write the selected places that interested them in their individual logs.

Among other introductory activities, the curriculum for the initial week of class might include teaching students how to name places where they need or want to communicate in English. The following week might include one specific place—for example, the doctor's office—where a dialog about calling for an appointment would be studied, practiced, and role-played; a TPR (Total Physical Response) activity might be performed to help students learn what doctors and nurses might say; students might study a vocabulary lesson about the people and things at the doctor's office—ailments and symptoms, for example. The curriculum depends on what needs the students have expressed.

At week's end, each student would reflect on their learning and discuss what they want to study the following week. The teacher might need to narrow the focus of the requests by asking questions. These ten student requests would become the basis for the teacher's lessons in the upcoming week.

In our next example, students of an advanced ESOL class have set similar goals at intake: to communicate more effectively in English, get a better job, and learn about U.S. culture. An activity that helps these advanced students develop mini-goals involves their breaking into small groups and making lists of what they can do in English. A follow-up activity is to have them develop a list of what they want to be able to do in English. These lists can be put up on the classroom wall to help the students remember what has been discussed.

One item on the second list might be "to speak fluently"—a very broad and long-term goal. Students can work together to explain their reasons for selecting this goal, and through this process, their individual needs will become clearer. For example, one student's goal of speaking fluently might actually mean being understood by Americans. For another, this same goal might mean being able to express feelings in English. By probing, the teacher might discover that for the first student pronunciation of specific sounds is difficult and for the other a lack of specific vocabulary or cultural appropriateness is the area of concern. Calendars or timelines work well with these students to break the goals down into manageable steps. These mini-goals, different for each individual, should be recorded in the students' learning logs.

The teacher again collects the requests and plans the lessons for the next week based on the mini-goals developed. At the end of the following week, students reflect individually on the activities and on their learning and set new mini-goals, which become the basis for activities once again. In each class, the challenge is to marry all the requests or mini-goals in one week's time, to respond to all of them to a reasonable extent. Once the students learn to reflect and are able to recognize what they still don't know or don't know well enough, they become adept at making more specific requests. By the end of the semester, another round of formal assessment is conducted and reported. Intake goals that have been met are recorded at this time.

In conclusion

We believe this process of week-to-week curriculum design is feasible for all ABE classes, even those with more defined curricula such as GED. If the subject is essay writing, steps that need to be mastered can be identified and a timely plan set in motion. Students can choose to write about topics that interest them.

Elsa Auerbach writes that "The essence of a participatory approach is centering instruction around content that is engaging to students."1 As a staff, we have found that responding to specific requests from students for activities makes lesson planning easier and increases students' motivation and retention, and that students are more likely to be engaged and active learners if the material is relevant to their lives.

This article was written by Nicole B. Graves, ESOL teacher (beginners and high intermediate levels) and ESOL Program Coordinator, and Peg Cahill, ESOL teacher (high beginner/low intermediate level) and Support Services Aide, with input from other members of the staff. The Center for New Americans teaches English to immigrants and refugees in Amherst, Greenfield, and Northampton, MA.

1 Auerbach, Elsa. "Ways In: Finding Student Themes" in Making Meaning, Making Change: A Guide to Participatory Family Literacy and Adult ESL. University of Massachusetts, Boston: 1990.

FIGURE 1

CENTER FOR NEW AMERICANS: How Goal Setting Informs Classroom Practice

[Cycle diagram: initial assessment and initial goal setting at intake; reporting; goal setting for the session during class; initial curriculum development; lesson planning, teaching/activities, and students' reflection; assessment (formal, informal, on-going); setting specific mini-goals for the next week; on-going curriculum development.]

April L. Zenisky, Lisa A. Keller, and Stephen G. Sireci have written a companion piece to the learner-friendly article that opens this issue. The authors' intentions are to help practitioners understand key concepts underlying standardized testing and to interpret test scores effectively.

A Basic Primer for Understanding Standardized Tests & Using Test Scores

BY APRIL L. ZENISKY, LISA A. KELLER, AND STEPHEN G. SIRECI

Introduction

It's nearly impossible to live in American society today without having to take some kind of standardized test. You have to pass a test to get a driver's license, get American citizenship, receive a General Educational Development (GED) certificate, get into college, and be considered for certain kinds of jobs. Here in Massachusetts, our children's teachers have to pass state tests to be licensed, and the children themselves have to pass the Grade 10 version of the Massachusetts Comprehensive Assessment System test, or MCAS, to graduate from high school.

Why do we have to take all these tests? Basically, because there is widespread agreement (but not complete agreement) that tests can tell if a person has the knowledge or skills needed for a diploma, a certificate, a school class level, or a job. But it's not just any test we're talking about here – it's standardized tests. Standardized tests are used because people feel that if you're going to judge someone's abilities, you'd better use a means that's reliable and fair, and standardized tests are designed to be reliable and fair – though people might disagree about whether they succeed in those goals.

We will not debate that issue here; the purpose of this article is to equip readers with a basic understanding of what goes into standardized test making and what test scores purport to show about learners' skills and abilities. We welcome any constructive use of this knowledge, whether it be better instruction or better policies, but all constructive uses start with accurate knowledge.

Meeting the reliability and validity criteria

Federal policies now require that states prove that ABE funds result in learner gains in reading, writing, language acquisition, and math. In addition, they require that states measure these gains with valid and reliable tests. After months of reviewing many standardized assessments and their respective alignment with the Curriculum Frameworks, Massachusetts policymakers and education professionals have agreed to use the TABE for ABE Reading, Writing, and Math; the BEST for ESOL Speaking and Listening; and the REEP for ESOL Writing. Scores in each of these tests are meant to represent what students know or can do in those areas. What does it mean when we say these tests are reliable and valid? Let's take up each of these concepts in turn.

Reliability

The consistency of scores across different administrations or scorers is known as reliability. It is crucial that test scores be adequately reliable in representing a person's knowledge and skills. Some level of error is always a factor in testing (more on this later) and test scores. If a person takes the same test on different days, we expect the results to be slightly different, but the more error there is in the test's make-up, the more different the two test scores are likely to be. If the two test scores are very different, it is reasonable to conclude that the difference is due to test error and that the scores do not really reflect what the test taker knows and is able to do.

Inconsistencies in scoring tests might also undercut reliability. Some tests are composed of multiple-choice questions, while others require that the test taker construct a response, such as an essay. Scoring a multiple-choice question is straightforward, because there is one right answer; the answer provided is either correct or incorrect. Therefore, regardless of who scores the test, the score on that question will be the same. Essay-type questions, however, require human judgment and are therefore more difficult to score. If two people read the same essay, it's likely that each person will give the essay a slightly different score. However, if the two scores given by the two scorers, or "raters," are very different, then the score on that essay is not very consistent, or reliable.

The measure of consistency between scorers is called inter-rater reliability. The closer the scores assigned to an essay by different raters, the higher the inter-rater reliability of that test. While it might seem impossible to get different raters to assign exactly the same score, it is possible to train raters so that they all score in a very similar way. If this goal is accomplished, there can be more confidence that the score assigned to the essay reflects the ability of the student.
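To make the idea concrete, here is a minimal sketch of one way a program might quantify inter-rater consistency. The article does not prescribe a particular statistic; exact and adjacent (within-one-point) agreement are common choices for essay scores on a small scale such as the REEP's 0-6 range, and the scores below are invented for illustration.

# Hypothetical scores from two raters on the same ten essays (0-6 scale)
rater_a = [4, 3, 5, 2, 4, 6, 3, 4, 5, 2]
rater_b = [4, 4, 5, 2, 3, 6, 3, 5, 5, 2]

pairs = list(zip(rater_a, rater_b))
exact = sum(a == b for a, b in pairs) / len(pairs)
adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
print(f"Exact agreement: {exact:.0%}")      # 70%
print(f"Within one point: {adjacent:.0%}")  # 100%

High exact and adjacent agreement is the kind of evidence that rater training is meant to produce.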

Validity

How do we know whether a test measures the ability we are interested in? Even if a test is perfectly reliable and virtually error-free, how do we know if it is measuring the abilities we want it to and not something else? This is the central concern of validity, and ultimately involves the kinds of judgments that can be based on test scores.

Let's consider a math test consisting only of word problems. The test score could appropriately be used to indicate the student's ability to solve math problems that require reading; that would be a valid use of the test score. However, using the test score as a representation of the student's math ability in general would not be valid.

People who develop tests analyze them in several ways to determine the appropriate (i.e., valid) use of test scores. Let's review some of the issues considered in determining the valid use of test scores:

• Do the questions on the test represent the entire subject matter about which conclusions are to be drawn? For instance, if a test is designed to measure general arithmetic ability, there should be questions about addition, subtraction, multiplication, and division. If there are no questions about division, the test does not measure the entire content of arithmetic, so the test score cannot be said to reflect general arithmetic ability.

Interpreting Test Scores Rule 1:

Reliability of a given test is associated with the consistency and stability of test scores. Look for test reliability statistics to be higher than 0.80. When reliability is high, teachers and students can have greater confidence that students' scores on that test are an accurate reflection of how they would do if the test was administered over and over, or if different people scored the test.

• Is the student required to demonstrate the skill that the test is intended to measure? Tests should be directly targeted to the skills measured, and that skill should affect test performance. For example, a test designed to measure writing proficiency should ask test takers to write something, and better writers should be shown to receive higher scores.

• Are the test scores consistent with other indicators of the same knowledge and skills? Suppose a student takes a test designed to measure writing ability. If the student does well on writing assignments in class, then he or she should also do well on the writing test, so long as the type of writing on the test is consistent with that done in class. On the other hand, students who do not perform well on writing assignments in class should not do as well on the test. The validity of using that test score as an indication of the person's ability is questionable if there is inconsistency between the score and classroom performance.

Using test scores

By itself, a test score is just a number. Elsewhere in this issue, you'll discover how teachers are finding ways to apply elements of goal setting and assessment to classroom practice; our purpose here, however, is to provide readers with a basic understanding of standardized tests and scoring. When teachers, students, and others who use test scores are looking at a test score for a particular student, there are a few additional pieces of information they can use to make that number mean something. In the next few pages, some of these pieces of information are explained to help you understand what test scores do and do not mean.

Test score scales

A score scale is the range of possible scores on the test. Score scales come in all shapes and sizes. On the TABE, for example, different students might get scores as divergent as 212 and 657. In contrast, on the REEP, scores range only from 0 to 6. A student who takes the BEST, depending on his or her ability to comprehend and speak English, will score from 0 to 65 or higher. Is a 212 on the TABE a "better" score than a 5.4 on the REEP? Even though 212 is a bigger number, these two scores come from tests that are very different and are designed to test very different things. For this reason, comparing scores across different tests is generally not a good idea.

Because scores from the REEP, the TABE, and the BEST are all on different score scales, the number a person gets as a score on one of those tests has meaning for that test only. It might be confusing to have different score scales, but the people who develop tests do this on purpose to make sure that users do not interpret scores on a particular test according to some other standard or yardstick.

For example, in the United States the score scale of 0 to 100 is commonly used in many classrooms, but people who make standardized tests often avoid that score scale because many people would assume that such scores mean the same thing they do in the classroom. Sometimes test developers work really hard to create a unique score scale: e.g., on one test used in the United States for admission to medical school, scores are graded from J (the lowest score) to T (the highest score)!

Interpreting Test Scores Rule 2:

The questions on the test should be representative of the skill or knowledge being measured, and the test should give the student reasonable opportunity to demonstrate what he or she knows about the subject. For each individual student, the test score should provide a result that is similar to the results of other indicators of performance that measure the same thing.


Error in test scores

As we explained at the beginning of this article, some error is always a factor in test score interpretation. In fact, tests simply cannot provide information that is 100% accurate. This might sound surprising, but it is true for many reasons; for example:

• The extent to which a student has learned the breadth and depth of a subject will influence how she or he performs on a test. On a reading test, for example, a student might do well with questions about word meaning and finding the main idea of a passage but have had less practice distinguishing fact from opinion. The experience (or lack thereof) that a test-taker brings to the test represents a source of error in terms of using the test score to generalize about the student's reading ability.

• Sometimes a student taking a test is just plain unlucky. If a student is tired, hungry, nervous, or too warm, he or she might do worse on the test than if the circumstances were different.

• A test might have questions that seem tricky or confusing. If a student is not clear about the meaning of a question, he or she will have trouble finding the correct answer.

• As we mentioned earlier, mistakes may be made in scoring a test. When students are not given credit for correct answers or are given credit for incorrect answers, score accuracy suffers.


Standard error of measurement

The score a person gets on the test is meant to indicate how well that person knows the information being tested. One way of looking at a test score is to think of it as consisting of two parts. One part represents the real but unknowable true ability of a person. This part is unknowable because it is never possible to get inside someone's head and have a perfect measure of their ability in the area of interest. The other part of a test score represents the error, all the things that make the test a less-than-perfect snapshot of someone's knowledge at one moment in time. Unlike the way we can manufacture a yardstick that is exactly three feet long to measure length, even the best tests can provide scores that are only approximations of the true ability.

Unfortunately, it is impossible to break these two pieces of a test score (the true ability and error) apart. But it is important to understand that any test score contains a certain amount of error, and as we've illustrated, the error might be due to things that are going on with the test taker or things that involve how the test is created or scored. Errors in test scores cannot be completely eliminated, but fortunately there are techniques that can be used to provide some idea about how much the score is affected by error.

For example, testing specialists can calculate the standard error of measurement, which can be thought of as the range of scores obtained by the same person taking the same test at different times. The standard error of measurement is a "best guess" about how close the test is to measuring a person's knowledge or skill with 100% accuracy. The standard error of measurement is a statistical estimate of how far off the true score the test score is likely to be.

Interpreting Test Scores Rule 3:

Find out what the range of possible scores is for the test you are using. Knowing how high and how low scores can be for a particular test is important to understanding students' scores.

Let's take the TABE as an example. Suppose a student takes the TABE Reading Test, Level 7E and gets a score of 447. First of all, that score isn't very low or very high. The next piece of information that will be helpful in understanding this TABE score is the standard error of measurement. The statistics of test development have shown that the standard error of measurement associated with 447 is 17 points, which means that the student's true score is probably between 430 and 464. This score range was calculated by adding and subtracting 17, the standard error of measurement, from the score of 447.

The standard error of measurement gives us a good idea of score accuracy. In the last example the true score was described as probably falling within 17 points of the score the student got on the test; for a score of 630 on the same test, the standard error of measurement is a much bigger number: 64. In this case, the student's true score falls between 566 and 694. There is probably a very big difference in TABE reading knowledge between a score of 566 and a score of 694, so it would be harder to interpret a student's knowledge within such a large range. The size of the standard error of measurement is in large part dependent on the reliability of the test, which was explained previously.
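The arithmetic behind these score bands is simple enough to sketch. In classical test theory, the standard error of measurement can be computed from a test's standard deviation and reliability as SD x sqrt(1 - reliability), which is why a more reliable test has a smaller standard error. In the sketch below, the 447 and 17 come from the example above; the standard deviation and reliability values are invented for illustration and are not actual TABE statistics.

import math

def standard_error(sd, reliability):
    # Classical test theory: SEM = SD * sqrt(1 - reliability)
    return sd * math.sqrt(1 - reliability)

def score_band(observed, sem):
    # The true score probably falls within one SEM of the observed score
    return observed - sem, observed + sem

print(score_band(447, 17))                 # (430, 464), as in the article
print(round(standard_error(80, 0.90), 1))  # 25.3, from invented SD and reliability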


Conclusions

Concepts like reliability, validity, test score scales, and standard error of measurement give meaning to numbers that on their own might not mean much. Of course, the score that someone gets on a test is just one piece of information that tells what he or she knows and is able to do in one very specific and carefully defined subject area. While tests and test scores are important, and it is important to try your best on any test you take, it is also important to remember that any one test score is just that: one test score. The sidebar rules for interpreting test scores given in this article might help you use test scores in meaningful ways.

Are all tests as good as they should be? Do all tests provide useful information? Unfortunately the answer is "no," but researchers at UMass, working in collaboration with the Massachusetts Department of Education, Adult and Community Learning Services, are striving to create tests for ABE students that produce scores that are reliable and can help us make valid decisions about students and programs. Our efforts are focused on making sure the numbers that are test scores – whether from the REEP, the BEST, the TABE, or any new tests that will be developed – are as meaningful and dependable as possible.

Interpreting Test Scores Rule 4:

For standardized tests, look in the technical manual and find out the standard error of measurement. If it seems like a small number relative to the test score scale, you can be more confident in the accuracy of the test score than if it is a big number relative to the test score scale.


April L. Zenisky is Senior Research Fellow and Project Manager at the Center for Educational Assessment at UMass Amherst. Her research interests include computer-based testing methods and designs, applications of item response theory, and innovations in test item formats. Lisa A. Keller is Assistant Professor in the Research and Evaluation Methods Program and Assistant Director of the Center for Educational Assessment at UMass Amherst. Her research interests include Bayesian statistics, computerized adaptive testing, and item response theory. Stephen G. Sireci is Associate Professor in the Research and Evaluation Methods Program and Co-Director of the Center for Educational Assessment in the School of Education at UMass Amherst. He is known for his research in evaluating test fairness, particularly issues related to content validity, test bias, cross-lingual assessment, standard setting, and sensitivity review.

Luanne Teller and her staff at the Stoughton (Massachusetts) Adult Education Program have been digging deeper and deeper into attendance and other data and using them to strengthen their program.

Using Data for Program Improvement

BY LUANNE TELLER

Adult literacy practitioners now collect and report various levels of data to meet state and federal accountability requirements. There is no denying that these tasks are burdensome; however, many programs such as ours are moving beyond simple collection and reporting and using the data to strengthen our programs. In this article I hope to show that measuring and understanding student performance help programs, partnerships, and government to demonstrate and promote our true impact on our communities. Given the increased competition for funds, our field's best hope for maintaining our funding might be to provide hard evidence of our students' success.

Our program staff have come to believe that data analysis has the potential to:

• help us identify areas of strength and areas for improvement,

• provide information that can help to deliver services in the most effective and efficient manner,

• provide us and our community partners with information about the value of our programs,

• help us to make informed policy decisions,

• enable us to focus on results,

• help us operate in a way that attracts professional staff,

• enable us and our community partners to take pride in our accomplishments and to enhance our roles in the community,

• ensure that formally reported information is accurate, and

• help us to retain and increase funding.

We all face limited time and resources. At our program we have decided to incorporate data into our decision-making process to ensure that the strategies we adopt will be effective. At this point, I would like to illustrate how our staff used data analysis to support a program improvement goal. Here are the major phases of that effort, with special mention of where and how we used data to support it.

Our goal: To improve student attendance

Just over a year ago, I met with our staff, students, and members of our community partnership to understand the issues that affect student attendance. Because research shows a direct correlation between the hours of instruction and learning gains, we decided to increase the intensity of learning. As a new, small ESOL program, we offered only six hours of instruction a week. More than 80% of our students work full-time, so they are unable to commit to additional structured hours of classroom instruction. We asked ourselves what strategies we could devise to support learning in and out of the classroom.


We checked the attendance data

The attendance issue was initially raised by instructors who expressed concerns about the disruption to class continuity caused by students who arrived late, left early, or had sporadic attendance. Discussions with students echoed these concerns. A review of our SMARTT data revealed some troubling trends; for one thing, many students were not using the six hours of class time we were providing for them. Students also reported frustration with empty seats, especially knowing how long it took for friends and family to get off the waiting list and into the program.

Further data analysis revealed that attendance problems fell into two specific categories. First, we had students who attended every night, but consistently arrived 20-30 minutes late. While 30 minutes might not seem like much, at the end of the year it is the equivalent of missing almost six full weeks of classes. Second, we had students with excellent attendance who suddenly left for extended periods of time, often due to illness or the need to return to their native countries for family emergencies.
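The "six full weeks" figure is easy to verify with rough numbers. The six hours of weekly instruction is stated in this article; the meeting schedule and the length of the program year below are assumptions made for illustration, not details from the article.

# Rough check of the lost-time arithmetic (schedule values are assumed)
minutes_late = 30         # per class meeting
meetings_per_week = 2     # assumed: two three-hour evening classes
weeks_per_year = 36       # assumed length of the program year
class_hours_per_week = 6  # stated in the article

hours_lost = minutes_late / 60 * meetings_per_week * weeks_per_year
print(hours_lost)                         # 36.0 hours lost over the year
print(hours_lost / class_hours_per_week)  # 6.0, about six weeks' worth of class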

In our staff meetings, we decided to develop an attendance policy (like I said, we're a new program!). At this point, our community partners offered a wealth of experience and a range of policy options to consider. From all this information we developed a draft policy and distributed it to all the stakeholders for feedback, which resulted in our adding a provision for leaves of absence.

We then met with each class to review the new policy. Students in our program were already accustomed to signing a Student Learning Agreement, stating that they agree to follow the policies and procedures in the Student Handbook. When we added the new Attendance Policy to the handbook, we were careful to allow ample opportunity for student discussion and questions before asking them to sign the agreement.

We developed several other means of providing more "intensity" without adding class time:

• We implemented a sign-in/sign-out procedure for when students arrive late or leave early. (The number of students arriving late and leaving early immediately decreased.)

• We developed a lending library so students could take books home for additional practice.

• The Stoughton Public Library, a partner, began offering ESOL Book Discussion and Conversation Groups. They also offered to house the lending library during the summer when classes are not in session, so students could continue to have access to materials.

• The LVA-Stoughton began to provide tutors for some of our lower level students. (Some of our advanced students volunteer as tutors for the LVA as well, giving them additional English practice.)

We looked at the data again

Another analysis of our attendance data revealed a huge falloff in attendance during the December-January holidays. In the face of this reality, we revised our program schedule to include a longer holiday break in December, which also provides more time for our staff to plan classes and regroup.

We monitored our results over time

After we implemented all the attendance policy changes, we began to monitor the results on a monthly basis. We used the SMARTT attendance reports to identify students whose attendance was not satisfactory. To address that problem, we developed a Monthly Attendance Report for the counselor, who now meets with these students to provide strategies and support where possible. Most students who receive a verbal warning improve their attendance, which is easily tracked by comparing monthly attendance reports.
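A report like this one can be approximated in a few lines of code. SMARTT's actual export format is not described here, so the student records and the 70% threshold below are hypothetical.

# Hypothetical monthly report: flag students for counselor follow-up
attendance_rates = {"Student A": 0.92, "Student B": 0.64,
                    "Student C": 0.81, "Student D": 0.55}
THRESHOLD = 0.70  # invented cutoff for "unsatisfactory" attendance

for name, rate in sorted(attendance_rates.items()):
    if rate < THRESHOLD:
        print(f"{name}: attendance {rate:.0%} - schedule a counselor meeting")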

After the first year, the data revealed that our attendance rate had increased by 12%! We also began to wonder if, in fact, the increased attendance had generated a proportionate increase in learning gains. Thanks to SMARTT, COGNOS, and other data reports, we were able to demonstrate that there is indeed a direct correlation between increased hours and increased student gains. It's much more rewarding to know that relationship exists than to suspect that it exists.
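One simple way to check for this kind of relationship is to correlate each student's hours of attendance with his or her pre/post test gain. The data below are invented, and a Pearson correlation is just one reasonable choice; the article does not say which statistic was used.

from statistics import correlation  # Pearson's r; requires Python 3.10+

# Invented data: hours attended and pre/post scale-score gains for 8 students
hours = [40, 55, 62, 70, 85, 90, 100, 110]
gains = [2, 4, 3, 6, 7, 6, 9, 10]

print(round(correlation(hours, gains), 2))  # ~0.95: more hours, bigger gains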

We were also hit with a big surprise. While our attendance had increased, our retention had decreased. Although the numbers were not significant enough to cause panic, they revealed a need to address student retention. Once again, we gathered exit interview data from students to understand what prevented them from staying in the program. We looked at trends to understand under what conditions students tended to leave. As a result, this year we implemented a new intake and orientation procedure. While our year-to-date retention is higher, the true impact can only be measured at the end of the year when we look at the "big picture" relationship among attendance, retention, learning gains, and student progress towards goals.

We are also beginning to improve strategies for transitioning students out of the program. Sometimes, student retention is not a good thing! Some students never feel ready to leave, but now when we arrive at a point where we can no longer serve their needs, we have constructive ways to encourage them to take the next step.

We've gotten into the data habit

Data monitoring is now a regular part of our work life and informs virtually all of our program decisions. For example, at staff meetings we distribute class attendance rates to our instructors, with comparisons to the prior fiscal year and to state averages. When we discovered that one of our instructors consistently maintains attendance at around 90%, we started to look at what we all can learn from her!

Basically, we look at our data in two ways: we compare our averages to state averages, and we compare our own data across fiscal years. While state averages are interesting to see, we tend to be more focused on continuous improvement. Looking at data from year to year helps us better understand the impact of our current plan and discover new areas to consider for improvement.

When gathering data, we often find it helpful to substitute the word "information" for the word "data." The data available in SMARTT and COGNOS are invaluable but can be overwhelming; sometimes it is anecdotal information that informs our planning. A recent example of this is related to our process for intaking students from the waiting list.


I asked my staff why it was taking so long to fill slots in our beginner ESOL class, since we have over 150 students at this level on our waiting list. I learned that when these students were called, they typically hung up because they didn't understand us or thought we were telemarketers! To buy some time, we asked current students to translate for us while we looked for a more permanent solution to this long-term problem.

After several meetings, we decided to create a "We would like you to begin class" post card with our logo and phone number on it. During initial registration/assessment, students are asked to fill in their mailing address on the post card. We explain that when we have an opening, we will mail them the post card with the date for them to begin. When the post card arrives, the student immediately recognizes it and makes the connection with our program. This has resulted in a much more efficient, equitable way to enroll students from the waiting list. While anecdotal data drove this process, the measure of its success will come from hard data. We will document the length of time it takes to enroll new students and determine the number of students enrolled per contact this year compared with last year.


We've learned that to use data consistently and effectively we've had to "institutionalize" its use. To do that we've had to put the following steps into practice:

• Plan for data analysis. We've learned that good data analysis cannot just happen episodically. We've had to set up meeting schedules, choose participants and include them in the process, find meeting space, and prepare copies of data reports.

• Identify data leaders. For most of us, understanding data is an acquired skill. We've identified people in our program who are skilled at putting data into context and understanding what it is trying to tell us—and let them take the lead!

• Celebrate success. If the data points to success, don't forget to take pride in it!

Luanne Teller is the Director of the Stoughton Adult Basic Education Program, a collaborative partnership among Massasoit Community College, the Stoughton Public Schools, and the Town of Stoughton, funded by a grant through the Massachusetts Department of Education.

Stephen Sireci offers an upbeat overview of the projects he and his colleagues at UMass Amherst are working on with MassDOE, SABES, and adult basic education practitioners.

ACLS, SABES, and UMass: Perfect Together!

BY STEPHEN J. SIRECI

Eight years ago, I was the Senior Psychometrician for the GED Testing Service. My job was to ensure the technical quality of the Tests of General Educational Development (GED), which included making sure the score conversion tables were accurate, making sure the test items were clear and that they were testing what they were supposed to, and conducting research on the psychometric quality of the tests (i.e., studies of score reliability and validity; see Zenisky et al. in this issue). In working with educators throughout the United States to develop and review GED Test items, I fell in love with the adult education community. I became well aware of the dedication of ABE instructors and staff, as well as the amazing success stories of the millions of ABE students across the country.

When I left the GED Testing Service in Washington, DC to come to the University of Massachusetts at Boston in 1995, I wondered whether I would again have the opportunity to work with the ABE community. Thanks to Bob Bickerton and his staff at Adult and Community Learning Services (ACLS) of the Massachusetts Department of Education, I am proud to say that ABE is once again a major part of my life. And this time I have an army of psychometric professionals and students to help out, among the most talented testing professionals with whom I have ever worked. In my new role collaborating with ACLS, I have also discovered a new set of colleagues who go by the strange acronym SABES. These SABES folk are also a pleasure to work with, and are dedicated to improving education and assessment in ABE. In the remainder of this article, I will describe exciting events happening right now that stem from collaborations among UMass, ACLS, and SABES.

In January 2003, ACLS contracted with the UMass Center for Educational Assessment to help improve the assessment of ABE students in Massachusetts and to assist with their ongoing refinements of the processes used to evaluate and monitor all ABE programs in Massachusetts. Since that time we have written more than a dozen reports for ACLS, worked with teams of educators across the state to design new assessments in math and reading, developed and validated new prompts for the REEP writing assessment, and provided a comprehensive set of recommendations to ACLS for enhancing their program monitoring processes. Brief descriptions of three of our major activities follow.


Take out your Number 2 pencils: new ABE assessments are coming!

ACLS and SABES have worked hard over the past several years to come up with ways to meet the federal government's requirements for the demonstration of the effectiveness of ABE programs. Presently, the U.S. Department of Education requires ABE programs to use test scores as one means of demonstrating that students are learning. ACLS and SABES convened a group of Massachusetts ABE educators called the Performance Accountability Working Group (PAWG) to review currently available tests that were suitable for adult learners in Massachusetts. The final report produced by the PAWG is available at www.sabes.org/resources/pawgfinal.pdf. In that report, the PAWG concluded that currently available tests were insufficient for the various needs of ABE students and programs in Massachusetts. They recommended a set of tests, including the TABE, BEST, and REEP, to be used on an interim basis until ACLS could develop new assessments targeted to the recently developed Curriculum Frameworks.

The development of new assessments for ABE students in Massachusetts is one of the key activities we are working on. Our vision is to mobilize ABE instructors and staff across Massachusetts to help us develop these tests. Our initial test development efforts are in the areas of math and reading, and recently we worked with two groups of ABE educators to decide what these tests should look like. One group helped us develop specifications for the math tests; the other group helped us develop specifications for the reading tests. Our next steps are to hold several item-writing workshops for ABE instructors across the state and ask them to write items for us. Thus, ABE instructors in Massachusetts will be the ones who develop the forthcoming tests.

Will our collaborative efforts produce tests that ABE students love to take? Well, probably not love to take; however, we are confident that the tests we are developing will be similar to what students are learning in their classes and will be appropriate for measuring their knowledge and skills. We are also confident that these item-writing workshops will provide valuable professional development for ABE educators. We plan to hold 5-10 workshops over the next year. Check the SABES Web site at www.sabes.org periodically for announcements.

Making the REEP deep

Many students throughout Massachusetts strive to improve their writing. Many of these students write in languages other than English, but are taking classes to improve their writing in English. ACLS uses the REEP writing test, developed by the Arlington Education & Employment Program, to measure how much students' writing improves after receiving instruction in ABE classes.

A key feature of the REEP (or any writing test) is the prompt, which is the topic to which students are asked to respond. An example of a prompt is, "Write a letter to someone about your most recent vacation." The prompt on a writing test gives the students something to write about and allows a plan to be developed for scoring the essays written in response to that prompt. A year ago, there were only two prompts associated with the REEP. Thus, with respect to prompts, the REEP was not very "deep." Students who needed to take the REEP more than twice had to respond to the same prompts over and over again. ACLS asked us to develop new prompts for this test. Because these new prompts would also be used to measure students' improvement in writing, they needed to be equivalent to the two prompts that were currently in use. After all, if we developed a new prompt that was harder to respond to, students' newer essays might appear to be worse than their earlier essays.

I'm pleased to report that last spring we pilot-tested four prompts and one was selected for the pool of REEP prompts, expanding it by 50%. During the fall of 2003, we pilot-tested nine new prompts and four of them were approved for addition to the REEP prompt pool. In just one year, the number of REEP prompts expanded from two to seven. We were able to accomplish this goal by calling upon Massachusetts ABE teachers to send us ideas for prompts and administering experimental prompts to their students. ABE students also helped us by writing essays to the experimental prompts. Finally, we used SABES's network of certified REEP scorers to score the experimental essays. The new prompts were selected after a comprehensive set of statistical and qualitative analyses that led us to conclude they are comparable to the two original prompts with respect to difficulty and scorability. The technical details regarding the prompt tryout and selection procedures are available in two reports we prepared for ACLS.
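The comparability analyses themselves are documented only in those reports, but the basic idea of a difficulty check can be sketched: score the pilot essays written to an experimental prompt and compare the score distribution against an original prompt. Everything below, including the scores and the half-point tolerance, is invented for illustration and is not the procedure the reports describe.

from statistics import mean, stdev

# Invented pilot scores (0-6 REEP scale) for an original and a new prompt
pilot_scores = {
    "original prompt": [3, 4, 4, 5, 3, 4, 2, 5, 4, 3],
    "new prompt":      [3, 4, 5, 4, 3, 4, 3, 5, 4, 4],
}

for prompt, scores in pilot_scores.items():
    print(f"{prompt}: mean {mean(scores):.2f}, sd {stdev(scores):.2f}")

# Crude screen: flag the new prompt if its mean drifts more than half
# a point from the original (the tolerance is an invented example)
drift = abs(mean(pilot_scores["new prompt"]) - mean(pilot_scores["original prompt"]))
print("comparable" if drift <= 0.5 else "flag for review")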


Monitoring program monitoring

A third project we are working on with ACLS is improving the monitoring of ABE programs throughout the state. ACLS is required to monitor all ABE programs to see whether they are doing a good job in accomplishing their goals and to report program evaluation information back to the federal government as part of the National Reporting System. Over the past year, we followed ACLS staff on several occasions when they gathered information on program quality. We also conducted a survey of ACLS staff members who perform program monitoring and surveyed programs that had recently been monitored. Finally, we took a close look at the instrument used to record program-monitoring data. Using the information we gathered from our observations of program monitoring and the survey data, we made several suggestions for revising the Program Monitoring Instrument. Presently, we are working with ACLS on revising the instrument to make it more efficient.

Introducing UMass

The above descriptions are just brief glimpses of the activities we are working on with ACLS and SABES. At the beginning of this article, I wrote a lot about myself. Before closing, I would like to write a few words about my terrific colleagues at UMass who are also working to improve assessment and evaluation in ABE programs. There are two senior staff members associated with this project: April Zenisky and Mercedes Valle. Both April and Mercedes are experienced in test development and statistics and are working tirelessly on the project. There are also several graduate students who are working on the project, including Peter Baldwin, Rob Keller, Drey Martone, and Shuhong Li. In addition, Professors Ronald Hambleton, Lisa Keller, and James Royer are contributing to the project. So, when I mentioned an army of psychometric professionals and students, I was not that far off. We all hope to meet and interact with many of you over the coming months. If you would like to learn more about us, please visit our web site at www.umass.edu/remp.


Stephen G. Sireci is Associate Professor in the Research and Evaluation Methods Program and Co-Director of the Center for Educational Assessment in the School of Education at the University of Massachusetts Amherst. Before UMass, he was Senior Psychometrician at the GED Testing Service, Psychometrician for the Uniform CPA Exam, and Research Supervisor of Testing for the Newark, NJ Board of Education. He is known for his research in evaluating test fairness, particularly issues related to content validity, test bias, cross-lingual assessment, standard setting, and sensitivity review.


THE ADVENTURE CONTINUES...
