Standard Setting. What is Standard Setting? It’s a judgmental process, in which qualified experts...


Standard Setting

What is Standard Setting?

• It’s a judgmental process, in which qualified experts (usually mostly or all teachers) determine “How much is enough”

• It uses an established set of activities designed to lead the panelists through the process in a consistent and systematic way

What is Standard Setting?

• So what does this mean exactly?

• Let’s look at an example

• But first…

A Disclaimer

• Note that the hypothetical (i.e., fake) math test presented on the following slides has clearly never been seen by Measured Progress’s excellent content experts, editors, marketing folks, high-up decision-makers, etc., none of whom would ever let me get away with this for a variety of excellent reasons.

• However, for purposes of illustration…

For Example

• Let’s say we have a test, the Measured Progress Math Test, consisting of:

– 40 multiple-choice (MC) items

– 6 one-point short-answer (SA) items, and

– 1 four-point constructed-response (CR) item

for a total of 50 points.

Math Test, cont.

• We need to use scores on the Math Test to meet federal accountability requirements as defined by NCLB

• There are lots of different types of test “scores”: raw scores (i.e., number right), scaled scores, θ scores (which you’ll learn about next week when Mike talks about Item Response Theory), etc.

• NCLB requires that we report the percentage of students who “meet standards”

Math Test, cont.

• To do this, we need to establish cut points to define our four performance levels:

– Advanced (A)

Cut point 3

– Proficient (P)

Cut point 2

– Below Proficient (BP)

Cut point 1

– Failing (F)

Math Test, cont.

• To help us with this task, we have general performance level descriptors (PLDs) that tell us what it means to be in each of the four performance levels (PLs) (with apologies to one of our major contracts):

– A: Students demonstrate in-depth understanding and can solve complex problems

– P: Students demonstrate solid understanding and can solve routine problems

– BP: Students demonstrate partial understanding and can solve some simple problems

– F: Students demonstrate minimal understanding and cannot solve problems

Standard Setting for the Math Test

• Goal is to “operationalize” these general performance level descriptors:

– What does “in-depth understanding” or “solid understanding” or “partial understanding” mean?

– What distinguishes “complex,” “routine,” and “simple” problems?

– What specific skills correspond to each of these general performance level descriptors?

– And, finally, what does this translate into in terms of performance (i.e., scores) on the Math Test?

• All of these questions illustrate why standard setting is a judgmental process and why we need content experts (teachers) to set standards

What is Standard Setting?

• So, standard setting is simply an established process that takes panelists through a series of systematic steps:

– Becoming very familiar with the test and what it measures

– Defining in specific terms what it means to be “Advanced,” “Proficient” or “Below Proficient” and coming to consensus about those definitions as a group

– “Operationalizing” those definitions by determining the test scores that indicate a student has demonstrated the necessary knowledge, skills and abilities (KSAs) to be classified as (for example) Proficient

What is Standard Setting?

• There are a variety of methods that can be used to accomplish these goals, and a myriad of variations on these methods

• So how do we decide which method to use?

Selecting a standard-setting method

• As mentioned on the previous slide, there are lots and lots of standard-setting methods out there:

– Bookmark (and Modified Bookmark)

– Angoff (and Modified Angoff)

– Body of Work

– Analytic Judgment

– Dominant Profile

– Contrasting Groups

– and so on…

Selecting a method, cont.

• Choosing among these options is a matter of one or more of the following:

– Previous history

– Policy directive or recommendation

– Appropriateness for item types, for example:

  • Bookmark works well for tests that contain mostly MC items

  • Body of Work works well for tests that include a lot of CR items

– etc.

• No method is right or wrong; task is to select the most appropriate for a given situation

• Different methods yield different results.

Selecting a method, cont.

• For our test, the Math Test, we’re going to use the Bookmark Method, because the test consists primarily of MC items, but also includes a few SA items and 1 CR item.

• However, we also need to set standards for the Math Test-Alt. The Math Test-Alt consists entirely of polytomous items, so we’re going to use the Body of Work Method

Bookmark vs. Body of Work

• Bookmark:

– Review: Best for tests with mostly __ items

– A test-centered method, i.e., rating decisions are based primarily on the test items rather than samples of student work

– Uses an Ordered Item Booklet (OIB) and an Item Map

– Panelists make their ratings by placing bookmarks in the OIB

Bookmark vs. Body of Work, cont.

• Ordered Item Booklet:

– Each page in the booklet is a single item (or a single score point for polytomous items)

– Items are presented in order from the easiest item on the test to the hardest

– Total number of ordered items = total possible raw score on the test

– Order is based on actual student performance (determined using IRT – come back next week)
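As a rough illustration of how an OIB could be assembled for the hypothetical Math Test, the sketch below gives every MC and SA item one page and the four-point CR item four pages (one per score point), then sorts the pages from easiest to hardest. The difficulty values, item IDs, and the names `OrderedPage` and `build_oib` are invented for illustration; real values would come from the IRT calibration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderedPage:
    item_id: str
    score_point: int   # 1 for MC/SA items; 1..4 for the CR item
    difficulty: float  # hypothetical IRT-based value, not real data


def build_oib(items):
    """items: list of (item_id, max_points, [difficulty per score point])."""
    pages = []
    for item_id, max_points, difficulties in items:
        if len(difficulties) != max_points:
            raise ValueError(f"{item_id}: need one difficulty per score point")
        for point, diff in enumerate(difficulties, start=1):
            pages.append(OrderedPage(item_id, point, diff))
    # Easiest page first, hardest page last.
    return sorted(pages, key=lambda page: page.difficulty)


# Hypothetical Math Test: 40 MC items, 6 one-point SA items, 1 four-point CR item.
items = [(f"MC{i}", 1, [-2.0 + 0.1 * i]) for i in range(1, 41)]
items += [(f"SA{i}", 1, [0.05 + 0.3 * i]) for i in range(1, 7)]
items += [("CR1", 4, [-1.95, 0.55, 1.45, 2.6])]

oib = build_oib(items)
total_points = sum(max_points for _, max_points, _ in items)
print(len(oib), total_points)  # 50 50
```

The page count matches the total possible raw score because each score point contributes exactly one page.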

Ordered Item #1 (easiest item on the test)

2 + 2 =

a) 4*

b) 5

c) 2.2

d) 22

Ordered Item #2* (second easiest item on the test)

a) Won needs 54 tiles for a new kitchen floor. The tiles come in boxes of 8 tiles each. How many boxes of tiles will Won need to cover the floor?

b) One box of tiles costs $21.87. Estimate the amount Won will have to spend.

Explain how you got your answer.

Score point 1: Student response contains minimal evidence of understanding of number sense and operations

*With apologies to one of our major contracts

Ordered Item #2

• Example of a 1-point response:

Ordered Item #XX (XXth easiest item on the test)

a) Won needs 54 tiles for a new kitchen floor. The tiles come in boxes of 8 tiles each. How many boxes of tiles will Won need to cover the floor?

b) One box of tiles costs $21.87. Estimate the amount Won will have to spend.

Explain how you got your answer.

Score point 2: Student response contains fair evidence of understanding of number sense and operations

Ordered Item #XX

• Example of a 2-point response:

Sample Item Map

OI # | What KSAs does the student need to answer this question? | Why is this question more difficult than the previous one?

1 | |

2 | |

3 | |

… | |

50 | |

Bookmark vs. Body of Work, cont.

• Body of Work (BOW):

– Review: Best for tests with primarily __ items

– A student-centered method, i.e., rating decisions are based on intact samples of actual student work

– Sets of student work are presented in order from the lowest-scoring BOW to the highest-scoring

– Panelists make their ratings by sorting the BOWs into four piles (corresponding to the PLs)

Standard-Setting Process

Prior to the Meeting

Creating Performance Level Descriptors

• Sometimes (if not often), the PLDs are not much more specific than the ones presented earlier for the Math Test.

• The less specificity these have going into standard setting, the more work panelists must do to establish an understanding of the definitions for each PL

Selecting Panelists

• Usually aim for 10 to 20 panelists per group, however…

– Panelists are expensive

– Panelists are sometimes hard to come by

As a result, group sizes are often closer to 10 than 20, and are sometimes even smaller

• Panels usually consist mostly of teachers, but can also include administrators, parents, business or community leaders, legislators, etc.

• Panelists should be chosen to be representative of all important stakeholder groups in terms of: ethnicity, gender, geographic location, rural vs. urban area, etc.

Training Facilitators

• Standards for each grade/content combination are set by a separate panel, so it’s important that the group facilitators are following the process consistently

• Prior to the meeting, a Facilitator’s Script is prepared and a training meeting is held to make sure facilitators have a common understanding of the process

Standard-Setting Process

During the Meeting

Orientation/Training

• The standard-setting meeting (regardless of the method being used) starts with an orientation session that is attended by all panelists

• The session includes background information about the assessment as well as an overview of standard setting and the process they will be going through

• After the opening session, panelists break up into their grade/content area groups; each group is in a separate room

Taking the Test/Reviewing Test Materials

• Once in their grade/content area groups, the panelists for the Math Test will start by taking the test

• For the Math Test-Alt, there isn’t really a test in the same sense, so the panelists will review the test materials

• This step ensures that panelists are very familiar with the test content and what students who take the test experience

Review PLDs

• Here is where the panelists determine what it means to be “Below Proficient,” “Proficient,” or “Advanced”

• They review the PLDs that are provided to them, and they discuss the specific KSAs students must demonstrate in order to fall into each category

• Often, they will create bulleted lists for each level that are then posted on chart paper for them to refer to as they do the rating process

• It is critical that panelists come to consensus about these definitions

Completing the Item Map (Bookmark)

• For the Math Test, the panelists will then review the ordered item booklet and fill in the item map

• On the item map, for each ordered item, they will write the KSAs required to successfully complete that item and why it is more difficult than the one before (remember, the items are presented in order by difficulty)

• This will help them tie the items back to the PLDs they worked on in the previous step, which in turn will help when it comes time to place their bookmarks

Rating Process - Bookmark

• For the Math Test, panelists start with the lowest cut and, working their way through the OIB, ask themselves “Would a student who’s just barely over the line into ‘Below Proficient’ have at least a 2/3 chance of getting this item right?”

• For OI #1, the answer will probably be yes (although not necessarily); as the items get harder, at some point, the answer will change to no. This is where the panelist places his/her bookmark.

Wait a minute – not so fast!

• What do you mean at some point the answer will change to no? Is it really that clear-cut?

– No, of course not. There will be gray areas.

• And what’s this whole “2/3 chance” business?

– I’m glad you asked that question. It reflects the probabilistic nature of the IRT model that’s used to estimate the difficulty of the items and order them in the OIB.
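To make the “2/3 chance” idea concrete, here is a minimal sketch that assumes a two-parameter logistic (2PL) IRT model with no scaling constant; the slides do not say which model is actually used, and the parameter values and function names are hypothetical. Under that assumption, the ability level at which a student has a 2/3 chance of success on an item sits ln(2)/a above the item’s difficulty b.

```python
import math


def prob_correct(theta, a, b):
    """2PL item response function (assumed model, no 1.7 scaling constant)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def rp_location(a, b, rp=2 / 3):
    """Ability at which the probability of a correct response equals rp."""
    return b + math.log(rp / (1.0 - rp)) / a


# Hypothetical item with discrimination a = 1.0 and difficulty b = 0.0.
theta_67 = rp_location(a=1.0, b=0.0)
print(round(theta_67, 3))                           # 0.693
print(round(prob_correct(theta_67, 1.0, 0.0), 3))   # 0.667
```

With a 50% criterion the location would equal b itself; the 2/3 criterion shifts every item’s location upward by the same rule.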

Rating Process – Bookmark, cont.

• Once the panelists have placed the bookmark for the first cut (F vs. BP), they will repeat the process for the middle cut (BP vs. P) and, finally, the top cut (P vs. A)

Rating Process: Body of Work

• For the Math Test-Alt, the panelists start with the first BOW in the pile (the lowest-scoring BOW), compare the KSAs the student has demonstrated to the PLDs, and decide which PL best matches that student’s performance

• For the first BOW, the answer will probably be F (but not necessarily).

• They will work their way through the entire set of BOWs and classify each one into one of the four piles

Rating Process: Both Methods

• For both methods, ratings are done in three rounds (although that isn’t the case for all standard settings…)

Round 1

• Work is done individually, without any consultation with other panelists

• Once the panelists have completed their Round 1 ratings, they fill in the Round 1 rating form

• R&A analyzes the results and calculates the group average cut points
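For the Bookmark panel, the averaging step might look something like the sketch below. It assumes that each panelist’s bookmark is converted to the IRT-based difficulty value of the bookmarked page and that those values are combined with a plain mean; the exact convention used in practice may differ. The difficulty values, bookmark numbers, and the name `group_average_cut` are hypothetical.

```python
from statistics import mean


def group_average_cut(bookmarks, page_difficulty):
    """Average the difficulty values at the panelists' bookmarks.

    bookmarks: one ordered-item number per panelist (1-based).
    page_difficulty: difficulty value of each OIB page, easiest first.
    """
    values = [page_difficulty[b - 1] for b in bookmarks]
    return mean(values)


# Hypothetical 50-page OIB with evenly spaced difficulty values.
page_difficulty = [-2.0 + 0.08 * i for i in range(50)]

# Hypothetical Round 1 bookmarks for the lowest cut from five panelists.
round1_bookmarks = [12, 14, 15, 11, 13]

cut_1 = group_average_cut(round1_bookmarks, page_difficulty)
print(round(cut_1, 2))  # -1.04
```

Averaging on the difficulty scale rather than on the page numbers means that two bookmarks a page apart count as close together only if the two pages are similar in difficulty.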

Round 2

• Panelists discuss the Round 1 results, including the average cut points, and share their rationale for how they did their ratings

• Once the Round 2 discussions are complete, panelists fill in the Round 2 rating form

• R&A again analyzes the results and calculates the group average cut points and impact data: the percentage of students who would fall into each of the PLs based on the Round 2 average cuts
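Impact data are simple to compute once the cuts are known. The sketch below assumes the cuts are expressed as the minimum raw score needed to reach BP, P, and A; the student scores, the cut values, and the name `impact_data` are made up for illustration.

```python
from bisect import bisect_right
from collections import Counter

LEVELS = ["F", "BP", "P", "A"]


def impact_data(raw_scores, cuts):
    """Percentage of students in each PL.

    cuts: minimum raw score needed for BP, P and A, in ascending order.
    """
    # bisect_right counts how many cuts a score has reached or passed.
    counts = Counter(LEVELS[bisect_right(cuts, score)] for score in raw_scores)
    n = len(raw_scores)
    return {level: 100.0 * counts[level] / n for level in LEVELS}


# Hypothetical raw scores from the last administration of the Math Test.
scores = [8, 15, 22, 22, 30, 31, 37, 41, 44, 48]
print(impact_data(scores, cuts=[20, 32, 43]))
# {'F': 20.0, 'BP': 40.0, 'P': 20.0, 'A': 20.0}
```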

Round 3

• Round 3 is very similar to Round 2, except that now the panelists have the impact data to consider as part of their discussions

– Panelists are cautioned against basing their decisions solely on the impact data

• Once the Round 3 discussions are complete, panelists fill in the Round 3 rating form

Math Test Round 1 Rating Form

ID _____________

PL | Ordered Item Numbers: First | Last

F | 1 | ___

BP | ___ | ___

P | ___ | ___

A | ___ | 50
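The blanks on the form follow directly from the three bookmarks. The sketch below assumes each bookmark is recorded as the first ordered item belonging to the next-higher level, so the level below ends one page earlier; that convention, the example bookmarks, and the name `rating_form_ranges` are assumptions for illustration only.

```python
def rating_form_ranges(bookmarks, n_items=50):
    """Turn three bookmarks into (first, last) ordered-item ranges per PL.

    bookmarks: ordered-item numbers for the F/BP, BP/P and P/A cuts, each
    assumed to be the first page of the next-higher level.
    """
    b1, b2, b3 = bookmarks
    if not 1 < b1 < b2 < b3 <= n_items:
        raise ValueError("bookmarks must be increasing and within the OIB")
    firsts = [1, b1, b2, b3]
    lasts = [b1 - 1, b2 - 1, b3 - 1, n_items]
    return dict(zip(["F", "BP", "P", "A"], zip(firsts, lasts)))


form = rating_form_ranges((14, 27, 41))
for level, (first, last) in form.items():
    print(level, first, last)
# F 1 13
# BP 14 26
# P 27 40
# A 41 50
```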

Math Test-Alt Round 1 Rating Form

BOW | F | BP | P | A

1 | | | |

2 | | | |

3 | | | |

4 | | | |

etc.

Some Technical Arcana

• Group Average Cutpoints for the Bookmark Method are determined using the IRT-based difficulty values. (Remember those? That’s what we used to order the items.)

• Group Average Cutpoints for the BOW Method are calculated using Logistic Regression

• Impact data are based on actual student performance on the test the last time it was administered
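The slide names logistic regression for the BOW Method without giving details, so the following is only one plausible sketch. It assumes a separate regression for each cut, with the BOW’s total score as the predictor and “rated at or above the level” as the outcome, and it takes the cut to be the score at which the fitted probability is 0.5. The ratings, scores, and function names are hypothetical.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def bow_cut_score(scores, at_or_above):
    """Score at which the fitted probability of 'at or above' is 0.5."""
    center = sum(scores) / len(scores)
    centered = [s - center for s in scores]  # centering aids stability
    b0, b1 = fit_logistic(centered, at_or_above)
    return center - b0 / b1


# Hypothetical BOW total scores and pooled ratings for the BP/P cut:
# 1 = rated Proficient or above, 0 = rated below Proficient.
bow_scores = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32]
rated_p_or_above = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

cut = bow_cut_score(bow_scores, rated_p_or_above)
print(round(cut, 2))  # about 21, since the made-up ratings are symmetric around 21
```

The ratings deliberately overlap (a score of 18 rated Proficient, a score of 24 rated below), which mirrors the gray areas panelists encounter and keeps the regression well defined.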

Evaluation

• After completing Round 3 of the ratings, panelists are asked to complete an evaluation of the standard-setting process.

Standard-Setting Process

After the Meeting

Calculating Final Cut Points

• R&A calculates the Round 3 Cut Points and presents the results to the client for approval

• Sometimes, R&A may recommend adjustments to the cut points

– This happens most commonly when standards are set in multiple grades and it is undesirable to have standards that vary substantially across grade levels

– In this case, we might smooth the results

• Panelists are told at the beginning of the process that their cuts will be recommendations, and the final results may differ somewhat

Analyzing the Results of the Evaluation

• The results of the panelists’ evaluation forms are compiled and reviewed

• On rare occasions, this review may identify an individual panelist whose ratings should be excluded from the results

Standard-Setting Report

• The final step in the standard-setting process is to write up the process used and the results in a standard-setting report