Item-writing Orientation & Review. Quality Test Item-writing Evaluation Measurement Testing.
Uploaded by agnes-pitts
Item-writing Orientation & Review
Quality Test Item-writing
Evaluation
Measurement
Testing
Goal of quality item-writing in a nutshell:
Examinees should get an item
• Right, because they know the correct answer
• Wrong, because they don’t know the correct answer
1st Questions to Ask Yourself
Why am I testing?
How am I testing?
What results am I getting (or hoping to get)?
How am I going to use the results?
o What kind of interpretations do you want to make with the scores?
Bloom’s Taxonomy for the Cognitive Domain
[Figure: cognitive level addressed by essays versus “objective” formats]
Multiple Choice Items
General Format of Multiple Choice Items
Stem: a question or incomplete sentence
Options:
A. Distracter
B. Distracter
C. Distracter
D. Correct or best answer (the “keyed” response)
E. Distracter
Item Technical Flaws
2 classes of flaws:
• Issues Related to Irrelevant Difficulty
• Issues Related to Testwiseness
Irrelevant Difficulty
Flaws related to irrelevant difficulty make the question difficult for reasons unrelated to the trait that is the focus of assessment.
Irrelevant Difficulty
Grammatical Inconsistencies:
one or more of the distracters fail to follow grammatically from the stem
Grammatical Inconsistencies
A 60-year-old alcoholic in status epilepticus is brought to the emergency department by the police. After ascertaining that the airway is open, the first step in management should be administration of:
A. examination of cerebrospinal fluid
B. glucose with vitamin B1 (thiamine)
C. CT scan of the head
D. phenytoin
E. diazepam
A testwise examinee would throw out A and C, since neither follows grammatically from “administration of”.
Irrelevant Difficulty
Options are long, complicated, or multifaceted
Peer review committees in HMOs may move to take action against a physician’s credentials to care for participants of the HMOs. There is an associated requirement to assure that the physician receives due process in the course of these activities. Due process must include which of the following?
A. Proper notice, a tribunal empowered to make the decision, a chance to confront witnesses against him/her, and a chance to present evidence in defense.
B. Notice, an impartial forum, counsel, a chance to hear and confront evidence against him/her.
C. Reasonable and timely notice, impartial panel empowered to make a decision, a chance to hear evidence against him/herself and to confront witnesses, and the ability to present evidence in defense.
This is actually a 5-option item but too long to get on one slide!
Irrelevant Difficulty
Numeric data are not stated consistently
Following a second episode of salpingitis, what is the likelihood that a woman is infertile?
A. 0 - 20%
B. 20 to 30%
C. Greater than 50%
D. 90%
E. 75%
Note: options D (90%) and E (75%) both fall within option C (greater than 50%).
Irrelevant Difficulty
Frequency terms in the options are vague (e.g., often, rarely, usually)
Severe obesity in early adolescence:
A. usually responds dramatically to dietary regimens
B. often is related to endocrine disorders
C. has a 75% chance of clearing spontaneously
D. shows a poor prognosis
E. usually responds to pharmacotherapy and intensive psychotherapy
Irrelevant Difficulty
“None of the above” or “all of the above” is used as an option
The diagnosis of a large ovarian cyst is most strongly suggested by an:
A. anterior dullness, lateral tympany
B. decreased peristalsis
C. fluid wave
D. shifting dullness
E. none of the above
Essentially turns this into a multiple true/false item
Irrelevant Difficulty
Stems are tricky or unnecessarily complicated
Arrange the parents of the following children with Down’s syndrome in order of highest to lowest risk of recurrence. Assume that the maternal age in all cases is within 5 years. The karyotypes of the daughters are:
I: 46, XX, -14, +T (14q21q) pat
II: 46, XX, -14, +T (14q21q) de novo
III: 46, XX, -14, +T (14q21q) mat
IV: 46, XX, -21, +T (14q21q) pat
V: 47, XX, -21, +T (21q21q) (parents not karyotyped)
A. III, IV, I, V, II
B. IV, III, V, I, II
C. III, I, IV, V, II
D. IV, III, I, V, II
E. III, IV, I, II, V
Testwiseness
The probability of answering a question correctly should relate to the examinee’s expertise on the topic being assessed, not to their expertise in test-taking strategies.
Flaws related to testwiseness make it easier for some students to answer the question correctly, based on their test-taking skills alone.
These flaws commonly occur in items that are unfocused and do not satisfy the “cover-the-options” rule.
Testwise examinees work to eliminate item options in order to increase their odds of guessing the correct response.
Testwise students are aware that…
The Correct Answer is often:
• Longer than the incorrect options
• More qualified or more general
• Written using familiar phraseology
• More grammatically correct for item stem
• 1 of the 2 similar statements
• 1 of the 2 opposite statements
Remember to use their testwiseness against them! Use their awareness of these tendencies for the WRONG answers.
Testwise students are aware that…
A Wrong Answer often:
• is the first or last option
• contains extreme words (always, never, nonsense, etc.)
• contains unexpected language or technical terms
• contains flippant remarks or completely unreasonable statements
Remember to use their testwiseness against them! Use their awareness of these tendencies for the RIGHT answers.
Testwiseness
Logical Cues:
a subset of the options are collectively exhaustive
Logic Cues
Crime is:
A. equally distributed among the social classes
B. overrepresented among the poor
C. overrepresented among the middle class and the rich
D. primarily an indication of psychosexual maladjustment
E. reaching a plateau of tolerability for the nation
A, B, and C are collectively exhaustive, so D and E can be thrown out.
A is unlikely because few social measures are distributed equally across all social classes.
Testwiseness
Absolute Terms:
terms such as “always” or “never” are used in the options
Absolute Terms
In patients with advanced dementia, Alzheimer’s type, the memory defect
A. can be treated adequately with phosphatidylcholine (lecithin)
B. could be a sequela of early parkinsonism
C. is never seen in patients with neurofibrillary tangles at autopsy
D. is never severe
E. possibly involves the cholinergic system
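A quick screen for this flaw can be automated. The sketch below is only an illustration: the function name `flag_terms` and the two word lists are assumptions built from the terms named on these slides (always, never, all; often, rarely, usually, frequently), not part of any standard tool.

```python
import re

# Hypothetical word lists, taken from the terms named on these slides.
ABSOLUTE_TERMS = {"always", "never", "all"}
VAGUE_TERMS = {"often", "rarely", "usually", "frequently"}

def flag_terms(options):
    """Return {option label: sorted flagged words} for options that
    contain an absolute or vague frequency term."""
    flagged = {}
    for label, text in options.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        hits = sorted(words & (ABSOLUTE_TERMS | VAGUE_TERMS))
        if hits:
            flagged[label] = hits
    return flagged

# Options from the dementia item above.
dementia_options = {
    "A": "can be treated adequately with phosphatidylcholine (lecithin)",
    "B": "could be a sequela of early parkinsonism",
    "C": "is never seen in patients with neurofibrillary tangles at autopsy",
    "D": "is never severe",
    "E": "possibly involves the cholinergic system",
}
print(flag_terms(dementia_options))
```

Options C and D are flagged for “never”, which is what a testwise examinee would notice. A word-list screen like this only catches the listed terms, so it supplements rather than replaces human review.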
Testwiseness
Long Correct Answer:
correct answer is longer, more specific, or more complete than other options
Long Correct Answer
Secondary gain is:
A. synonymous with malingering
B. a problem in obsessive-compulsive disorder
C. a complication of a variety of illnesses and tends to prolong many of them
D. never seen in organic brain damage
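Option length is easy to check mechanically. The sketch below is a minimal illustration, assuming C is the keyed response (the slide title implies this but does not state it); the function name `key_length_cue` is hypothetical.

```python
def key_length_cue(options, key):
    """Compare the keyed option's length (in characters) with the distractors."""
    key_len = len(options[key])
    distractor_lens = [len(text) for label, text in options.items() if label != key]
    mean_distractor = sum(distractor_lens) / len(distractor_lens)
    return {
        "key_is_longest": key_len > max(distractor_lens),
        "ratio_to_mean_distractor": key_len / mean_distractor,
    }

# Options from the secondary gain item above; the key "C" is an assumption.
secondary_gain_options = {
    "A": "synonymous with malingering",
    "B": "a problem in obsessive-compulsive disorder",
    "C": "a complication of a variety of illnesses and tends to prolong many of them",
    "D": "never seen in organic brain damage",
}
cue = key_length_cue(secondary_gain_options, key="C")
print(cue)
```

Here the key is the longest option and roughly twice the mean distractor length. Shortening the key or lengthening the distractors would remove the cue.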
Testwiseness
Word Repeats:
a word or phrase is included in the stem and in the correct answer
Word Repeats
A 58-year-old man with a history of heavy alcohol use and previous psychiatric hospitalization is confused and agitated. He speaks of experiencing the world as unreal. This symptom is called:
A. derealization
B. depersonalization
C. derailment
D. focal memory deficit
E. signal anxiety
Testwiseness
Convergence Strategy:
the correct answer includes the most elements in common with the other options
Convergence Strategy
Local anesthetics are most effective in the:
A. anionic form, acting from inside the nerve membrane
B. cationic form, acting from inside the nerve membrane
C. cationic form, acting from outside the nerve membrane
D. uncharged form, acting from inside the nerve membrane
E. uncharged form, acting from outside the nerve membrane
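“Anionic” appears only once and “inside” appears more often than “outside”, which leaves B and D. Since 3 of the 5 options involve a charge, testwise examinees will pick “B”.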
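The counting a testwise examinee does for this item can be sketched in a few lines. This is an illustration only: splitting each option into a (form, site) pair and the function name `convergence_scores` are assumptions made for the example.

```python
from collections import Counter

def convergence_scores(options):
    """Score each option by how often its elements appear across all options."""
    counts = Counter(element for elements in options.values() for element in elements)
    return {label: sum(counts[e] for e in elements) for label, elements in options.items()}

# Options from the local anesthetics item, split into (form, site of action).
anesthetic_options = {
    "A": ("anionic", "inside"),
    "B": ("cationic", "inside"),
    "C": ("cationic", "outside"),
    "D": ("uncharged", "inside"),
    "E": ("uncharged", "outside"),
}
scores = convergence_scores(anesthetic_options)
best = max(scores.values())
front_runners = sorted(label for label, score in scores.items() if score == best)
print(scores)         # per-option overlap score
print(front_runners)  # options sharing the most elements with the others
```

Simple counting narrows the field to B and D. The last step on the slide, noting that most options involve a charged form, is what tips the testwise examinee to B. Balancing how often each element appears across the options removes the cue.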
General Guidelines for Multiple Choice Item Construction
Make sure the item can be answered without looking at the options.
Include as much of the item as possible in the stem - the stems should be long and the options short.
Avoid superfluous information.
General Guidelines for Multiple Choice Item Construction
Avoid “tricky” and overly complex items.
Write options that are grammatically consistent and logically compatible with the stem; list them in logical or alphabetical order.
Write distractors that are plausible and the same relative length as the answer.
General Guidelines for Multiple Choice Item Construction
Avoid using absolutes such as always, never, and all in the options; also avoid using vague terms such as usually and frequently.
Avoid negatively phrased items (those with except or not in the lead-in). If you must use a negative stem, use only short (preferably single-word) options.
Focus on important concepts; don’t waste time testing trivial facts.
Evaluating Item characteristics
Index of Difficulty
Index of Discrimination
Index of Difficulty
The proportion of the group of examinees who answered the item correctly (the p-value). The larger the value, the easier the item.
• Usually expressed in decimal form (range of 0 to 1).
• Is not determined solely by the content of the item, but also reflects the ability of the group responding to that item.
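As a minimal sketch of the definition above, the difficulty index is the mean of the 0/1 scores on an item. The function name and the response data are hypothetical, made up for illustration.

```python
def difficulty_index(item_scores):
    """Proportion of examinees answering the item correctly (the p-value).

    item_scores: list of 1 (correct) / 0 (incorrect), one entry per examinee.
    """
    if not item_scores:
        raise ValueError("need at least one examinee")
    return sum(item_scores) / len(item_scores)

# Hypothetical data: 10 examinees, 7 answered correctly.
example_scores = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
print(difficulty_index(example_scores))
```

The same item given to a stronger or weaker group would produce a different value, which is the point of the second bullet above.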
Index of Discrimination
The correlation between the scores on a particular item and the total score on the exam. If a large proportion of the high scoring examinees get an item correct, and a small proportion of the low scoring examinees get it right, that item has discriminated properly and has contributed to the test purpose.
• Usually expressed as a correlation coefficient (range -1.0 to +1.0)
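One way to compute this index is a plain Pearson correlation between the 0/1 item scores and the total scores, as sketched below. The function names and the response matrix are hypothetical. Note that some analyses exclude the item itself from the total before correlating; this sketch follows the definition as worded above and does not.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return cov / sqrt(var_x * var_y)

def discrimination_index(responses, item):
    """Correlate scores on one item with total exam scores.

    responses: one row per examinee, one 0/1 entry per item.
    """
    item_scores = [row[item] for row in responses]
    totals = [sum(row) for row in responses]
    return pearson(item_scores, totals)

# Hypothetical data: 6 examinees, 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(discrimination_index(responses, 0))
print(discrimination_index(responses, 3))
```

In this made-up data the first item is answered correctly mostly by high scorers, so it discriminates better than the last item, which high and low scorers get right about equally often.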
Ideal range for item difficulty
Discrimination is closely related to difficulty. Items that are too hard or too easy are not as capable of discriminating between high and low achievers as items of moderate difficulty.
Moderate difficulty is generally identified with index scores halfway between the perfect score and the chance score.
For a 5-option multiple choice item:
• Perfect score: 1.0
• Chance score: 0.20 (1 in 5)
• Moderate difficulty score: 0.60
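The halfway rule generalizes to any number of options. A minimal sketch, with a hypothetical function name:

```python
def moderate_difficulty(n_options):
    """Midpoint between a perfect score (1.0) and the chance score (1 / n_options)."""
    if n_options < 2:
        raise ValueError("a multiple choice item needs at least 2 options")
    chance = 1 / n_options
    return (1.0 + chance) / 2

for k in (2, 3, 4, 5):
    print(k, round(moderate_difficulty(k), 3))
```

For a 5-option item this gives the 0.60 target stated above; with fewer options the chance score rises, so the target difficulty rises with it.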
Ideal Range for a discrimination index
o The index of discrimination can be used in the selection of the best (most highly discriminating) items for inclusion on the exam.
o According to Ebel and Frisbie (1991), the following standards should be used:
Index score      Item evaluation
0.40 and up      Very good items
0.30 to 0.39     Reasonably good
0.20 to 0.29     Marginal items: could be improved
Below 0.19       Poor items: should be rejected or revised
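These standards translate directly into a lookup. The sketch below is an illustration with a hypothetical function name. The standards as listed leave values between 0.19 and 0.20 unassigned; treating everything below 0.20 as poor is an assumption made here.

```python
def evaluate_discrimination(index):
    """Label a discrimination index using the Ebel and Frisbie (1991) bands.

    Assumption: values below 0.20 (including the gap between 0.19 and 0.20)
    are treated as poor.
    """
    if index >= 0.40:
        return "Very good item"
    if index >= 0.30:
        return "Reasonably good"
    if index >= 0.20:
        return "Marginal item: could be improved"
    return "Poor item: should be rejected or revised"

for value in (0.45, 0.33, 0.21, 0.05, -0.10):
    print(value, evaluate_discrimination(value))
```

A negative index, where low scorers outperform high scorers on the item, also falls in the poor band and usually signals a miskeyed or ambiguous item.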
Try to predict item analysis stats…….
Difficulty index
Discrimination index
Statistically, items of “medium difficulty” have the best chance of discriminating well.
Medium difficulty: For every 10 examinees, 6-7 get the question right
So, think about how many WILL (not SHOULD) get it right! Know your audience!
Too Easy is no good…….
Too hard is no good……
Problem: right difficulty range but STILL doesn’t discriminate