Evaluating tests and examinations

What questions to ask to make sure your assessment is the best that can be produced within your context.

Dianne Wall
Lancaster University

Write test specification

Does the test specification include, amongst other things:

- a statement about the purpose of the test?
- a description of the candidates for whom the test is intended?
- a clear description of test content – skills to be tested, topics, text-types, language structures, vocabulary, etc.?

Write test specification (cont)

- a clear list of the types of techniques that will be used?

- a description of the level of difficulty of the test?

- the criteria that will be used to evaluate the candidates in writing and speaking?

(See Alderson, Clapham and Wall 1995 for detailed discussion of specifications)

Design test: tasks, procedures, marking scheme, criteria

Are there written guidelines for item writers?

Do these guidelines include, amongst other things:

• the standard form rubrics should take?
• advice on how to choose texts? (text type, length, readability… – see the readability sketch below)

Design test (cont)

• advice on how to use certain item types?
• information about how many items are needed in each section?
• etc.

(See Hungarian Examination Reform website for excellent examples of guidelines for item writers.)
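Item-writer guidelines often operationalise ‘readability’ with a standard formula. As a minimal sketch (my illustration, not part of the original slides), here is the Flesch Reading Ease score in Python; the syllable counter is a crude heuristic and the function names are invented for this example.

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic (assumption): count runs of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    # Flesch Reading Ease: higher scores indicate easier texts.
    # Formula: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syllables / n_words)

# Example: screen a candidate reading passage before selecting it.
passage = "The committee met on Monday. They discussed the draft test in detail."
print(f"Flesch Reading Ease: {flesch_reading_ease(passage):.1f}")
```

A formula like this is only a first filter; guidelines would still ask item writers to judge text type, topic and cultural accessibility by hand.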

Moderate test

Is there a committee that meets to discuss draft tests?

Does the committee

• check whether the items and tasks match the specification?

Moderate test (cont)

• check items for clarity of instructions, level of difficulty, complexity and type of texts, etc?

• check the items and tasks are not biased in terms of gender, culture, background knowledge, etc?

Pre-test and analyse results

• Are draft tests tried out to see whether the items and tasks work in the way that was intended?

• Are they the right level of difficulty?
• Are the instructions clear?
• Is the layout right?
• Is the timing right?

Objective tests

• Are the items pre-tested on a representative sample of the candidature?

• Is the marking scheme user-friendly? accurate? complete?

• Are relevant statistics (facility value, discrimination index, reliability coefficient, etc) calculated?
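For illustration only (not part of the original checklist), here is a minimal Python sketch of these classical item statistics, computed from a 0/1 response matrix; the invented data, the upper/lower-third split and the choice of KR-20 as the reliability coefficient are my assumptions.

```python
import numpy as np

# One row per candidate, one column per item; 1 = correct (invented data).
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
])

totals = responses.sum(axis=1)

# Facility value: proportion of candidates answering each item correctly.
facility = responses.mean(axis=0)

# Discrimination index: facility in the top-scoring group minus the
# bottom-scoring group (here, upper and lower thirds by total score).
order = np.argsort(totals)
k = max(1, len(totals) // 3)
discrimination = responses[order[-k:]].mean(axis=0) - responses[order[:k]].mean(axis=0)

# KR-20 reliability coefficient for dichotomously scored items.
n_items = responses.shape[1]
p, q = facility, 1 - facility
kr20 = (n_items / (n_items - 1)) * (1 - (p * q).sum() / totals.var())

print("facility values:  ", facility.round(2))
print("discrimination:   ", discrimination.round(2))
print("KR-20 reliability:", round(kr20, 2))
```

Conventions vary (e.g. top and bottom 27% rather than thirds), so the group split here is a choice, not a standard.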

Subjective tasks

• Are the writing and speaking tasks tried out to see whether the hoped-for language is produced?

• Are the rating scales user-friendly? appropriate for the tasks? complete?

Parallel tests

• If parallel versions of a test have been constructed, are they statistically checked for equivalence?
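As one possible check (an illustration, not the slides’ prescription), a minimal sketch assuming the same pre-test sample sat both versions: a paired t-test compares mean difficulty, and a correlation shows whether the forms rank candidates alike. The scores are invented.

```python
import numpy as np
from scipy import stats

# Scores of the same pre-test candidates on two versions (invented data).
version_a = np.array([34, 41, 28, 45, 39, 30, 36, 42])
version_b = np.array([33, 43, 27, 44, 41, 29, 35, 40])

# Paired t-test: is one version systematically harder than the other?
t_stat, p_value = stats.ttest_rel(version_a, version_b)

# Parallel-forms correlation: do the two versions rank candidates alike?
r, _ = stats.pearsonr(version_a, version_b)

print(f"mean A = {version_a.mean():.1f}, mean B = {version_b.mean():.1f}")
print(f"paired t = {t_stat:.2f}, p = {p_value:.3f}")
print(f"parallel-forms correlation r = {r:.2f}")
```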

Revise test

• Do you revise your test in light of the results of pre-testing?

• If so, do you pre-test the revised version?

Print and dispatch test materials

• Do you have secure facilities and services to make sure there are no security leaks?

Train examiners (interlocutors, assessors), invigilators etc.

Do you have guidelines for all the key personnel working on your test?

• Do interlocutors understand the point of all the speaking tasks?

• Do they know what order the different speaking tasks come in?

• Are they able to keep track of all the test materials?
• Do the interlocutor and assessor have an agreed signal for indicating that the candidate has produced enough language and can go to the next stage?

Administer test to candidates

• Do you have guidelines for administering your test – e.g. how the furniture is to be arranged, where the recording equipment is to be located, what to do if anything goes wrong?

• Do you monitor administrations to find out whether everything went according to plan?

• Do you gather feedback to find out what the invigilators, examiners, students etc thought about the administration?

Train and monitor markers

Do you hold a meeting to standardise markers?

What normally happens during a standardisation meeting?

• marking of a representative sample of scripts, including ‘rogues’?

• discussion and refinement of the marking criteria?

Train and monitor markers (cont)

• further marking and discussion?
• a ‘test’ of markers to see if they are ready to begin?

Do you have a system of monitoring marking to make sure the markers keep to the guidelines?

What proportion of the scripts is double-marked (by another marker or by the Chief Examiner)?

Are inter-rater correlations routinely calculated?
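As an illustration of such a routine check (invented data; the details are my assumptions), here is a minimal Python sketch for a batch of double-marked scripts: the correlation shows whether the two markers rank scripts alike, and the mean difference flags systematic severity or leniency.

```python
import numpy as np
from scipy import stats

# Scores awarded to the same ten scripts by two markers (invented data).
marker_1 = np.array([12, 15, 9, 18, 14, 11, 16, 10, 13, 17])
marker_2 = np.array([11, 16, 10, 17, 15, 10, 15, 11, 12, 18])

# Inter-rater correlation: do the markers rank the scripts alike?
r, p = stats.pearsonr(marker_1, marker_2)

# Mean difference: is one marker consistently harsher than the other?
mean_diff = (marker_1 - marker_2).mean()

print(f"inter-rater correlation r = {r:.2f} (p = {p:.3f})")
print(f"mean difference (marker 1 - marker 2) = {mean_diff:+.2f}")
```

Note that a high correlation can coexist with a large mean difference: the markers agree on ranking but not on severity, which is why both figures are worth monitoring.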

Analyse test results, feedback and observations

• Are relevant statistics calculated?
• If an item or a section of the test proves to have been faulty, are the marks counted toward the candidates’ final grade?

• Are any special procedures followed to decide upon pass-fail distinctions or grade boundaries?
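One widely used procedure of this kind is Angoff standard setting; as a minimal sketch (my illustration, not something the slides specify), each judge estimates the probability that a minimally competent candidate answers each item correctly, and the cut score is the sum of the mean estimates. The judge data are invented.

```python
import numpy as np

# angoff[j][i]: judge j's estimated probability that a minimally
# competent candidate answers item i correctly (3 judges, 5 items; invented).
angoff = np.array([
    [0.8, 0.6, 0.4, 0.7, 0.5],
    [0.7, 0.6, 0.5, 0.8, 0.4],
    [0.9, 0.5, 0.4, 0.7, 0.6],
])

# Average the judges' estimates per item, then sum across items.
item_means = angoff.mean(axis=0)
cut_score = item_means.sum()

print("per-item mean estimates:", item_means.round(2))
print(f"recommended pass mark: {cut_score:.1f} out of {angoff.shape[1]}")
```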

Analyse test results, feedback and observations (cont)

• Is feedback gathered on the test? If so, from whom?

• Do procedures exist to make sure the test is revised in the light of feedback?

Report results

Do you write reports after every test?

Do the reports contain:

• information about the candidates’ performances – e.g. the percentage gaining each grade? (a tabulation sketch follows this list)

• information about performance on each section of the test?

Report results (cont)

• descriptions of typical problems?

• admission of error if the test is faulty?

• advice to teachers and students for the next time around?
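For the percentage-per-grade figures mentioned in the list above, here is a minimal tabulation sketch (invented grades, purely illustrative):

```python
from collections import Counter

# Final grades awarded to the candidature (invented data).
grades = ["A", "B", "B", "C", "A", "D", "B", "C", "C", "B", "A", "C"]

counts = Counter(grades)
total = len(grades)

# Percentage of candidates gaining each grade, for the results report.
for grade in sorted(counts):
    pct = 100 * counts[grade] / total
    print(f"grade {grade}: {counts[grade]:2d} candidates ({pct:.1f}%)")
```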

Evaluate test and procedures

• Was the test ‘useful’?

• What can you say about its validity?

• What can you say about its reliability?

• What can you say about its practicality?

Evaluate test and procedures (cont)

• What can you say about its impact on teaching?

• What, if anything, needs to be changed before you give the test the next time?

For a more detailed quality control checklist see Alderson, Clapham and Wall (1995) Language Test Construction and Evaluation. Cambridge: Cambridge University Press, pages 266–273.

Context

What type of institution do you work for?
What are your purposes in testing?
Who decides your institution’s testing goals?
Do you have a testing policy?
If so, does it contain statements concerning
• proper test use
• security
• personnel

Context (cont)

• testing infrastructure
• testing procedures
• commitment to quality

Do you have on-going evaluation and revision of your tests?

Do you carry out any research into the usefulness of your tests?

Are the results from the research available to other members of the profession, so that they can learn from your experiences?

For more information please contact

Dianne Wall

Dept of Linguistics & English Language
Lancaster University
Lancaster
United Kingdom

[email protected]