Challenges of Piloting Test Items
Branka Petek, School of Foreign Languages, Slovenia
Content
1) Challenges Slovenia had to face when piloting test items
2) What we learned from experience
Why pilot test items?
To get a clear picture of candidates’ language skills.
To get a clear picture we need good test items.
It is impossible to have good test items without pre-testing.
Challenges SFL had to face
Appropriate population for piloting
Administration of the items
Test format
Statistical analyses
Population for piloting
Size
Similarity to the Slovenian testing population
Level of proficiency
Test fatigue
Lessons learned
SIZE: the population should be as big as possible, but anything is better than nothing;
SIMILARITY: the population should be similar to the testing population;
LEVEL OF PROFICIENCY: proficiency should follow a normal (or near-normal) distribution, otherwise the results will be unreliable;
TEST FATIGUE: Have the candidates piloted before? Are they tired of taking tests and piloting?
Administration
Administrators
Time
Courses
Collecting data on test takers
Lessons learned
ADMINISTRATORS: the results are most reliable when we administer the tests ourselves;
TIME: depends on the course cycle;
COURSES: courses designed to prepare students for STANAG tests normally give the most reliable results;
QUESTIONNAIRES: if well designed, they help investigate the face validity of tests, the time allocated, the clarity of rubrics, and the appropriacy of test methods and text topics.
Test format
Length
Number of items
Task types
Topics (cultural background, influence of the course)
Lessons learned
LENGTH: similar to the live test version;
NUMBER OF ITEMS: approximately the same number of items as the live test;
TASK TYPES: different countries use different methods, so candidates might not be familiar with the task types we use;
FAMILIARITY WITH THE TOPICS: e.g. military topics (cultural background).
Statistical analyses
CTT
IRT
‘Manual check’
The influence of a particular population
Lessons learned
With a small population, CTT is the only option;
Sometimes there are fewer than 30 test takers: manual checking for odd answers and strange behaviour can help eliminate some problems and improve the items;
With a small population the data is less reliable, so there is always an element of risk.
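As an illustration of the CTT option mentioned above, here is a minimal sketch of an item analysis for a small pilot group. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and computes two common CTT statistics: the facility value (proportion correct) and an item-rest correlation as a discrimination index. The slides do not say which statistics SFL actually used; the function names `item_analysis` and `pearson` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length lists; None if either has no variance."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return None  # e.g. every candidate answered the item the same way
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def item_analysis(responses):
    """responses: one list of 0/1 item scores per candidate (assumed format).

    Returns a (facility, discrimination) pair per item, where discrimination
    is the correlation between the item and the total score on the other items.
    """
    n_items = len(responses[0])
    totals = [sum(r) for r in responses]
    results = []
    for i in range(n_items):
        item = [r[i] for r in responses]
        rest = [t - x for t, x in zip(totals, item)]  # total without this item
        results.append((mean(item), pearson(item, rest)))
    return results


# Hypothetical pilot data: 4 candidates, 3 items.
pilot = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
]
for number, (facility, discrimination) in enumerate(item_analysis(pilot), start=1):
    print(number, facility, discrimination)
```

An item that everyone answers correctly (item 1 in the sample data) has no variance, so its discrimination is reported as `None`; with very small groups such items are the kind worth checking by hand.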
Perfect-world and real-world piloting
In a perfect world, a piloting session would mean at least 300 test takers, IRT analysis, revising the test items, re-piloting, a second IRT analysis, a final version of the test, and experts to determine cut-off scores.
In the real world, piloting is difficult to plan and carry out, yet it is an absolutely essential part of a testing cycle.
Piloting internationally can produce more reliable results but also presents many pitfalls we have to be aware of.
Being aware of possible problems might help us plan.
The more we invest (in time, effort and money), the more we get.
Thank you
Questions, suggestions?