Experimental Components for the Evaluation of Interactive Information
Retrieval Systems
Pia Borlund
Dawn Filan, 3/30/04
The Goal
• To evaluate IR systems in a way that is as close to the actual information-seeking process as possible, while remaining in a controlled environment.
Research Questions
• Can simulated information needs be substituted for real information needs?
• What makes a good simulated situation with reference to semantic openness and types of topics of the simulated situations?
Hybrid Evaluation Model
• Increased demand
  – Relevance revolution
  – Cognitive revolution
  – Interactive revolution
• Combine two main approaches
  – System-driven approach (controlled)
  – Cognitive user-centered approach (realism)
The Experimental Setting
3 components:
• The involvement of potential users as test persons
• The application of dynamic and individual information needs
• Use of dynamic relevance judgements
Ideal IIR Setting
• Real users who state personal information needs to the system and judge the relevance of the retrieved documents under controlled circumstances.
• Use of “simulated work task”
• Must be under controlled circumstances so that results can be compared across systems and user groups.
Simulated Work Task
• Triggers and develops a simulated information need by allowing for user interpretations of the situation.
• Platform against which situational relevance is measured.
• 2 variants applied:
  – Complete need applied (sim 1)
  – Only situation applied (sim 2)
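The two variants can be pictured as one task record presented in two ways. The following is a minimal sketch, not material from the study: the class `SimulatedWorkTask`, its fields, the function `present`, and the placeholder texts are all hypothetical. It also assumes that "complete need" means the situation plus a request, as the conclusions suggest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulatedWorkTask:
    """Hypothetical record for one simulated work task."""
    situation: str  # description of the situation that triggers the need
    request: str    # example request accompanying the situation


def present(task: SimulatedWorkTask, variant: str) -> str:
    """Return the text shown to a test person for the given variant.

    "sim1": situation plus request (assumed reading of "complete need")
    "sim2": situation only
    """
    if variant == "sim1":
        return f"{task.situation}\n\nRequest: {task.request}"
    if variant == "sim2":
        return task.situation
    raise ValueError(f"unknown variant: {variant!r}")


# Placeholder texts, not tasks used in the study.
task = SimulatedWorkTask(
    situation="<description of the situation>",
    request="<example request>",
)
print(present(task, "sim1"))
print(present(task, "sim2"))
```

Keeping both parts in one record means the same task can be handed to either group without rewriting it, which is what makes the two variants comparable.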
Situational Relevance
• User-centered, realistic, and dynamic measure of relevance
• Judgements are not based on the request or query, but rather relate to the person’s requirements and mental state at the time of the retrieval
• Assessed continuously and interactively during the session
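Because judgements are assessed continuously, a session record would need to keep every judgement rather than a single score per document. The sketch below illustrates one way to do that. The names `Judgement` and `SessionLog`, the document identifiers, and the integer score scale are assumptions, since the slides do not specify a logging format or a scale.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Judgement:
    step: int    # position of the judgement within the session
    doc_id: str
    score: int   # hypothetical scale; not specified in the source


@dataclass
class SessionLog:
    judgements: List[Judgement] = field(default_factory=list)

    def judge(self, doc_id: str, score: int) -> None:
        """Append a judgement; earlier judgements are kept, not overwritten."""
        self.judgements.append(Judgement(len(self.judgements), doc_id, score))

    def history(self, doc_id: str) -> List[int]:
        """All scores given to one document, in session order."""
        return [j.score for j in self.judgements if j.doc_id == doc_id]

    def current(self) -> Dict[str, int]:
        """Most recent score for each judged document."""
        latest: Dict[str, int] = {}
        for j in self.judgements:
            latest[j.doc_id] = j.score
        return latest


log = SessionLog()
log.judge("doc-1", 2)
log.judge("doc-2", 0)
log.judge("doc-1", 1)  # same document re-assessed later in the session
print(log.history("doc-1"))  # [2, 1]
print(log.current())         # {'doc-1': 1, 'doc-2': 0}
```

Storing the full history keeps the change in a test person's assessment visible, which a single final score per document would hide.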
Relevance (Schamber, Eisenberg, and Nilan)
• Multidimensional cognitive concept whose meaning is dependent on users’ perceptions of information and their information needs
• Dynamic concept that depends on users’ judgements of the quality of the relationship between information and the information need
• Complex but systematic and measurable concept if approached conceptually and operationally from the user’s perspective
Meta-Evaluation
• Should simulated work tasks be recommended as a component of the experimental setting for evaluating IIR systems?
Meta-Evaluation Questions
• Possibility of substituting real information needs with simulated information needs through the application of simulated work task situations.
• Whether the variants of the simulated task make any difference to the test persons’ treatment of the information need
• What characterizes a good simulated work task in terms of how tailored the task should be to the user
Test Setting
• Full-text online system applying TREC data and a probabilistic-based retrieval engine
• Search activity and relevance scores were logged
• 24 users from various academic backgrounds and education levels
• Asked to prepare a personal information need
Testing Procedure
• Brief questionnaire
• Introduction
• Explanation of the test person’s role
• Demo of the system
• Execution of 6 search tasks (training, real, 4 simulated tasks)
• Post-search interview
Conclusions
• One can substitute real information needs with simulated information needs through the application of simulated work tasks
• One can mix simulated and real information needs
• Treatment of the information need did not differ between the group that received the work task and request and the group that received just the work task.