Multilingual Retrieval Experiments with MIMOR at the University of Hildesheim
Multilingual Retrieval Experiments with MIMOR at the University of Hildesheim
René Hackl, Ralph Kölle, Thomas Mandl, Alexandra Ploedt, Jan-Hendrik Scheufen, Christa Womser-Hacker
Information Science, Universität Hildesheim, Marienburger Platz 22, 31131 Hildesheim, Germany
{mandl,womser}@uni-hildesheim.de
Overview
- Fusion in Information Retrieval
- The MIMOR Model
- Participation 2003
- Blind Relevance Feedback
- Fusion Parameters
- Submitted Runs
- Outlook
Fusion in Ad-hoc Retrieval
Many studies, especially within TREC, show that:
- the quality of the best retrieval systems is similar
- the overlap between their results is not very large
- fusion of the results of several systems can improve the overall performance
The MIMOR Model
- combines fusion and relevance feedback
- linear combination: each individual system has a weight
- weights are adapted based on relevance feedback
Goal: positive relevance feedback on a document for which a system assigned a high RSV increases the weight of that system (diagram).
Calculation of the RSV in MIMOR
The fused RSV of a document is the weighted sum of the single RSVs:

RSV_j = \sum_i w_i \cdot RSV_{ij}

where RSV_{ij} is the retrieval status value of system i for document j and w_i is the weight of system i.
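As an illustration only, not the original MIMOR code, the following Java sketch shows one way to realize this linear combination together with a simple relevance-feedback weight update; the class name, the learning rate, and the normalization step are assumptions, not taken from the presentation.

```java
/**
 * Minimal sketch of the MIMOR linear combination, assuming a simple
 * additive weight update driven by relevance feedback. Names and the
 * update rule are illustrative, not the original implementation.
 */
public class MimorFusion {

    private final double[] weights;     // one weight per basic IR system
    private final double learningRate;  // step size for the adaptation (assumed)

    public MimorFusion(int numSystems, double learningRate) {
        this.weights = new double[numSystems];
        java.util.Arrays.fill(this.weights, 1.0 / numSystems);  // start with equal weights
        this.learningRate = learningRate;
    }

    /** Fused RSV of one document: sum over i of w_i * RSV_ij. */
    public double fusedRsv(double[] rsvPerSystem) {
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * rsvPerSystem[i];
        }
        return sum;
    }

    /**
     * Relevance feedback on one judged document: positive feedback raises
     * the weight of each system in proportion to the RSV it assigned to
     * that document; negative feedback lowers it.
     */
    public void feedback(double[] rsvPerSystem, boolean relevant) {
        double sign = relevant ? 1.0 : -1.0;
        double norm = 0.0;
        for (int i = 0; i < weights.length; i++) {
            weights[i] = Math.max(0.0, weights[i] + sign * learningRate * rsvPerSystem[i]);
            norm += weights[i];
        }
        if (norm > 0) {                  // re-normalize so the weights sum to one
            for (int i = 0; i < weights.length; i++) {
                weights[i] /= norm;
            }
        }
    }
}
```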
Participation in CLEF
2002: GIRT track
- linguistic pre-processing: Lucene
- basic IR system: irf
2003: multilingual-4 track
- linguistic pre-processing: Snowball
- basic IR systems: Lucene and MySQL
- Internet translation services
Participation in CLEF 2003
- Snowball for linguistic pre-processing
- MIMOR implemented in Java
- Lucene and MySQL as basic IR systems
- static optimization of the fusion weights
- Internet translation services
- "fusion" of translation services: use all terms found by any service
- blind relevance feedback
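A minimal Java sketch of this "fusion" of translation services, assuming a hypothetical TranslationService interface as a stand-in for the actual Internet services: the translated query is simply the union of all terms returned by any service.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch of the translation "fusion": collect every term returned by any
 * of the translation services into one query term set. The interface is
 * a placeholder for whatever services were actually called.
 */
interface TranslationService {
    List<String> translate(String topic, String targetLanguage);
}

class TranslationFusion {
    static Set<String> fuseTranslations(String topic, String targetLanguage,
                                        List<TranslationService> services) {
        Set<String> queryTerms = new LinkedHashSet<>();   // keeps order, drops duplicates
        for (TranslationService service : services) {
            queryTerms.addAll(service.translate(topic, targetLanguage));
        }
        return queryTerms;
    }
}
```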
Blind Relevance Feedback
We experimented with:
- the Kullback-Leibler (KL) divergence measure
- the Robertson selection value (RSV)
Blind Relevance Feedback
- blind relevance feedback improved performance by more than 10% in all tests
- the parameters could not be fully optimized
- the submitted runs use the top five documents and extract 20 expansion terms (5/20) based on Kullback-Leibler
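A hedged Java sketch of KL-based term selection for blind relevance feedback, matching the 5/20 setting above; the probability maps, helper names, and smoothing constant are illustrative assumptions, and in the real system the statistics would come from the index.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/**
 * Sketch of Kullback-Leibler term selection for blind relevance feedback:
 * rank candidate terms by p_fb(t) * log(p_fb(t) / p_coll(t)) and keep the
 * top ones as query expansion terms.
 */
class BlindRelevanceFeedback {

    /** KL contribution of a term t, given its probability in the feedback documents and in the collection. */
    static double klScore(double pFeedback, double pCollection) {
        if (pFeedback <= 0.0 || pCollection <= 0.0) {
            return 0.0;
        }
        return pFeedback * Math.log(pFeedback / pCollection);
    }

    /**
     * Pick the numTerms expansion terms with the highest KL score, computed
     * from the term distribution in the top-ranked (assumed relevant)
     * documents and in the whole collection.
     */
    static List<String> expansionTerms(Map<String, Double> feedbackTermProb,
                                       Map<String, Double> collectionTermProb,
                                       int numTerms) {
        return feedbackTermProb.entrySet().stream()
                .sorted((a, b) -> Double.compare(
                        klScore(b.getValue(), collectionTermProb.getOrDefault(b.getKey(), 1e-9)),
                        klScore(a.getValue(), collectionTermProb.getOrDefault(a.getKey(), 1e-9))))
                .limit(numTerms)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

For the submitted runs, the feedback statistics would be taken from the top five documents and numTerms would be 20, as stated above.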
Sequence of operations: the result sets from Lucene and MySQL are merged into one result set (diagram).

Test Runs (results chart)
Test Runs
- Lucene performs much better than MySQL
- still, MySQL contributes to the fusion result
- consequence: a high weight for Lucene
Results of submitted runs
Submitted Runs
- Lucene performs much better than MySQL
- MySQL does not contribute to the fusion result
- the Lucene-only run is the best run
- blind relevance feedback leads to an improvement
Proper Names
- proper names in the CLEF topics lead to better system performance (see poster and paper)
- should topics with proper names be treated differently (different system parameters)?
Acknowledgements
- thanks to the developers of Lucene and MySQL
- thanks to the students in Hildesheim for implementing part of MIMOR in their course work
Outlook
We plan to:
- integrate further retrieval systems
- invest more effort in optimizing the blind relevance feedback
- participate again in the multilingual track
- evaluate the effect of individual relevance feedback