Multilingual Retrieval Experiments with MIMOR at the University of Hildesheim


  • Multilingual Retrieval Experiments with MIMOR at the University of Hildesheim
    René Hackl, Ralph Kölle, Thomas Mandl, Alexandra Ploedt, Jan-Hendrik Scheufen, Christa Womser-Hacker
    Information Science, Universität Hildesheim
    Marienburger Platz 22, 31131 Hildesheim, Germany
    {mandl,womser}@uni-hildesheim.de
    UNIVERSITÄT HILDESHEIM

  • Overview
    Fusion in Information Retrieval
    The MIMOR Model
    Participation 2003
    Blind Relevance Feedback
    Fusion Parameters
    Submitted Runs
    Outlook

  • Fusion in Ad-hoc Retrieval
    Many studies, especially within TREC, show
    that the quality of the best retrieval systems is similar,
    that the overlap between their results is not very large, and
    that fusing the results of several systems can improve the overall performance.

  • The MIMOR Model
    Combines fusion and relevance feedback:
    linear combination of systems, where each individual system has a weight
    weights are adapted based on relevance feedback (see the sketch below)

    Goal: high system relevance plus positive relevance feedback → increase the weight of that system
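A minimal sketch of this adaptation in Java (the language MIMOR is implemented in); the class name, update rule, and learning rate below are illustrative assumptions, not MIMOR's actual code:

```java
// Sketch of weight adaptation: positive relevance feedback on a document
// a system scored highly increases that system's weight. The update rule
// and learning rate are assumptions for illustration only.
public class SystemWeight {
    private double weight;              // current weight of one retrieval system
    private final double learningRate;  // step size for the adaptation (assumed)

    public SystemWeight(double initialWeight, double learningRate) {
        this.weight = initialWeight;
        this.learningRate = learningRate;
    }

    // Move the weight up when a highly-scored document is judged relevant,
    // down when it is judged non-relevant; the system's own RSV scales the step.
    public void update(double systemRsv, boolean judgedRelevant) {
        double target = judgedRelevant ? 1.0 : 0.0;
        weight += learningRate * systemRsv * (target - weight);
        weight = Math.max(0.0, Math.min(1.0, weight)); // clamp to [0, 1]
    }

    public double value() {
        return weight;
    }
}
```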

  • Calculation of RSV in MIMOR
    The fused retrieval status value is the weighted sum of the single systems' RSVs:

    RSV_j = \sum_i w_i \cdot RSV_{ij}

    where RSV_{ij} is the retrieval status value of system i for document j and w_i is the weight of system i (see the code sketch below).
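The same formula as a short Java sketch, assuming RSVs are held in maps keyed by system and document IDs (all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the linear fusion step: RSV_j = sum_i w_i * RSV_ij.
// The data layout (maps keyed by system and document IDs) is an assumption.
public class LinearFusion {

    /** rsvPerSystem maps system i -> (document j -> RSV_ij); weights maps system i -> w_i. */
    public static Map<String, Double> fuse(Map<String, Map<String, Double>> rsvPerSystem,
                                           Map<String, Double> weights) {
        Map<String, Double> fused = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> system : rsvPerSystem.entrySet()) {
            double w = weights.getOrDefault(system.getKey(), 0.0);
            for (Map.Entry<String, Double> doc : system.getValue().entrySet()) {
                fused.merge(doc.getKey(), w * doc.getValue(), Double::sum);
            }
        }
        return fused; // final ranking: sort documents by fused RSV, descending
    }
}
```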

  • Participation in CLEF
    2002: GIRT track
    Linguistic pre-processing: Lucene
    Basic IR system: irf
    2003: multilingual-4 track
    Linguistic pre-processing: Snowball
    Basic IR systems: Lucene and MySQL
    Internet translation services

  • Participation in CLEF 2003
    Snowball for linguistic pre-processing
    MIMOR in Java
    Lucene and MySQL
    static optimization
    Internet translation services
    "Fusion" of translation services: use all terms found by any service (see the sketch below)
    Blind relevance feedback
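A sketch of this term-level fusion of translations, assuming each service's output has already been tokenized into a set of terms (class and method names are assumptions):

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the "fusion" of translation services: the translated query
// is simply the union of all terms returned by any service.
public class TranslationFusion {

    public static Set<String> fuseTranslations(List<Set<String>> serviceResults) {
        Set<String> allTerms = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        for (Set<String> terms : serviceResults) {
            allTerms.addAll(terms);
        }
        return allTerms;
    }
}
```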

  • Blind Relevance Feedback
    We experimented with:
    the Kullback-Leibler (KL) divergence measure
    the Robertson selection value (RSV)

  • Blind Relevance Feedback
    Blind relevance feedback improved performance by over 10% in all tests.
    The parameters could not be optimized.
    The submitted runs use the top five documents and extract 20 terms (5/20) based on Kullback-Leibler, as sketched below.
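A hedged sketch of KL-based term selection, assuming unigram models p(t|F) over the top feedback documents and p(t|C) over the collection have already been estimated (names and smoothing details are assumptions); the submitted runs correspond to numTerms = 20 with the feedback model built from the top five documents:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of Kullback-Leibler term selection for blind relevance feedback:
// rank terms by their contribution p(t|F) * log(p(t|F) / p(t|C)) to the
// divergence between the feedback-document model and the collection model,
// and keep the top-scoring terms for query expansion.
public class KlTermSelection {

    public static List<String> selectTerms(Map<String, Double> feedbackModel,   // p(t|F)
                                           Map<String, Double> collectionModel, // p(t|C)
                                           int numTerms) {
        return feedbackModel.entrySet().stream()
                .filter(e -> collectionModel.getOrDefault(e.getKey(), 0.0) > 0.0)
                .sorted(Comparator.comparingDouble((Map.Entry<String, Double> e) ->
                        e.getValue() * Math.log(e.getValue() / collectionModel.get(e.getKey())))
                        .reversed())
                .limit(numTerms)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```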

  • Sequence of operations
    [Diagram: the result sets of Lucene and MySQL are merged into one result set]

  • Test Runs

  • Test Runs
    Lucene performs much better than MySQL.
    Still, MySQL contributes to the fusion result.
    → high weight for Lucene

  • Results of submitted runs

  • Submitted Runs
    Lucene performs much better than MySQL.
    MySQL does not contribute to the fusion result.
    The Lucene-only run is the best run.
    Blind relevance feedback leads to an improvement.

  • Proper Names
    Proper names in CLEF topics lead to better system performance (see poster and paper).
    Should topics with proper names be treated differently (with different system parameters)?

  • Acknowledgements
    Thanks to the developers of Lucene and MySQL.
    Thanks to the students in Hildesheim for implementing part of MIMOR in their course work.

  • Outlook
    We plan to:
    integrate another retrieval system
    invest more effort in optimizing the blind relevance feedback
    participate again in the multilingual track
    evaluate the effect of individual relevance feedback