Pseudo-relevance Feedback & Passage Retrieval
Transcript of Pseudo-relevance Feedback & Passage Retrieval
Ling573: NLP Systems & Applications
April 16, 2013
Major Issue in Retrieval
All approaches operate on term matching: if a synonym, rather than the original term, is used, the approach can fail.
Develop more robust techniques that match the "concept" rather than the term:
- Mapping techniques: associate terms to concepts (aspect models, stemming)
- Expansion approaches: add related terms to enhance matching
Compression Techniques
Reduce surface term variation to concepts:
- Stemming
- Aspect models: matrix representations are typically very sparse; reduce dimensionality to a small number of key aspects, mapping contextually similar terms together (latent semantic analysis)
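A minimal sketch of this dimensionality reduction via truncated SVD, in the spirit of latent semantic analysis (the toy term-document matrix and vocabulary are invented for illustration):

```python
import numpy as np

def lsa_term_vectors(term_doc, k):
    """Project a (terms x docs) count matrix onto k latent aspects
    via truncated SVD; rows are term vectors in aspect space."""
    U, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k] * s[:k]

# Toy matrix (invented): 'cat'/'feline' cooccur in d1-d2, 'ls'/'shell' in d3-d4
term_doc = np.array([
    [3., 1., 0., 0.],   # cat
    [1., 3., 0., 0.],   # feline
    [0., 0., 2., 1.],   # ls
    [0., 0., 1., 2.],   # shell
])
vecs = lsa_term_vectors(term_doc, k=2)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Contextually similar terms map together: cat~feline, but not cat~ls
print(cos(vecs[0], vecs[1]) > cos(vecs[0], vecs[2]))
```

Terms that share document contexts end up close in the reduced aspect space even when they never cooccur with the query term itself.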
Expansion Techniques
Can apply to query or document:
- Thesaurus expansion: use a linguistic resource (thesaurus, WordNet) to add synonyms/related terms
- Feedback expansion: add terms that "should have appeared"
  - User interaction: direct relevance feedback
  - Automatic: pseudo-relevance feedback
Query Refinement
Typical queries are very short and ambiguous ("cat": animal or Unix command?). Add more terms to disambiguate and improve retrieval.
Relevance feedback:
- Retrieve with the original query; present results
- Ask the user to tag results relevant/non-relevant
- "Push" the query toward relevant vectors, away from non-relevant ones
- Vector intuition: add vectors from relevant documents, subtract vectors from non-relevant documents
Relevance Feedback
Rocchio expansion formula:
- β and γ control the amount of "push" in either direction; β + γ = 1, typically (0.75, 0.25)
- R: # relevant docs; S: # non-relevant docs; r: relevant document vectors; s: non-relevant document vectors
Can significantly improve retrieval (though tricky to evaluate).
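The formula itself appears only as an image in the transcript; the standard Rocchio form consistent with these symbols is q' = alpha*q + (beta/R)*sum(r) - (gamma/S)*sum(s). A minimal sketch (the alpha term and the clipping of negative weights are conventional additions, not from the slide):

```python
import numpy as np

def rocchio(query, rel, nonrel, alpha=1.0, beta=0.75, gamma=0.25):
    """Standard Rocchio update:
        q' = alpha*q + (beta/R) * sum(r) - (gamma/S) * sum(s)
    pushing the query toward relevant vectors, away from non-relevant ones."""
    q = alpha * np.asarray(query, dtype=float)
    if len(rel):
        q += beta * np.mean(rel, axis=0)
    if len(nonrel):
        q -= gamma * np.mean(nonrel, axis=0)
    return np.clip(q, 0.0, None)   # negative term weights are usually dropped

q = [1.0, 0.0, 0.0]        # query mentions only term 0
r = [[1.0, 1.0, 0.0]]      # a relevant doc also uses term 1
s = [[0.0, 0.0, 1.0]]      # a non-relevant doc uses term 2
print(rocchio(q, r, s))    # term 1 gains weight; term 2 is pushed to 0
```

The effect is query expansion: term 1 (a synonym-like term from the relevant document) enters the query with nonzero weight even though the user never typed it.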
Collection-based Query Expansion
Xu & Croft '97 (classic). Thesaurus expansion is problematic and often ineffective. Issues:
- Coverage: many words, especially named entities, are missing from WordNet
- Domain mismatch: fixed resources are 'general' or derived from some other domain, and may not match the current search collection (cat/dog vs. cat/more/ls)
Instead, use collection-based evidence: global or local.
Global Analysis
Identifies word cooccurrence in the whole collection, applied to expand the current query; context can differentiate/group concepts.
Create an index of concepts:
- Concepts = noun phrases (1-3 nouns long)
- Representation: context, i.e. words in a fixed-length window of 1-3 sentences
- A concept is indexed by its context-word documents
Use the query to retrieve the 30 highest-ranked concepts; add them to the query.
Local Analysis
Aka local feedback or pseudo-relevance feedback:
- Use the query to retrieve documents
- Select informative terms from highly ranked documents
- Add those terms to the query
Specifically: add the 50 most frequent terms and the 10 most frequent 'phrases' (bigrams without stopwords), then reweight terms.
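The recipe above can be sketched as follows (the toy stoplist, tokenization, and example documents are assumptions for illustration):

```python
from collections import Counter

STOPWORDS = {"the", "a", "of", "in", "and", "is", "to"}   # toy stoplist

def expand_query(query_terms, top_docs, n_terms=50, n_phrases=10):
    """Pseudo-relevance feedback as described above: add the most frequent
    terms and the most frequent stopword-free bigrams from top-ranked docs."""
    terms, phrases = Counter(), Counter()
    for doc in top_docs:
        toks = [t for t in doc.lower().split() if t not in STOPWORDS]
        terms.update(toks)
        phrases.update(zip(toks, toks[1:]))   # bigrams w/o stopwords
    expansion = [t for t, _ in terms.most_common(n_terms)]
    expansion += [" ".join(p) for p, _ in phrases.most_common(n_phrases)]
    return list(query_terms) + expansion

top_docs = ["treatment of peptic ulcer disease",
            "peptic ulcer disease and stomach pain"]
print(expand_query(["ulcer"], top_docs, n_terms=3, n_phrases=2))
```

The expanded query now carries collection-specific context terms ("peptic", "disease"), which is exactly the evidence a fixed thesaurus would miss.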
Local Context Analysis
Mixes the two previous approaches:
- Use the query to retrieve the top n passages (300 words)
- Select the top m ranked concepts (noun sequences)
- Add them to the query and reweight
Relatively efficient, and applies local search constraints.
Experimental Contrasts
Improvements over baseline:
- Local Context Analysis: +23.5% (relative)
- Local Analysis: +20.5%
- Global Analysis: +7.8%
LCA is best and most stable across data sets, with better term selection than global analysis. All approaches have fairly high variance: they help some queries and hurt others, and are also sensitive to the number of terms added and the number of documents used.
[Slide 36: table (not recoverable from the transcript) contrasting the expansion terms produced by Global Analysis, Local Analysis, and LCA for the query "What are the different techniques used to create self-induced hypnosis?"]
Passage Retrieval
Documents are the wrong unit for QA. Highly ranked documents have high-weight terms in common with the query, but that is not enough: matching terms scattered across a document are weaker evidence than matching terms concentrated in a short span of it.
Solution: from the ranked document list, select and rerank shorter spans. This is passage retrieval.
Passage Ranking
Goal: select the passages most likely to contain the answer. We want answers! Factors in reranking:
- Document rank
- Answer type matching: restricted named entity recognition
- Question match: question term overlap; span overlap (n-gram, longest common sub-span); query term density (prefer short spans with more query terms)
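The query-term-density factor can be sketched as follows (the window size, tokenization, and toy passages are assumptions):

```python
def density_score(passage_tokens, query_terms, window=10):
    """Query-term density: the best count of distinct query terms found
    inside any fixed-width token window of the passage. Short spans
    containing more query terms score higher."""
    qset = set(query_terms)
    best = 0
    for i in range(max(1, len(passage_tokens) - window + 1)):
        best = max(best, len(qset & set(passage_tokens[i:i + window])))
    return best

q = ["highest", "dam"]
scattered = ["dam"] + ["filler"] * 20 + ["highest"]            # terms far apart
concentrated = "the highest dam in the world is very tall".split()
print(density_score(scattered, q), density_score(concentrated, q))   # -> 1 2
```

The concentrated passage wins even though both passages contain exactly the same query terms, which is the distinction plain document-level term matching cannot make.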
Quantitative Evaluation of Passage Retrieval for QA
Tellex et al. compare alternative passage-ranking approaches (8 different strategies plus a voting ranker) and assess the interaction with document retrieval.
Comparative IR Systems
- PRISE: developed at NIST; vector-space retrieval system with an optimized weighting scheme
- Lucene: Boolean + vector-space retrieval; results of Boolean retrieval are RANKED by tf-idf; little control over the hit list
- Oracle: NIST-provided list of relevant documents
Comparing Passage Retrieval
Eight different systems used in QA, varying in their units and ranking factors:
- MITRE: simplest reasonable approach (baseline). Unit: sentence. Factor: term overlap count
- MITRE+stemming: Factor: stemmed term overlap
- Okapi bm25: Unit: fixed-width sliding window. Factor: bm25 score with k1 = 2.0, b = 0.75
- MultiText: Unit: window starting and ending with a query term. Factor: sum of IDFs of matching query terms; length-based measure * number of matching terms
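The bm25 factor itself is only an image in the transcript; a sketch with the slide's k1 = 2.0 and b = 0.75 is below. The idf variant is one common choice, and the collection statistics are invented for illustration:

```python
import math

def bm25_passage(query_terms, passage, df, n_docs, avgdl, k1=2.0, b=0.75):
    """Okapi bm25 over one fixed-width passage (token list), with the
    slide's k1=2.0, b=0.75. df maps term -> document frequency."""
    dl = len(passage)
    score = 0.0
    for t in set(query_terms):
        tf = passage.count(t)
        if tf == 0:
            continue
        n_t = df.get(t, 0)
        idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))
    return score

# Toy collection statistics (invented for illustration)
passage = "the highest dam is the Three Gorges dam".split()
print(bm25_passage(["highest", "dam"], passage, df={"highest": 3, "dam": 5},
                   n_docs=100, avgdl=8.0))
```

The b term normalizes for passage length, so fixed-width sliding windows keep the length penalty roughly constant across candidates.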
- IBM: fixed passage length. Score is the sum of:
  - Matching-words measure: sum of idfs of overlapping terms
  - Thesaurus-match measure: sum of idfs of question words with synonyms in the document
  - Mismatch-words measure: sum of idfs of question words NOT in the document
  - Dispersion measure: number of words between matching query terms
  - Cluster-words measure: longest common substring
- SiteQ: Unit: n (= 3) sentences. Factor: match words by literal form, stem, or WordNet synonym; score is the sum of idfs of matched terms plus a density-weight score * overlap count
- Alicante: Unit: n (= 6) sentences. Factor: non-length-normalized cosine similarity
- ISI: Unit: sentence. Factors: weighted sum of proper-name match, query-term match, and stemmed match
Experiments
Retrieval:
- PRISE query: verbatim question
- Lucene query: conjunctive Boolean query (stopped)
Passage retrieval: 1000-word passages
- Uses the top 200 retrieved docs
- Finds the best passage in each doc
- Returns up to 20 passages
- Ignores original document rank and retrieval score
Pattern Matching
Litkowski pattern files, derived from NIST relevance judgments on systems.
Format: qid answer_pattern doc_list
A passage where answer_pattern matches is correct if it appears in one of the documents in the list.
MRR scoring:
- Strict: matching pattern in an official document
- Lenient: matching pattern anywhere
Examples
Example patterns:
1894 (190|249|416|440)(\s|\-)million(\s|\-)miles? APW19980705.0043 NYT19990923.0315 NYT19990923.0365 NYT20000131.0402 NYT19981212.0029
1894 700-million-kilometer APW19980705.0043
1894 416 - million - mile NYT19981211.0308
Ranked list of answer passages:
1894 0 APW19980601.0000 the castaway was
1894 0 APW19980601.0000 440 million miles
1894 0 APW19980705.0043 440 million miles
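Strict vs. lenient MRR over such pattern files can be sketched as follows (the data structures are assumptions; the question-1894 example above serves as toy input):

```python
import re

def mrr(questions, strict=True):
    """Mean reciprocal rank over Litkowski-style answer patterns.
    Each question is (answer_pattern, relevant_doc_ids, ranked_passages),
    where ranked_passages is a ranked list of (doc_id, passage_text).
    Strict scoring also requires the matching passage's document to be in
    the official relevant-document list; lenient requires only the match."""
    total = 0.0
    for pattern, rel_docs, passages in questions:
        for rank, (doc_id, text) in enumerate(passages, start=1):
            if re.search(pattern, text) and (not strict or doc_id in rel_docs):
                total += 1.0 / rank
                break
    return total / len(questions)

# The 1894 example above as toy input
q1894 = (r"(190|249|416|440)(\s|\-)million(\s|\-)miles?",
         {"APW19980705.0043", "NYT19990923.0315", "NYT19990923.0365"},
         [("APW19980601.0000", "the castaway was"),
          ("APW19980601.0000", "440 million miles"),
          ("APW19980705.0043", "440 million miles")])
print(mrr([q1894], strict=True))   # first match in an official doc: rank 3
print(mrr([q1894], strict=False))  # first pattern match anywhere: rank 2
```

The strict/lenient gap (1/3 vs. 1/2 here) shows how much credit comes from pattern matches in unofficial documents.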
Evaluation
MRR:
- Strict: matching pattern in an official document
- Lenient: matching pattern anywhere
Also: percentage of questions with NO correct answers.
Evaluation on Oracle Docs
Overall
- PRISE: higher recall, more correct answers
- Lucene: higher precision, fewer correct answers, but higher MRR
- Best systems: IBM, ISI, SiteQ; relatively insensitive to the retrieval engine
Analysis
Retrieval: Boolean systems (e.g. Lucene) are competitive, with good MRR, although Boolean systems are usually worse on ad-hoc retrieval.
Passage retrieval: differences are significant for PRISE and Oracle; not significant for Lucene -> boost recall.
Techniques: density-based scoring improves results; variants include proper-name exact match, cluster score, and density score.
Error Analysis
'What is an ulcer?': after stopping, the query is just 'ulcer', so term matching doesn't help. Need the question type!
Missing relations: for 'What is the highest dam?', passages match 'highest' and 'dam', but not together. Include syntax?