Other IR Models
Other IR Models

[Figure: a taxonomy of IR models.]

User Task:
  Retrieval: ad hoc, filtering
  Browsing

Classic Models: Boolean, vector, probabilistic

Set Theoretic: fuzzy, extended Boolean
Algebraic: generalized vector, latent semantic indexing, neural networks
Probabilistic: inference network, belief network

Structured Models: non-overlapping lists, proximal nodes
Browsing: flat, structure guided, hypertext
Another Vector Model: Motivation
1. Index terms have synonyms. [Use thesauri?]
2. Index terms have multiple meanings (polysemy). [Use restricted vocabularies or more precise queries?]
3. Index terms are not independent; think “phrases”. [Use combinations of terms?]
Latent Semantic Indexing/Analysis
Basic Idea: Keywords in a query are just one way of specifying the information need. One really wants to specify the key concepts rather than words.
Assume a latent semantic structure underlying the term-document data that is partially obscured by exact word choice.
LSI In Brief

Map terms into a lower-dimensional space (via SVD) to remove "noise" and force clustering of similar words.
Pre-process the corpus to create the reduced vector space.
Match queries to docs in the reduced space.
SVD for the Term-Document Matrix

  X    =   T0     S0     D0^T
(t x d)  (t x m)(m x m)(m x d)

where m is the rank of X (<= min(t, d)), T0 is the orthonormal matrix of eigenvectors of the term-term correlation matrix X X^T, D0 is the orthonormal matrix of eigenvectors of the doc-doc correlation matrix X^T X, and S0 is the diagonal matrix of singular values.
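The decomposition can be sketched with NumPy (the 4-term x 3-document matrix below is a made-up toy example, not data from the slides):

```python
import numpy as np

# Hypothetical 4-term x 3-document count matrix (toy data for illustration)
X = np.array([
    [2, 0, 1],
    [1, 1, 0],
    [0, 2, 1],
    [1, 0, 2],
], dtype=float)

# Thin SVD: X = T0 @ diag(S0) @ D0t, with orthonormal columns in T0 (t x m)
# and orthonormal rows in D0t (m x d)
T0, S0, D0t = np.linalg.svd(X, full_matrices=False)

# The factors reproduce X up to floating-point error
reconstruction = T0 @ np.diag(S0) @ D0t
```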
Reducing Dimensionality

Order the singular values in S0 by size, keep the k largest, and delete the other rows/columns of S0, T0, and D0 to form S, T, and D.
The approximate model X^ = T S D^T is the rank-k model with the best possible least-squares fit to X.
Pick k large enough to fit the real structure, but small enough to eliminate noise – usually ~100-300.
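The truncation step can be sketched as follows (same kind of made-up toy matrix; real applications keep k around 100-300):

```python
import numpy as np

# Hypothetical 4-term x 3-document matrix (toy data for illustration)
X = np.array([[2, 0, 1], [1, 1, 0], [0, 2, 1], [1, 0, 2]], dtype=float)
T0, S0, D0t = np.linalg.svd(X, full_matrices=False)  # S0 sorted descending

k = 2  # keep only the 2 largest singular values
T, S, Dt = T0[:, :k], S0[:k], D0t[:k, :]
X_hat = T @ np.diag(S) @ Dt  # best rank-k least-squares approximation of X
```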
Computing Similarities in LSI

How similar are two terms? The dot product between two row vectors of X^.
How similar are two documents? The dot product between two column vectors of X^.
How similar are a term and a document? The value of the corresponding cell of X^.
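A sketch of these dot-product similarities, using a hypothetical toy matrix reduced to rank 2:

```python
import numpy as np

# Hypothetical 4-term x 3-document matrix (toy data), reduced to rank 2
X = np.array([[2, 0, 1], [1, 1, 0], [0, 2, 1], [1, 0, 2]], dtype=float)
T0, S0, D0t = np.linalg.svd(X, full_matrices=False)
k = 2
X_hat = T0[:, :k] @ np.diag(S0[:k]) @ D0t[:k, :]

term_sims = X_hat @ X_hat.T  # entry (a, b): dot product of term rows a and b
doc_sims = X_hat.T @ X_hat   # entry (i, j): dot product of doc columns i and j
```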
Query Retrieval

As before, treat the query as a short document: make it column 0 of X^. The first row of X^T X^ then gives the similarity of each document to the query, i.e., the rank of the docs w.r.t. the query.
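A sketch of this approach, with a hypothetical toy matrix and query vector (scoring via row 0 of the doc-doc similarity matrix of the augmented, reduced matrix):

```python
import numpy as np

# Hypothetical 4-term x 3-document matrix and a query over the same terms
X = np.array([[2, 0, 1], [1, 1, 0], [0, 2, 1], [1, 0, 2]], dtype=float)
q = np.array([[1], [0], [0], [1]], dtype=float)

# Treat the query as a short document: column 0 of the augmented matrix
Xq = np.hstack([q, X])
T0, S0, D0t = np.linalg.svd(Xq, full_matrices=False)
k = 2
Xq_hat = T0[:, :k] @ np.diag(S0[:k]) @ D0t[:k, :]

# Row 0 of the doc-doc similarity matrix scores every document against the query
scores = (Xq_hat.T @ Xq_hat)[0, 1:]
ranking = np.argsort(-scores)  # document indices, best match first
```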
LSI Issues

Requires access to the whole corpus to compute the SVD. How can it be computed efficiently for the Web?
What is the right value of k?
Can LSI be used for cross-language retrieval?
The size of corpus that can be handled is limited: "one student's reading through high school" (Landauer 2002).
Other Vector Model: Neural Network

Basic idea:
A 3-layer neural net: query terms, document terms, documents.
Signal propagation is based on the classic similarity computation.
Tune the weights.
Neural Network Diagram (from Wilkinson and Hingston, SIGIR 1991)

[Figure: three layers of nodes. Query-term nodes (ka, kb, kc) feed the matching document-term nodes (k1 ... ka, kb, kc ... kt), which in turn feed the document nodes (d1 ... dj, dj+1 ... dN).]
Computing Document Rank

Weight from query term to document term:

  W_iq = w_iq / sqrt( Σ_i w_iq² )

Weight from document term to document:

  W_ij = w_ij / sqrt( Σ_i w_ij² )
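A minimal sketch of this normalization (the squared weights under the square root follow cosine normalization; the function name is illustrative):

```python
import math

def edge_weights(w):
    """Normalize raw term weights w_i to W_i = w_i / sqrt(sum_i w_i^2).

    The same normalization is used for query-term -> document-term edges
    and for document-term -> document edges.
    """
    norm = math.sqrt(sum(x * x for x in w))
    return [x / norm for x in w]
```

For example, `edge_weights([3.0, 4.0])` yields `[0.6, 0.8]`.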
Probabilistic Models

Principle: Given a user query q and a document d in the collection, estimate the probability that the user will find d relevant. (How?)
The user rates a retrieved subset. The system uses the ratings to refine the subset. Over time, the retrieved subset should converge on the relevant set.
Computing Similarity I

  sim(dj, q) = P(R | dj) / P(~R | dj)
             = [ P(dj | R) P(R) ] / [ P(dj | ~R) P(~R) ]    (by Bayes' rule)

where
  P(R | dj)  = probability that document dj is relevant to query q
  P(~R | dj) = probability that dj is non-relevant to query q
  P(dj | R)  = probability of randomly selecting dj from the relevant set R
  P(R)       = probability that a randomly selected document is relevant
(and analogously for ~R, the set of non-relevant documents).
Computing Similarity II

Assuming independence of index terms:

  sim(dj, q) ~ Σ_{i=1..t} w_iq × w_ij × [ log( P(ki|R) / (1 − P(ki|R)) ) + log( (1 − P(ki|~R)) / P(ki|~R) ) ]

where P(ki|R) is the probability that index term ki is present in a document randomly selected from R (and P(ki|~R) likewise for ~R).
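The ranking formula can be sketched directly (function and argument names are illustrative):

```python
import math

def prob_sim(w_q, w_d, p_rel, p_nonrel):
    """sim(d, q) ~ sum_i w_iq * w_ij * ( log(P(ki|R) / (1 - P(ki|R)))
                                       + log((1 - P(ki|~R)) / P(ki|~R)) ).

    w_q, w_d: query and document term weights; p_rel[i] = P(ki|R),
    p_nonrel[i] = P(ki|~R), one entry per index term.
    """
    return sum(
        wq * wd * (math.log(pr / (1.0 - pr)) + math.log((1.0 - pn) / pn))
        for wq, wd, pr, pn in zip(w_q, w_d, p_rel, p_nonrel)
    )
```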
Initializing Probabilities

Assume constant probabilities for index terms:

  P(ki|R) = 0.5

Assume the distribution of index terms in non-relevant documents matches the overall distribution:

  P(ki|~R) = df_i / N

where df_i is the document frequency of ki and N is the total number of documents.
Improving Probabilities

Assumptions:
Approximate the probability given relevance as the fraction of documents retrieved so far that contain index term ki:

  P(ki|R) = V_i / V

Approximate the probabilities given non-relevance by assuming documents not yet retrieved are non-relevant:

  P(ki|~R) = (df_i − V_i) / (N − V)

where V is the number of documents retrieved so far and V_i the number of those containing ki.
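Both the initial and the refined estimates can be sketched as follows (function names are illustrative):

```python
def initial_estimates(df, N):
    """Initial guesses: P(ki|R) = 0.5 and P(ki|~R) = df_i / N."""
    return [0.5] * len(df), [d / N for d in df]

def refined_estimates(df, N, Vi, V):
    """After retrieving V documents, Vi[i] of which contain term ki:
    P(ki|R) = Vi / V and P(ki|~R) = (df_i - Vi) / (N - V)."""
    return [v / V for v in Vi], [(d - v) / (N - V) for d, v in zip(df, Vi)]
```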
Classic Probabilistic Model Summary

Pros:
  ranking based on assessed probability of relevance
  can be approximated without user intervention
Cons:
  really needs the user to determine the set V
  ignores term frequency
  assumes independence of terms
Probabilistic Alternative: Bayesian (Belief) Networks

A graphical structure to represent the dependence between variables, in which the following holds:
1. a set of random variables for the nodes
2. a set of directed links
3. a conditional probability table for each node, indicating its relationship with its parents
4. the graph is directed and acyclic
Belief Network Example (from Russell & Norvig)

Network: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls.

  P(B) = .001        P(E) = .002

  B  E | P(A)
  T  T | .95
  T  F | .94
  F  T | .29
  F  F | .001

  A | P(J)           A | P(M)
  T | .90            T | .70
  F | .05            F | .01
Belief Network Example (cont.)

Probability of a false notification – the alarm sounded and both people called, but there was no burglary or earthquake:

  P(J ∧ M ∧ A ∧ ~B ∧ ~E)
    = P(J|A) P(M|A) P(A|~B,~E) P(~B) P(~E)
    = .9 × .7 × .001 × .999 × .998
    ≈ .00062
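The same chain-rule computation in code, using the CPT values of the burglary network:

```python
# CPT values from the Russell & Norvig burglary network
p_b, p_e = 0.001, 0.002    # P(B), P(E)
p_a_nb_ne = 0.001          # P(A | ~B, ~E)
p_j_a, p_m_a = 0.90, 0.70  # P(J | A), P(M | A)

# Chain rule over the network: P(J ^ M ^ A ^ ~B ^ ~E)
p_false_notification = p_j_a * p_m_a * p_a_nb_ne * (1 - p_b) * (1 - p_e)
```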
Inference Networks for IR

Random variables are associated with documents, index terms, and queries. An edge from a document node to a term node increases belief in that term.

[Figure: document node dj points to term nodes k1, k2, ..., ki, ..., kt; the term nodes feed the query node q directly and, through AND/OR nodes, the Boolean query node q1; q, q1, and q2 feed the information-need node I.]
Computing Rank in Inference Networks for IR

In the network of the previous slide, q is the keyword query, q1 is the Boolean query, and I is the information need. The rank of a document is computed as P(q ∧ dj):

  P(q ∧ dj) = Σ_k P(q | k) P(k | dj) P(dj)

where the sum ranges over the states k of the term nodes.
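A sketch of this sum over term-node states for a small hypothetical network (the names and the independence of term nodes given the document are simplifying assumptions of this sketch):

```python
from itertools import product

def rank_doc(p_q_given_k, p_term_on, p_d):
    """P(q ^ d) = sum over all on/off states k of the term nodes of
    P(q|k) * P(k|d) * P(d).

    p_q_given_k: function mapping a 0/1 state tuple to P(q|k);
    p_term_on[i]: probability that term node i is on given the document;
    p_d: prior P(d).
    """
    total = 0.0
    for state in product([0, 1], repeat=len(p_term_on)):
        p_k_given_d = 1.0
        for p_on, on in zip(p_term_on, state):
            p_k_given_d *= p_on if on else (1.0 - p_on)
        total += p_q_given_k(state) * p_k_given_d
    return total * p_d
```

For a Boolean "k1 AND k2" query where both terms occur in the document (each term node on with probability 1) and P(dj) = 1/4, this returns 0.25.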
Where Do the Probabilities Come From? (Boolean Model)

Uniform priors on documents:

  P(dj) = 1/N

Only terms in the document are active:

  P(ki | dj) = 1 if gi(dj) = 1; 0 otherwise

The query is matched to keywords à la the Boolean model:

  P(q | k) = 1 if ∃ qcc | (qcc ∈ qdnf) ∧ (∀ ki, gi(k) = gi(qcc)); 0 otherwise

where gi(·) gives the 0/1 value of index term ki, and qdnf is the disjunctive normal form of q with conjunctive components qcc.
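These three assignments can be sketched directly, representing term-node states and DNF components as 0/1 tuples (function names are illustrative):

```python
def p_doc(N):
    """Uniform prior: P(dj) = 1/N."""
    return 1.0 / N

def p_terms_given_doc(k_state, doc_terms):
    """P(k|dj) = 1 iff exactly the terms occurring in dj are active."""
    return 1.0 if all((i in doc_terms) == bool(on)
                      for i, on in enumerate(k_state)) else 0.0

def p_query_given_terms(k_state, q_dnf):
    """P(q|k) = 1 iff some conjunctive component of the query's DNF
    matches the active-term state."""
    return 1.0 if any(cc == k_state for cc in q_dnf) else 0.0
```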
Belief Network Formulation

different network topology
does not consider each document individually
adopts a set-theoretic view