Robust Query Expansion: Internship Closing Talk
Joshua V Dillon¹, Kevyn Collins-Thompson²
¹ Georgia Institute of Technology, Atlanta, Georgia
² Microsoft Research, Redmond, Washington
August 11, 2009
Note: This file can be opened in Adobe Illustrator for high resolution use.
Introduction Objective Experiments The Problem The Approach
What is query expansion? ... Who is John Galt?!
User submits the query term John Galt
Standard retrieval: documents without John Galt but with
Dagny Taggart will not be retrieved
Query expansion: the query is augmented with related terms, e.g.,
Atlas Shrugged and Ayn Rand, so those documents are
retrieved
Reduces query/document vocabulary mismatch by expanding the query using words or phrases with “similar meaning.”
And it's a BIG deal!
Large upside potential
Correct alteration: 6+ NDCG gain (oracle)
Query expansion research: 10-15% MAP gain
Many diverse approaches: alteration, expansion, reduction (long queries)
Josh Dillon Robust Query Expansion 2
Robust? Risk? Reward? Hogwash!
State-of-the-art query expansion methods perform well on average but have limited real-world deployment.
Risky: large variance across queries & optimal parameter settings
Increasingly complex decision environments
Personalization, implicit/explicit relevance, computation budget, ...
Need a framework for principled, selective query model estimation capable of handling diverse constraints...
Existing work:
Self-tuning methods, [Tao/Zhai, SIGIR ’06]
Non-convex, Expectation-Maximization
Expands relevant words “into” top-k documents
Picks relevant documents for fixed terms
Risk-aware methods, [Collins-Thompson, NIPS ’08]
Casts risk/reward as quadratic program with linear constraints
Domain knowledge: aspect balance/coverage, query support, ...
Picks (possibly zero) terms but has no notion of documents
My Contribution:
Model parameter space under large-scale computing environment
Improve results by employing translation model while providing a more theoretically motivated risk model
Unified framework which elegantly combines advantages of both self-tuning and risk-aware methods
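Definition
A query expansion is measured by the relative improvement it provides over no expansion, for a bounded positive performance measure $s(q)$, viz., $I_s(q, \tilde{q}) = 1 - s(q)/s(\tilde{q})$, where $\tilde{q}$ denotes the expanded query.
Definition
We use mean average precision (MAP) as the query performance measure $s(q)$; for a single query this is the average precision, viz.,
$$s(q) = |\mathrm{rel}(q, C)|^{-1} \sum_{k=1}^{N} P(q, k)\, \delta(d_k \in \mathrm{rel}(q, C))$$
$$P(q, k) = k^{-1} |\mathrm{rel}(q, F_k(q))|, \quad \mathrm{rel}(q, C) = \{\text{documents in } C \text{ relevant to } q\}$$
where $d_k$ is the document at rank $k$ and $F_k(q)$ the top-$k$ documents retrieved for $q$.
So, performance under a MAP criterion emphasizes returning more relevant documents higher in rank. Other measures include P5, P20, ...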
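A minimal sketch of the two definitions above, assuming a ranked list of document ids and a set of relevance judgments. The function names are illustrative and not from the talk.

```python
def average_precision(ranking, relevant):
    """Average precision of one ranked list.

    ranking  -- document ids in rank order (d_1, ..., d_N)
    relevant -- set of ids judged relevant, i.e. rel(q, C)
    """
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1            # |rel(q, F_k(q))|
            total += hits / k    # P(q, k), counted only at relevant ranks
    return total / len(relevant)


def relative_improvement(s_original, s_expanded):
    """I_s = 1 - s(q) / s(expanded q); positive when expansion helps."""
    return 1.0 - s_original / s_expanded
```

For example, a ranking with relevant documents at ranks 1 and 3 (out of two relevant documents) scores (1/1 + 2/3)/2.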
Definition
Risk represents the extent of downside loss in relative improvement of the expansion $\tilde{q}$ over the original query $q$, viz., $R(q, \tilde{q}) = -P(I_s(q, \tilde{q}) \le 0)\, E[I_s(q, \tilde{q}) \mid I_s(q, \tilde{q}) \le 0]$.
Definition
Conversely, reward represents the extent of upside gain in relative improvement, viz., $V(q, \tilde{q}) = P(I_s(q, \tilde{q}) > 0)\, E[I_s(q, \tilde{q}) \mid I_s(q, \tilde{q}) > 0]$.
This makes $E[I_s(q, \tilde{q})] = V(q, \tilde{q}) - R(q, \tilde{q})$ the overall expected relative improvement.
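The risk/reward split can be checked on a sample. The sketch below assumes we have a list of per-query relative improvements and replaces the probabilities and conditional expectations with empirical frequencies and averages; `risk_reward` is a hypothetical helper name.

```python
def risk_reward(improvements):
    """Empirical risk R and reward V from per-query relative improvements."""
    n = len(improvements)
    # P(I <= 0) * E[I | I <= 0] equals (sum of the non-positive values) / n.
    risk = -sum(i for i in improvements if i <= 0) / n
    # P(I > 0) * E[I | I > 0] equals (sum of the positive values) / n.
    reward = sum(i for i in improvements if i > 0) / n
    return risk, reward


sample = [0.2, -0.1, 0.4, 0.0, -0.3]
risk, reward = risk_reward(sample)
mean_improvement = sum(sample) / len(sample)
```

On any sample, `reward - risk` matches the plain mean of the improvements, which is the identity stated above.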
Introduction Objective Experiments Reward (Relevance Model) Risk (Uncertainty Model)
Wait a minute . . .
In some sense, current approaches actually do address the risk/reward tradeoff by interpolating the original query $q$ with the expanded query $\tilde{q}$, i.e.,
Naïve Risk/Reward Tradeoff
$$q' = \lambda q + (1 - \lambda)\tilde{q}, \quad \lambda \in [0, 1] \tag{1}$$
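As a small illustration of the interpolation in (1), the sketch below assumes both queries are represented as term-weight dictionaries (a representation chosen here for illustration, not stated on the slide).

```python
def interpolate(original, expanded, lam):
    """Return lam * original + (1 - lam) * expanded over the union of terms."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    terms = set(original) | set(expanded)
    return {
        t: lam * original.get(t, 0.0) + (1.0 - lam) * expanded.get(t, 0.0)
        for t in terms
    }


q = {"john": 0.5, "galt": 0.5}
q_expanded = {"galt": 0.2, "atlas": 0.4, "rand": 0.4}
q_mixed = interpolate(q, q_expanded, lam=0.5)
```

Setting `lam=1` recovers the original query (no expansion), and `lam=0` trusts the expansion completely; a single global `lam` is what makes this tradeoff naïve.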
Can we improve this tradeoff?
To account for uncertainty of a given expanded query $\tilde{q}$, we employ the quadratic program,
Robust Risk/Reward Tradeoff
$$\arg\min_{x \in \mathcal{X}} J(x) = -x^{T}\mu + \frac{\kappa}{2} x^{T}\Sigma x \tag{2}$$
where,
$x = [x_R; x_{\bar{R}}]$, $x_R = [P(R_1), \ldots, P(R_m)]^{T}$, $x_{\bar{R}} = 1 - x_R$
$\mathcal{X}$ encodes our domain knowledge
$\mu_i$ represents our expected belief in relevance of term $i$,
and $\Sigma_{ij}$ the risk of terms $i, j$
This objective is the robust counterpart of a linear program with ellipsoidal uncertainty set and is theoretically motivated by Ben-Tal & Nemirovski, OR Letters ’99.
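To make the objective concrete, here is a sketch that minimizes it by projected gradient descent. It is a simplification under stated assumptions: the feasible set is taken to be the plain box [0, 1]^n (standing in for the domain-knowledge constraints), the coupling between the relevant and non-relevant halves of x is ignored, and Σ is assumed symmetric positive semidefinite and nonzero. The talk itself does not say which solver was used.

```python
import numpy as np


def objective(x, mu, sigma, kappa):
    """J(x) = -x^T mu + (kappa / 2) x^T Sigma x."""
    return -x @ mu + 0.5 * kappa * x @ sigma @ x


def minimize_box(mu, sigma, kappa, steps=500):
    """Projected gradient descent on J(x) over the box [0, 1]^n."""
    x = np.full(len(mu), 0.5)
    # Step size 1/L, where L = kappa * lambda_max(Sigma) bounds the curvature.
    step = 1.0 / (kappa * np.linalg.eigvalsh(sigma).max())
    for _ in range(steps):
        grad = -mu + kappa * sigma @ x
        x = np.clip(x - step * grad, 0.0, 1.0)  # project back onto the box
    return x


mu = np.array([0.5, 2.0, -1.0])
sigma = np.eye(3)
x_star = minimize_box(mu, sigma, kappa=2.0)
```

With a diagonal Σ the problem separates per term, so the minimizer is just `mu / kappa` clipped to [0, 1]; larger `kappa` shrinks every weight toward zero, which is the risk-averse behavior the objective is meant to encode.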
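We find $\mu_i$ as,
$$\mu_i = E[R_i \mid q, \alpha, \beta] = P(R_i \mid \alpha)\,\delta(w_i \in q) + P(R_i \mid \beta)\,\delta(w_i \notin q) \tag{3}$$
with,
$$P(R_i \mid \alpha) \triangleq \alpha + (1 - \alpha) P(R_i \mid w_i)$$
$$P(R_i \mid \beta) \triangleq \beta P(R_i \mid w_i)$$
using $P(R_i \mid w_i) = \frac{P(w_i \mid R_i)}{P(w_i \mid R_i) + P(w_i \mid \bar{R}_i)}$ and assuming $P(R_i) = P(\bar{R}_i) = 1/2$.
Hence $\mu_i$ is cast as a function of $P(w_i \mid R_i)$, $P(w_i \mid \bar{R}_i)$, which we obtain from a query expansion algorithm, as follows.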
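The computation of the expected relevance for a single term can be sketched as follows. The function and argument names are illustrative; the two likelihoods are assumed to come from whichever expansion algorithm is plugged in.

```python
def expected_relevance(p_w_rel, p_w_nonrel, in_query, alpha, beta):
    """mu_i for one term.

    p_w_rel    -- P(w_i | R_i), likelihood under the relevant model
    p_w_nonrel -- P(w_i | not R_i), likelihood under the non-relevant model
    in_query   -- True if the term occurs in the original query
    """
    # Bayes' rule with equal priors P(R_i) = P(not R_i) = 1/2.
    posterior = p_w_rel / (p_w_rel + p_w_nonrel)
    if in_query:
        return alpha + (1.0 - alpha) * posterior  # query terms get a floor of alpha
    return beta * posterior                       # expansion terms are discounted


mu_query_term = expected_relevance(0.03, 0.01, True, alpha=0.5, beta=0.8)
mu_expansion_term = expected_relevance(0.03, 0.01, False, alpha=0.5, beta=0.8)
```

Here both terms have the same posterior of 0.75, yet the query term ends up with the higher expected relevance, reflecting the extra trust placed in terms the user actually typed.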
Ponte/Lavrenko Relevance Model
Standard query expansion of the Lemur toolkit
Works surprisingly well in practice (when it works, that is...)
1 $P(w) \approx |C|^{-1} \sum_{d \in C} tf(w, d)$
2 $P(w \mid d) \approx tf(w, d)$
3 Return words and relevance,
$$P(R_i \mid q) \propto \sum_{d \in F_k(q)} e^{-s'(d)} P(w_i \mid d),$$
as sorted by $P(w_i \mid F_k(q)) / P(w_i)$.
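A rough sketch of step 3, under several assumptions that are not on the slide: the per-document score s'(d) is supplied by the caller, tf values are already normalized within each document, and P(w | F_k(q)) is approximated by the average of P(w | d) over the feedback documents. This is not the Lemur implementation.

```python
import math


def relevance_model(feedback_tf, doc_scores, collection_prob):
    """Score and rank candidate expansion terms.

    feedback_tf     -- {doc: {word: normalized tf}} for the top-k documents
    doc_scores      -- {doc: s'(d)} (assumed given)
    collection_prob -- {word: P(w)} over the whole collection
    """
    weights = {}  # unnormalized P(R_i | q)
    pooled = {}   # assumed estimate of P(w | F_k(q))
    k = len(feedback_tf)
    for doc, tf in feedback_tf.items():
        doc_weight = math.exp(-doc_scores[doc])
        for word, p in tf.items():
            weights[word] = weights.get(word, 0.0) + doc_weight * p
            pooled[word] = pooled.get(word, 0.0) + p / k
    total = sum(weights.values())
    relevance = {w: v / total for w, v in weights.items()}
    ranked = sorted(relevance, key=lambda w: pooled[w] / collection_prob[w], reverse=True)
    return ranked, relevance


feedback = {
    "d1": {"mauser": 0.5, "rifle": 0.5},
    "d2": {"mauser": 0.5, "the": 0.5},
}
ranked, relevance = relevance_model(
    feedback,
    doc_scores={"d1": 0.0, "d2": 0.0},
    collection_prob={"mauser": 0.01, "rifle": 0.02, "the": 0.5},
)
```

Dividing by the collection probability is what pushes common words such as "the" to the bottom of the ranking even when they are frequent in the feedback documents.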
Tao/Zhai Relevance Model
Use EM to estimate a mixture of word (non-)relevance multinomials, regularized by the original query.
Interesting twist #1: gradually relax the effect of the query as a prior
Interesting twist #2: quit after expected relevance reaches a certain threshold.
Goal: eliminate interpolation, as $\theta_R$ should be the interpolated query expansion. Such interpolation, we can suppose, will be smoother than the naïve tradeoff.
A bit more detail. . .
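1 E-step:
$$P(Z_{w,d}) = \frac{\alpha_d P(w \mid \theta_R)}{\alpha_d P(w \mid \theta_R) + (1 - \alpha_d) P(w \mid \theta_N)}$$
2 M-step:
$$\alpha_d = \sum_{w \in V} P(Z_{w,d})\, tf(w, d)$$
$$P(w \mid \theta_R) = \frac{\mu P(w \mid \theta_q) + \sum_{d \in F_k(q)} c(w, d) P(Z_{w,d})}{\mu + \sum_{w' \in V} \sum_{d \in F_k(q)} c(w', d) P(Z_{w',d})}$$
$$\mu \leftarrow \delta\mu$$
3 quit when expected relevance is greater than $\mu$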
. . . you’re feeling sleeeeepy, so sleeeeeeepy
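The E-step and M-step can be sketched in a few lines. This is a simplified illustration, not the Tao/Zhai reference code: tf(w, d) is taken to be the within-document relative frequency, the non-relevant model θ_N is a fixed background distribution with positive probability for every word, and the loop runs a fixed number of iterations instead of applying the stopping rule in step 3. All names and default values are assumptions.

```python
def tao_zhai_em(counts, query_model, background, mu=10.0, delta=0.9, iters=20):
    """counts: {doc: {word: c(w, d)}}; query_model, background: {word: prob}."""
    vocab = sorted(background)
    alpha = {d: 0.5 for d in counts}
    theta_r = {w: 0.5 * query_model.get(w, 0.0) + 0.5 * background[w] for w in vocab}
    for _ in range(iters):
        # E-step: posterior that an occurrence of w in d was drawn from theta_R.
        z = {}
        for d, c in counts.items():
            z[d] = {}
            for w in c:
                rel = alpha[d] * theta_r[w]
                z[d][w] = rel / (rel + (1.0 - alpha[d]) * background[w])
        # M-step: alpha_d is the tf-weighted mean posterior within document d.
        for d, c in counts.items():
            length = sum(c.values())
            alpha[d] = sum(z[d][w] * n / length for w, n in c.items())
        # M-step: relevance model, with the query acting as a prior of strength mu.
        expected = {
            w: sum(c.get(w, 0) * z[d].get(w, 0.0) for d, c in counts.items())
            for w in vocab
        }
        norm = mu + sum(expected.values())
        theta_r = {w: (mu * query_model.get(w, 0.0) + expected[w]) / norm for w in vocab}
        mu *= delta  # gradually relax the query prior
    return theta_r, alpha


docs = {
    "d1": {"mauser": 3, "rifle": 2, "the": 5},
    "d2": {"mauser": 1, "the": 9},
}
theta_r, alpha = tao_zhai_em(
    docs,
    query_model={"mauser": 1.0},
    background={"mauser": 0.01, "rifle": 0.01, "the": 0.98},
)
```

Because the background model already explains "the" well, its occurrences are mostly attributed to θ_N, and the relevance model concentrates on the query term and its co-occurring content words.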
Recall our objective,
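$$\arg\min_{x \in \mathcal{X}} -x^{T}\mu + \frac{\kappa}{2} x^{T}\Sigma x$$
We construct $\Sigma$ as a super-matrix, viz.,
$$\Sigma = \begin{bmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{bmatrix} \tag{4}$$
We now examine 2 × 2 approaches for estimating $\Sigma$ and the motivation behind each.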
On one hand, we can interpret $\Sigma_1, \Sigma_2$ as intrinsic term-term uncertainty, possibly suggesting $\Sigma_1 \triangleq \Sigma_2 \triangleq \Sigma_R$.
Alternatively, we could posit the uncertainty set varies for relevant and non-relevant terms, i.e., $\Sigma_1 \triangleq \Sigma_R$, $\Sigma_2 \triangleq \Sigma_{\bar{R}}$.
In both cases our source of relevance information comes from the top-k (feedback) documents for a given query, denoted $F_k(q)$. The non-relevant uncertainty $\Sigma_{\bar{R}}$ could be estimated from the bottom-k documents or a secondary dataset.
Constructing Σ: jac
Smoothed Jaccard similarity heuristic (previous work).
Jaccard similarity coefficient
Measures similarity between sample sets (no longer treating documents as multisets) and is defined as, $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$
$$D_{ij} \overset{\exp}{\propto} J_{ij} \tag{5}$$
$$S_{ij} = \gamma \exp\left\{-\frac{1}{\sigma^2} D_{ij}\right\} \tag{6}$$
$$\Sigma_{ij} = \begin{cases} \|S(i, q)\|_p, & i = j \\ S(i, j), & i \neq j \end{cases} \tag{7}$$
Use “dilated” Jaccard coefficient to quantify word-word similarity
Set diagonal elements of $\Sigma$ to “distance from query”
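The two building blocks that are fully specified above, the Jaccard coefficient and the dilation in (6), can be sketched as follows. Representing each word by the set of documents it occurs in is an assumption made for the example; the mapping from J to D in (5) is left to the caller.

```python
import math


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def dilated_similarity(distance, gamma=1.0, sigma=1.0):
    """S = gamma * exp(-distance / sigma^2), as in (6)."""
    return gamma * math.exp(-distance / sigma ** 2)


docs_with_mauser = {1, 2, 3}
docs_with_rifle = {2, 3, 4}
overlap = jaccard(docs_with_mauser, docs_with_rifle)
```

Here the two words share two of four documents, so the coefficient is one half; `sigma` controls how quickly similarity decays with distance.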
Constructing Σ: hco
Heat kernel-based stochastic translation of word co-occurrence distributions (new work).
1 Estimate word co-occurrence distributions
2 Compute normalized graph Laplacian of geodesic distances between the distributions above
3 Compute expected word-word distance under this translation
$$\Sigma_{ij} = \begin{cases} \text{expected (under translation) word-query distance}, & i = j \\ \text{expected (under translation) word-word distance}, & i \neq j \end{cases} \tag{8}$$
Estimating $T_{ij} = P(w_i \to w_j)$ [hco, 1 of 6]
General approach: diffusion kernel $K_t(q_u, q_v)$ on graph $(V, E)$ whose nodes are distributions that correspond to words
V: each vertex is a contextual distribution $q_v(w) = P(w \mid v)$ corresponding to a word $v$
$$q_v(w) \propto \sum_{d} tf(w, d)\, tf(v, d)$$
E: graph edge weights are the Fisher diffusion kernel on the multinomial simplex
$$e(u, v) = \exp\left(-\frac{1}{\sigma^2} \arccos^2\left(\sum_{w} \sqrt{q_u(w)\, q_v(w)}\right)\right)$$
T is from diffusion kernel on $(V, E)$
$$T \propto \exp(-tL)$$
where $L$ is the normalized Laplacian
$t$ controls the amount of translation: $\lim_{t \to 0} T = I$ and $\lim_{t \to \infty} T = \text{stationary}$
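The construction above can be sketched end to end with NumPy. Several details are assumptions made for the example and are not stated on the slide: self-loops are removed from the graph, the symmetric form of the normalized Laplacian is used, the matrix exponential is computed by eigendecomposition, and the proportionality is resolved by normalizing each row to sum to one. Every word is assumed to occur in at least one document.

```python
import numpy as np


def translation_matrix(tf, sigma=1.0, t=1.0):
    """tf: (documents x words) term-frequency matrix. Returns row-stochastic T."""
    # Contextual distributions: q_v(w) proportional to sum_d tf(w, d) tf(v, d).
    cooc = tf.T @ tf
    q = cooc / cooc.sum(axis=1, keepdims=True)
    # Edge weights from the Fisher geodesic distance arccos(sum_w sqrt(q_u q_v)).
    bc = np.clip(np.sqrt(q) @ np.sqrt(q).T, 0.0, 1.0)
    w = np.exp(-np.arccos(bc) ** 2 / sigma ** 2)
    np.fill_diagonal(w, 0.0)  # assumption: no self-loops
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # Heat kernel exp(-t L) via the eigendecomposition of the symmetric L.
    vals, vecs = np.linalg.eigh(lap)
    kernel = np.clip((vecs * np.exp(-t * vals)) @ vecs.T, 0.0, None)
    return kernel / kernel.sum(axis=1, keepdims=True)


# Three words: the first two share documents, the third appears alone.
tf = np.array([[2.0, 1.0, 0.0],
               [1.0, 1.0, 0.0],
               [0.0, 0.0, 3.0]])
T = translation_matrix(tf, t=1.0)
T_small_t = translation_matrix(tf, t=1e-9)
```

In this toy corpus the first word translates into the second far more readily than into the third, and shrinking `t` toward zero recovers the identity, matching the limit stated above.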
Expected Distance [hco, 2 of 6]
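Two words $x, w$ stochastically translate into words $y, z$ and are represented by unit vectors $\theta^{mle}_y = 1_y$ and $\theta^{mle}_z = 1_z$.
Distance $d(\theta^{mle}_y, \theta^{mle}_z)$ is a random variable, summarized by its expectation (given in closed form), i.e.,
$$\begin{aligned} E_{p(y|x)p(z|w)} \|\theta^{mle}_y - \theta^{mle}_z\|_2^2 ={}& N_1^{-2} \sum_{i=1}^{N_1} \sum_{j \in \{1,\ldots,N_1\} \setminus \{i\}} (TT^{\top})_{x_i, x_j} \\ &+ N_2^{-2} \sum_{i=1}^{N_2} \sum_{j \in \{1,\ldots,N_2\} \setminus \{i\}} (TT^{\top})_{w_i, w_j} \\ &- 2 N_1^{-1} N_2^{-1} \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} (TT^{\top})_{x_i, w_j} + N_1^{-1} + N_2^{-1}. \end{aligned}$$
Note: obviously this formula is more general than needed, as in our case $N_1 = N_2 = 1$.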
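The closed form can be checked numerically in the single-word case used in the talk. The sketch below assumes T is a row-stochastic NumPy array indexed by word id and compares the closed form against a direct enumeration over all translated pairs; both function names are illustrative.

```python
import numpy as np


def expected_sq_distance(T, xs, ws):
    """Closed-form E||theta_y - theta_z||^2 for word-id sequences xs and ws."""
    g = T @ T.T
    n1, n2 = len(xs), len(ws)
    within_x = sum(g[xs[i], xs[j]] for i in range(n1) for j in range(n1) if i != j)
    within_w = sum(g[ws[i], ws[j]] for i in range(n2) for j in range(n2) if i != j)
    cross = sum(g[x, w] for x in xs for w in ws)
    return (within_x / n1 ** 2 + within_w / n2 ** 2
            - 2.0 * cross / (n1 * n2) + 1.0 / n1 + 1.0 / n2)


def brute_force_single(T, x, w):
    """Direct expectation for N1 = N2 = 1 by enumerating all (y, z) pairs."""
    n = T.shape[0]
    # Two distinct unit vectors are at squared distance 2, identical ones at 0.
    return sum(T[x, y] * T[w, z] * (0.0 if y == z else 2.0)
               for y in range(n) for z in range(n))


T = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
closed = expected_sq_distance(T, [0], [1])
direct = brute_force_single(T, 0, 1)
```

For single words the formula reduces to 2 − 2(TTᵀ)ₓ,w, so two words are close exactly when their translation distributions overlap heavily.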
Example, Simplex [hco, 3 of 6]
[Figure: simplex with points labeled $q_{Galt}$, $q_{Dagny}$, $q_{Microsoft}$]
Example, expected distances near “german” [hco, 4 of 6]
[Figure: plot titled “Terms Near 'german'”, axis scale 0 to 3 × 10⁻⁴. Terms shown: steyr, walther, luger, submachine, pistols, browning, recoil, bolt, carbines, nagant, garand, pistol, mosin, revolvers, shotguns, enfield, muzzle, arms, 1938, carbine]
Example, expected distances far from “german” [hco, 5 of 6]
[Figure: plot titled “Terms Far From 'german'”, axis scale 0 to 0.4. Terms shown: anfa, aufsaetze, dichtung, herausgegeben, gegenwart, hrsg, literatur, engen, geburtstag, festschrift, ege, geschichte, studien, eber, wwl, wwjd, wwj, camarillo, crosman]
Large Deviation Interpretation [hco, 6 of 6]
By the Chernoff-Stein lemma, KL-divergence is the best exponent in the probability of type II error (with bounded type I error), i.e.,
$$\beta_n^{opt} \approx \exp(-\gamma n D(q_u \| q_v)).$$
Examining the Taylor series expansion of KL-divergence for nearby $q_u, q_v$, one also finds that for the Fisher geodesic distance, $d(p, q)$,
$$d^2(q_u, q_v) \approx 2 D(q_u \| q_v).$$
Thus one may interpret the heat kernel translation model as being based on a graph whose edge weights approximate the optimal error rate of a test of $Q = q_u$ vs. $Q = q_v$.
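The local approximation can be checked numerically. The sketch assumes the convention d(p, q) = 2 arccos(Σ √(p q)) for the Fisher geodesic distance on the simplex; the factor of 2 is a normalization choice that the edge-weight formula earlier can absorb into σ.

```python
import math


def kl(p, q):
    """KL divergence D(p || q) for discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fisher_geodesic(p, q):
    """Fisher geodesic distance on the simplex, d = 2 arccos(sum sqrt(p q))."""
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, bc))  # guard against rounding above 1


p = [0.5, 0.3, 0.2]
q = [0.51, 0.29, 0.2]
ratio = fisher_geodesic(p, q) ** 2 / (2.0 * kl(p, q))
```

For nearby distributions the ratio is close to 1; it drifts away as the two distributions move apart, since the relation only holds to second order.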
Introduction Objective Experiments Results
Game-plan:
Compiled Matlab
Microsoft Computing Resources
+ Hyperparameter Sweep
kajabillions of embarrassingly parallel experiments
Reality:
Devil’s in the details. . .
[Sad Seattle Josh]
Robust Tao/Zhai, hco of query: “1938 german mauser”
[Figure: plot titled “Term Relevance for '1938 german mauser'”, axis scale 0 to 1. Terms shown: k98, acp, blued, german, mauser, 1938, webley, beretta, steyr, ordnance, walther, pistol, mausers, shotguns, rifles, suhl, p38, carcano, holster, pistols, bayonets, cod3, mosin, enfield, rifle, alyea, nagant, bayonet, bolt, carbine, luger, carbines, mannlicher, 98k, zbrojovka, muzzle, grenade, sten, waffen, garand, browning, scabbard, arms, revolvers, recoil, caliber, submachine, abbreviation, wwii, deutschen]
Robust Ponte/Lavrenko, hco of query: “1938 german mauser”
[Figure: term relevance plot, axis scale 0 to 1. Terms shown: 98k, k98, sniper, german, 1938, mauser, 1940, 1943, 1944, 1939, 1941, 1945, british, war, 1937, 1942, 1935, 1911, 1934, army, werke, wwi, mannlicher, 1936, stamped, germany, marked, rifles, pistol, grenade, magazine, artillery, waffen, production, ii, carbine, barrel, rifle, military, infantry, pistols, beretta, machine, walther, shotgun, grips, wwii, p38, bolt, luger]
Contributions/Closing Remarks
Employed heat kernel-based stochastic translation as a risk model for query expansion
Presented initial results for a term- and document-aware risk/reward query expansion model
Conducted initial analysis of hyperparameter space to isolate key parameter interactions
Built large-scale Matlab experiment test-bed using MS Computing Resources
Continue to formulate a more “elegant” unification of the Tao/Zhai relevance model directly into the optimization objective
Thanks!
Related Work:
Kevyn Collins-Thompson, NIPS 2008
Aharon Ben-Tal & Arkadi Nemirovski, OR Letters 1999
Victor Lavrenko, James Allan, SIGIR 2005
Tao Tao, ChengXiang Zhai, SIGIR 2006
Joshua V Dillon, et al., UAI 2007