structlm-labtalk

Transcript of structlm-labtalk (8/2/2019)

• Slide 1/24

Incorporating Language Modeling into the Inference Network Retrieval Framework

    Don Metzler

• Slide 2/24

    Motivation

Great deal of information lost when forming queries
Example: stemming information retrieval

InQuery: informal (tf.idf observation estimates), structured queries via the inference network framework

Language Modeling: formal (probabilistic model of documents), unstructured queries

InQuery + Language Modeling: formal, structured

• Slide 3/24

    Motivation

Simple idea:
Replace tf.idf estimates in the inference network framework with language modeling estimates

Result is a system based on ideas from language modeling that allows powerful structured queries

Overall goal:
Do as well as, or better than, InQuery within this more formal framework

• Slide 4/24

    Outline

    Review

    Inference Network Framework

Language Modeling

Combined Approach

    Results

• Slide 5/24

    Review of Inference Networks

    Directed acyclic graph

Compactly represents a joint probability distribution over a set of continuous and/or discrete random variables

Each node has a conditional probability table associated with it

Network topology defines conditional independence assumptions among nodes

    In general, inference is NP-hard

• Slide 6/24

    Inference Network Framework

    Node types

    document (di)

    concept (ri)

    query (qi)

    information need (I)

Set evidence at document nodes

Run belief propagation

Documents are scored by P(I = true | di = true)

• Slide 7/24

    Network Semantics

    All events in network are binary

    Events associated with each node:

di: document i is observed
ri: representation concept i is observed
qi: query representation i is observed
I: information need is satisfied

• Slide 8/24

    Query Language

• Slide 9/24

    Example Query

Unstructured: stemming information retrieval

Structured: #wand(1.5 #syn(#phrase(information retrieval) IR) 2.0 stemming)

• Slide 10/24

Belief Propagation

Want to compute bel(n) for each node n in the network (bel(n) = P(n = true | di = true))

Term/proximity node beliefs (InQuery):

bel(r) = db + (1 - db) · tf'_{r,di} · idf'_r

tf'_{r,di} = tf_{r,di} / (tf_{r,di} + 0.5 + 1.5 · |di| / |D|avg)

idf'_r = log((|C| + 0.5) / tf_{r,C}) / log(|C| + 1)

db = default belief
tf_{r,di} = number of times representation r is matched in document di
tf_{r,C} = number of times r is matched in the collection
|di| = length of document i
|D|avg = average doc. length
|C| = collection length
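For concreteness, the belief above can be computed directly. A minimal sketch, assuming the conventional InQuery default belief of 0.4:

```python
import math

def inquery_belief(tf, doc_len, avg_doc_len, cf, coll_len, db=0.4):
    """InQuery tf.idf belief for a term/proximity node.

    tf       -- tf_{r,di}, matches of representation r in document di
    cf       -- tf_{r,C}, matches of r in the whole collection
    coll_len -- |C|, collection length
    db       -- default belief (0.4 is the conventional InQuery value)
    """
    tf_part = tf / (tf + 0.5 + 1.5 * doc_len / avg_doc_len)
    idf_part = math.log((coll_len + 0.5) / cf) / math.log(coll_len + 1)
    return db + (1 - db) * tf_part * idf_part
```

When tf = 0 the whole tf component vanishes and the belief falls back to db, which is the "crude smoothing via default belief" noted later in the talk.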

• Slide 11/24

Belief Nodes

In general, marginalization is very costly

Assuming a nice functional form, via link matrices, marginalization becomes easy

p1, …, pn are the beliefs at the parent nodes of q; W = w1 + … + wn

bel_not(q) = 1 - p1
bel_or(q) = 1 - (1 - p1) · … · (1 - pn)
bel_and(q) = p1 · p2 · … · pn
bel_max(q) = max(p1, …, pn)
bel_sum(q) = (p1 + … + pn) / n
bel_wsum(q) = (w1·p1 + … + wn·pn) / W
bel_wand(q) = ∏_i pi^(wi / W)
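The closed-form operators translate line-for-line into code. A minimal sketch, with parent beliefs p and weights w as plain Python lists:

```python
import math

def bel_not(p):  return 1 - p[0]
def bel_or(p):   return 1 - math.prod(1 - pi for pi in p)
def bel_and(p):  return math.prod(p)
def bel_max(p):  return max(p)
def bel_sum(p):  return sum(p) / len(p)

def bel_wsum(p, w):
    # normalized arithmetic weighted average of the parent beliefs
    return sum(wi * pi for wi, pi in zip(w, p)) / sum(w)

def bel_wand(p, w):
    # normalized geometric weighted average: prod_i p_i^(w_i / W)
    W = sum(w)
    return math.prod(pi ** (wi / W) for wi, pi in zip(w, p))
```

Each operator is a closed-form function of the parents' beliefs, so belief propagation reduces to evaluating these bottom-up; no explicit marginalization over the 2^n parent configurations is needed.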

• Slide 12/24

Link Matrix Example

(figure: link matrix for a query node Q with parents A and B)

• Slide 13/24

Language Modeling

Models document generation as a stochastic process

Assume words are drawn i.i.d. from an underlying multinomial distribution

Query likelihood model:

P(Q = q1 … qn | d) = ∏_{q ∈ Q} P(q | d)

Use smoothed maximum likelihood estimate:

P(w | d) = λ · tf_{w,d} / |d| + (1 - λ) · cf_w / |C|
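A toy version of the smoothed estimate and the query likelihood score, with documents as plain token lists (the example strings are illustrative, not from the talk):

```python
import math
from collections import Counter

def p_w_given_d(w, doc, collection, lam=0.6):
    """Jelinek-Mercer smoothed estimate:
    P(w|d) = lam * tf_{w,d}/|d| + (1 - lam) * cf_w/|C|"""
    return (lam * Counter(doc)[w] / len(doc)
            + (1 - lam) * Counter(collection)[w] / len(collection))

def query_likelihood(query, doc, collection, lam=0.6):
    """P(Q|d) = product of P(q|d) over the query terms."""
    return math.prod(p_w_given_d(q, doc, collection, lam) for q in query)
```

The collection-frequency term keeps every P(w|d) strictly positive for words seen anywhere in the collection, so the product over query terms never collapses to zero on a single missing term.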

• Slide 14/24

Inference Network + LM

Rather than use tf.idf estimates for bel(r), use smoothed language modeling estimates:

bel(r_i) = P(r_i | d) = λ · tf_{r_i,d} / |d| + (1 - λ) · cf_{r_i} / |C|

Use Jelinek-Mercer smoothing throughout for simplicity
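In code, the substitution replaces only the term/proximity node belief; everything else in the network stays the same. A minimal sketch:

```python
def lm_belief(tf, doc_len, cf, coll_len, lam=0.6):
    """Smoothed LM belief for a representation node r_i:
    bel(r_i) = P(r_i|d) = lam * tf_{r,d}/|d| + (1 - lam) * cf_r/|C|.

    tf counts matches of the representation (a term, a #phrase,
    an unordered window, ...), so the same estimate covers
    proximity nodes as well as single terms.
    """
    return lam * tf / doc_len + (1 - lam) * cf / coll_len
```

Because the estimate stays positive whenever cf > 0, the products taken by #and/#wand operators remain well-defined even for representations a document does not match.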

• Slide 15/24

    Combining Evidence

InQuery combines query evidence via the #wsum operator, i.e. all queries are of the form #wsum( )

#wsum does not work for the combined model: the resulting scoring function lacks an idf component

Must use #wand instead

Both can be interpreted as normalized weighted averages:
arithmetic (InQuery)
geometric (combined model)
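A small numeric illustration of why the averaging style matters (the belief values are made up for the example): an arithmetic average cannot distinguish a document that matches both query terms from one that piles all its mass on a single term, while a geometric average sharply penalizes the one-sided document.

```python
import math

balanced = [0.30, 0.30]   # moderate belief in both query terms
skewed   = [0.59, 0.01]   # high belief in one term, near-miss on the other

def wsum(p):  # arithmetic average (#wsum with equal weights)
    return sum(p) / len(p)

def wand(p):  # geometric average (#wand with equal weights)
    return math.prod(p) ** (1 / len(p))

# wsum scores the two documents identically (0.30 each);
# wand prefers the balanced document by roughly 4x.
```

This discounting of documents that miss query components is how the geometric #wand combination recovers an idf-like effect that #wsum loses under LM estimates.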

• Slide 16/24

#wsum vs. #wand

• Slide 17/24

    Relation to Query Likelihood

The model subsumes the query likelihood model

Given a query Q = q1, q2, …, qn (each qi a single term), convert it to the following structured query:

#and(q1 q2 … qn)

The result is the query likelihood model

• Slide 18/24

    Smoothing

InQuery: crude smoothing via default belief

Proximity node smoothing:
Single term smoothing
Other proximity node smoothing

Each type of proximity node can be smoothed differently

• Slide 19/24

    Experiments

Data sets:
TREC 4 ad hoc (manual & automatic queries)
TREC 6, 7, and 8 ad hoc

Comparison:
Query likelihood (QL)
InQuery
Combined approach (StructLM)

Single term node smoothing = 0.6
Other proximity node smoothing = 0.1

• Slide 20/24

    Example Query

Topic: Is there data available to suggest that capital punishment is a deterrent to crime?

Manual structured query:
#wsum(1.0 #wsum(1.0 capital 1.0 punishment
                1.0 deterrent 1.0 crime
                2.0 #uw20(capital punishment deterrent)
                1.0 #phrase(capital punishment)
                1.0 #passage200(1.0 capital 1.0 punishment
                                1.0 deterrent 1.0 crime
                                1.0 #phrase(capital punishment)))

• Slide 21/24

• Slide 22/24

    Proximity Node Smoothing

• Slide 23/24

    Conclusions

    Good structured queries help

Combines the inference network's structured query language with formal language modeling probability estimates

    Performs competitively against InQuery

    Subsumes query likelihood model

• Slide 24/24

    Future Work

    Smoothing

    Try other smoothing techniques

Find optimal parameters for each node type

Combine LM and tf.idf document representations

    Other estimates for bel(r)

    Theoretical considerations