A Non-parametric Bayesian Approach [WSDM’14]



In this work, we study the problem of user modeling in search log data and propose a generative model, dpRank, within a non-parametric Bayesian framework. By postulating generative assumptions about a user's search behaviors, dpRank jointly identifies each individual user's latent search interests and his/her distinct result preferences. Experimental results on a large-scale news search log data set validate the effectiveness of the proposed approach, which not only provides an in-depth understanding of a user's search intents but also benefits a variety of personalized applications.

Methods

User Modeling in Search Engine Logs

Hongning Wang, Advisor: ChengXiang Zhai

Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA

{wang296,czhai}@Illinois.edu

[Figure: latent user groups (Group 1, …, Group k, …, Group c), each shown over features f1 and f2]

A fully generative model for exploring users’ search behaviors

1. Draw latent user groups from DP

2. Draw group membership for each user from DP

3. To generate a query for user u:

3.1 Draw a latent user group c

3.2 Draw query $q_i$ for user u accordingly

3.3 Draw click preferences for $q_i$ accordingly
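The generative steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it uses a truncated stick-breaking approximation with a fixed number of groups `K`, one-dimensional query features, and a Dirichlet-style draw as a finite stand-in for each user's group membership. All function names and hyperparameter values are hypothetical.

```python
import random

def stick_breaking(gamma, K, rng):
    """Truncated stick-breaking weights for K groups (they sum to 1)."""
    weights, remaining = [], 1.0
    for _ in range(K - 1):
        b = rng.betavariate(1.0, gamma)   # fraction broken off the remaining stick
        weights.append(b * remaining)
        remaining *= (1.0 - b)
    weights.append(remaining)             # last group takes the leftover mass
    return weights

def user_membership(global_weights, eta, rng):
    """Assumed finite stand-in for a DP draw centered on the global weights:
    normalized Gamma(eta * pi_k, 1) variables, i.e. a Dirichlet draw."""
    draws = [rng.gammavariate(eta * w + 1e-9, 1.0) for w in global_weights]
    total = sum(draws)
    return [d / total for d in draws]

def generate_query(pi_u, mu, sigma, rng):
    """Steps 3.1 and 3.2: pick a group c, then draw a 1-D query feature."""
    c = rng.choices(range(len(pi_u)), weights=pi_u, k=1)[0]
    q = rng.gauss(mu[c], sigma[c])
    return c, q

rng = random.Random(0)
K = 5                                          # truncation level (assumption)
pi = stick_breaking(gamma=1.0, K=K, rng=rng)   # step 1: global group weights
mu = [rng.gauss(0.0, 3.0) for _ in range(K)]   # per-group query means
sigma = [1.0] * K                              # per-group query std. deviations
pi_u = user_membership(pi, eta=10.0, rng=rng)  # step 2: one user's membership
c, q = generate_query(pi_u, mu, sigma, rng)    # step 3: one query for that user
```

Step 3.3 (click preferences) is left out here; it would score document pairs with the chosen group's preference weights.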

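Dirichlet Process Prior

Latent User Groups

Modeling of result preferences

$(\pi_k)_{k=1}^{\infty} \sim DP(\gamma, \eta)$

[Figure: stick-breaking illustration of the Dirichlet Process Prior, with weights $\pi_1, \pi_2, \pi_3$ and the remaining mass $\pi_e$ split into $b\pi_e$ and $(1-b)\pi_e$. Users $u_1, u_2, u_3$ have membership distributions $\pi^{(u_1)}, \pi^{(u_2)}, \pi^{(u_3)}$ over Group 1, Group 2, …, Group k. Each latent group (Group 1, …, Group k, …, Group c) has a query distribution p(Q) over q1, q2, q3 and parameters $(\mu_1, \sigma_1^2, \beta_1)$, …, $(\mu_k, \sigma_k^2, \beta_k)$, …, $(\mu_c, \sigma_c^2, \beta_c)$]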

$\mu_{kt} \sim N(\mu_0, \sigma_0^2)$, $\sigma_{kt}^2 \sim Gamma(\alpha_0, \beta_0)$, $\beta_{kv} \sim N(0, a_0^2)$

Modeling of search interest: $p(q_i) \sim N(\mu_k, \sigma_k^2 I)$

Gibbs sampling for posterior inference

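$\mu_{kt} \sim N(\mu_0, \sigma_0^2)$, $\sigma_{kt}^2 \sim Gamma(\alpha_0, \beta_0)$, $\beta_{kv} \sim N(0, a_0^2)$

$(\pi_k)_{k=1}^{\infty} \sim DP(\gamma, \eta)$

$c_i \sim \pi_u$

$p(q_i) \sim N(\mu_k, \sigma_k^2 I)$

$p(D_i \mid q_i) = \prod_{y_{is} > y_{it}} \frac{1}{1 + \exp\left(-\beta_k^{\top}(d_{is} - d_{it})\right)}$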
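As a worked illustration of the pairwise click likelihood $p(D_i \mid q_i)$, the sketch below evaluates its logarithm for one query. It assumes documents are plain feature vectors, that the labels `y` are click indicators (1 for clicked, 0 for skipped), and that the group's preference weights `beta_k` are already known; the function name and toy values are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def log_click_likelihood(beta_k, docs, labels):
    """log p(D_i | q_i): sum over document pairs (s, t) with y_s > y_t of
    log sigmoid(beta_k . (d_s - d_t))."""
    total = 0.0
    for d_s, y_s in zip(docs, labels):
        for d_t, y_t in zip(docs, labels):
            if y_s > y_t:
                margin = dot(beta_k, [a - b for a, b in zip(d_s, d_t)])
                # log(1 / (1 + exp(-margin))), written to avoid overflow
                if margin >= 0:
                    total += -math.log1p(math.exp(-margin))
                else:
                    total += margin - math.log1p(math.exp(margin))
    return total

beta_k = [1.0, -0.5]                          # toy preference weights
docs = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy document features
labels = [1, 0, 0]                            # first document was clicked
ll = log_click_likelihood(beta_k, docs, labels)
```

Only pairs where one document is preferred over the other contribute, so a query with no clicks (or with every result clicked) has log-likelihood zero under this form.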

Experimental Results

• Query distribution in latent user groups

Group  Top Ranked Queries
10     today in history, nascar 2011 schedule, today history, this day in history
9      miami heat, los angeles lakers, liverpool football club, arsenal football, nfl lockout
8      los angeles lakers, arsenal football, the dark knight rises, transformers 3, manchester united
7      the titanic, the bachelorette, cars 2, hangover 2, the voice
6      tree of life, game of thrones, sonic the hedgehog, world of warcraft, mtv awards 2011
5      casey anthony trial, casey anthony jurors, casey anthony, crude oil prices, air france flight 447
4      joplin missing, apple icloud, sony hackers, google subpoena, ford transmission
3      fake tupac story, pbs hackers, alaska earthquake, southwest pilot, arizona wildfires
2      selena gomez, lady gaga, britney spears, jennifer aniston, taylor swift
1      iran, china, libya, vietnam, Syria

• Click preferences in latent user groups

[Figure: click preference weights by Feature ID (x-axis, 0 to 14) and Group ID (y-axis, 0 to 10), with a color scale from -0.8 to 0.4; annotated features: document age, query match in title, proximity in title, site authority]

Global model

• Yahoo! News search logs

• May to July, 2011

• 65 ranking features for each query-document pair

• Document ranking

$s(d_{jt}, q_j) = \frac{1}{|S|} \sum_{s \in S} \sum_{k} p\left(c^{(s)} = k \mid q_j\right) \beta_k^{(s)\top} d_{jt}$
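The ranking score can be read as an average over a set $S$ of group-weighted linear scores. The sketch below assumes that $S$ is a collection of posterior samples, each supplying the group probabilities $p(c = k \mid q_j)$ and the per-group weights $\beta_k$; that reading of $S$, the function name, and the toy numbers are assumptions made for illustration.

```python
def ranking_score(doc, samples):
    """Average over samples of sum_k p(c = k | q_j) * (beta_k . doc).

    samples: list of (group_probs, betas), where group_probs[k] is
    p(c = k | q_j) under that sample and betas[k] is group k's weight vector.
    """
    total = 0.0
    for group_probs, betas in samples:
        for p_k, beta_k in zip(group_probs, betas):
            total += p_k * sum(b * x for b, x in zip(beta_k, doc))
    return total / len(samples)

# Two toy samples over two groups and two features
samples = [
    ([0.7, 0.3], [[1.0, 0.0], [0.0, 1.0]]),
    ([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]]),
]
score = ranking_score([2.0, 4.0], samples)
```

With these toy values the first sample contributes 0.7 * 2 + 0.3 * 4 = 2.6 and the second 0.5 * 2 + 0.5 * 4 = 3.0, so the averaged score is 2.8.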

Aggregated level: information shared by all the users

Individual level: characterize user’s own interest

Method  MAP    P@1    P@3    MRR
URSVM   0.487  0.298  0.220  0.501
GRSVM   0.616  0.446  0.283  0.632
TRSVM   0.622  0.459  0.283  0.638
IRSVM   0.617  0.449  0.281  0.632
dpRank  0.642  0.485  0.290  0.658

A Ranking Model Adaptation Approach [SIGIR’13]

In this work, we propose a general ranking model adaptation framework for personalized search. The proposed framework quickly learns to apply a series of linear transformations, e.g., scaling and shifting, over the parameters of the given global ranking model such that the adapted model can better fit each individual user's search preferences. Extensive experimentation based on a large set of search logs from a major commercial Web search engine confirms the effectiveness of the proposed method compared to several state-of-the-art ranking model adaptation methods.

Methods

Timestamp           Query                                Clicks
5/29/2012 14:06:04  coney island Cincinnati
5/30/2012 12:12:04  drive direction to coney island
5/31/2012 19:40:38  motel 6 locations
5/31/2012 19:45:04  Cincinnati hotels near coney island

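• Adjust the generic ranking model’s parameters with respect to each individual user’s ranking preferences

[Figure: two plots with axes x and y, one for the global model and one for the adapted model]

Global model: $f(x) = w^{\top} x$

Adapted model: $f^u(x) = (A^u \tilde{w}^s)^{\top} x$

A full transformation matrix has $O(V^2)$ parameters; restricting it to scaling and shifting leaves $O(V)$:

$A^u = \begin{pmatrix} a^u_{g(1)} & 0 & \cdots & 0 & b^u_{g(1)} \\ 0 & a^u_{g(2)} & \cdots & 0 & b^u_{g(2)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & a^u_{g(V)} & b^u_{g(V)} \end{pmatrix}$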

• Linear regression based model adaptation

$\min_{A^u} L_{adapt}(A^u) = L(Q^u; f^u) + \lambda R(A^u)$

where $f^u(x) = (A^u \tilde{w}^s)^{\top} x$ and $\tilde{w}^s = (w^s, 1)$

Loss function from any linear learning-to-rank algorithm, e.g., RankNet, LambdaRank, RankSVM
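To make the scaling-and-shifting transformation concrete, the sketch below applies per-group parameters to a global weight vector, so that feature v receives the adapted weight a[g(v)] * w[v] + b[g(v)]. It is a minimal illustration rather than the authors' code: the feature-to-group mapping `group_of`, the parameter values, and the function names are hypothetical, and learning `a_u` and `b_u` from a user's queries is not shown.

```python
def adapt_weights(w_s, group_of, a_u, b_u):
    """Return the adapted weights A_u * (w_s, 1): each global weight w_v is
    scaled by a_u[g(v)] and shifted by b_u[g(v)], where g(v) = group_of[v]."""
    return [a_u[g] * w + b_u[g] for w, g in zip(w_s, group_of)]

def score(weights, x):
    """Linear ranking score: weights . x"""
    return sum(w * xi for w, xi in zip(weights, x))

w_s = [0.5, -1.0, 2.0]   # global model weights (toy values)
group_of = [0, 0, 1]     # feature -> group mapping g(v) (toy values)
a_u = [1.0, 0.5]         # per-group scaling for one user
b_u = [0.0, 0.1]         # per-group shifting for one user

w_u = adapt_weights(w_s, group_of, a_u, b_u)
x = [1.0, 1.0, 1.0]
global_score = score(w_s, x)
adapted_score = score(w_u, x)
```

Setting every scaling parameter to 1 and every shifting parameter to 0 recovers the global model, which is a natural starting point for users with little data.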

Complexity of adaptation

The induced optimization problem has the same complexity as the original problem

• Instantiation of RankSVM

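$\min_{w, \xi_{ijl}} \frac{1}{2}\|w\|^2 + C \sum_{q_i} \sum_{j,l} \xi_{ijl}$

s.t. $w^{\top} \Delta x_{ijl} \ge 1 - \xi_{ijl}, \ \forall q_i, x_{ij}, x_{il}$

$\xi_{ijl} \ge 0$

where $y_{ij} > y_{il}$ and $\Delta x_{ijl} = x_{ij} - x_{il}$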
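The RankSVM primal objective can be evaluated directly for a given weight vector by replacing each slack variable with its smallest feasible value, the hinge loss max(0, 1 - w · Δx). The sketch below does only that evaluation, on toy data; it is not a solver, and the function name and inputs are hypothetical.

```python
def ranksvm_objective(w, queries, C):
    """0.5 * ||w||^2 + C * sum of pairwise hinge losses.

    queries: list of (docs, labels) per query. For each pair with
    y_ij > y_il the slack is max(0, 1 - w . (x_ij - x_il)).
    """
    reg = 0.5 * sum(v * v for v in w)
    slack = 0.0
    for docs, labels in queries:
        for x_j, y_j in zip(docs, labels):
            for x_l, y_l in zip(docs, labels):
                if y_j > y_l:
                    margin = sum(wv * (a - b) for wv, a, b in zip(w, x_j, x_l))
                    slack += max(0.0, 1.0 - margin)
    return reg + C * slack

# One query with two documents; the first is preferred over the second
queries = [([[1.0, 0.0], [0.0, 1.0]], [1, 0])]
obj = ranksvm_objective([0.5, 0.0], queries, C=2.0)
```

Here the regularizer is 0.5 * 0.25 = 0.125, the single pair has margin 0.5 and slack 0.5, and the objective is 0.125 + 2.0 * 0.5 = 1.125.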

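$\max_{\alpha} \sum_{t} \left[1 - f^s(x_t)\right] \alpha_t - \frac{1}{2} \alpha^{\top} \left[K_1(\vec{x}, \vec{x}) + K_2(\vec{x}, \vec{x})\right] \alpha$

s.t. $0 \le \alpha_t \le C, \ \forall t$

where $K_1(\vec{x}_t, \vec{x}_r) = \sum_{k} \left(\sum_{g(v)=k} w^s_v \vec{x}_{tv}\right) \left(\sum_{g(v)=k} w^s_v \vec{x}_{rv}\right)$

$K_2(\vec{x}_t, \vec{x}_r) = \frac{1}{\sigma} \sum_{k} \left(\sum_{g(v)=k} \vec{x}_{tv}\right) \left(\sum_{g(v)=k} \vec{x}_{rv}\right)$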

Pairwise ranking model

Margin rescaling

Non-linear kernels

Experimental Results

                # Users  # Queries  # Documents
Annotation Set  -        49,782     2,320,711
User Set        34,827   187,484    1,744,969

• Bing query log: May 27, 2012 – May 31, 2012

• 1830 ranking features

• User-level improvement against global model

User Class  Queries  % Population  Method  ΔMAP     ΔP@1     ΔP@3     ΔMRR
Heavy       [10, ∞)  6.8           RA      0.1843   0.3309   0.0120   0.1832
                                   Cross   0.1998   0.3523   0.0182   0.1994
Medium      [5, 10)  14.9          RA      0.1102   0.2129   0.0025   0.1103
                                   Cross   0.1494   0.2561   0.0208   0.1500
Light       (0, 5)   78.3          RA      0.0042   0.0575   -0.0221  0.0041
                                   Cross   0.0403*  0.0894*  -0.0021  0.0406*

RA: per-user basis adaptation baseline

Cross: use cross-training to determine feature grouping

* Indicates p-value<0.01

• Adaptation efficiency

• Query-level improvement against global model