Query Intent Detection using Convolutional Neural...

19
Query Intent Detection using Convolutional Neural Networks Homa B. Hashemi, Amir Asiaee, Reiner Kraft QRUMS workshop - February 22, 2016

Transcript of Query Intent Detection using Convolutional Neural...

Page 1: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Query Intent Detection using

Convolutional Neural Networks

Homa B. Hashemi, Amir Asiaee, Reiner Kraft

QRUMS workshop - February 22, 2016

Page 2: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

2 Query Intent Detection using CNN

Query Intent Detection

michelle obama age Person Birth Date Query Intent Detection

Page 3: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

3 Query Intent Detection using CNN

Query Intent Detection

barack obama wife birthday

michelle obama age Person Birth Date Query Intent Detection

Page 4: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

4 Query Intent Detection using CNN

Roadmap

§  Approaches

§  Proposed Method

›  Architecture

›  Convolutional Neural Networks model

§  Experiments

›  Query Classification

›  Query Clustering

Page 5: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

5 Query Intent Detection using CNN

Approaches

Query Intent Detection

Rule-Based

Classification-Based

Feature Engineering

Deep Learning Representations

Ø  Define new rules for a new use case

Ø  Define features

Ø  Automatically embed queries into a vector space

Page 6: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Deep Learning in NLP

6

§  Deep learning achieved state-of-the-art results in computer vision and speech

§  Fast becoming popular in NLP

›  Learning word embeddings (Word2Vec)

›  Learning sentence and document embeddings

•  Semantic parsing (Yih, 2014)

•  Information retrieval (Shen 2014, Huang 2013)

•  Sentiment analysis (RNN and RNTN, Socher et al. 2011, 2012, 2013; Kalchbrenner 2014;Dos Santos 2014; Le 2014; Kim 2014)

Ø  Not much work on Short Text like queries

Ø  Not much work on multi-class classification with high number of classes

Page 7: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Model Architecture

CNN Queries

Query Vectors

New Query

Query Vectors

Classifier

Predicted Intents Classifier CNN

Train Time/Offline

Test Time/Online

7

Page 8: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Query representations with Word2Vec

1)  Get query word vectors 2)  Sum or Average vectors

0.1

2.3

4.5

0.6

7

barack obama wife barack obama wife age age

8

Page 9: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Convolutional Neural Networks

Use pre-trained word vector representations (Google word2vec)

0.1

2.3

4.5

0.6

7

barack obama wife age

Query representation

Word embedding - Query matrix Feature maps

max pooling

Multiple filters

Based on Collober and Weston (2011) and Kim (2014)

ci = f (w.x + b)

c = [c1,c2,...,cn ] c =max{c}

9

Page 10: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Data

10

# of queries 10,000 # of low-level intent classes 125 # of high-level intent classes 14

High-level intent Low-level intent

Movie

Rating

Cast

Length

Release data

Person

Spouse

Birth date

Children

Page 11: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Query intent detection with 125 low-level intent classes

11

75.06

46

39

81.26

56

45

81.57

62

47

33.00

43.00

53.00

63.00

73.00

83.00

Accuracy Avg Precision Avg Recall

Jabba

Sum w2v

Average w2v

Unigram

Unigram+bigram

CNN

Rule-based

§  10,000 queries §  125 intent classes §  10-fold CV §  Random Forest

11

Page 12: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Person and Movie entities

0.4

0.6

0.8

1

Precision Recall Precision Recall Precision Recall Precision Recall Precision Recall

Spouse BirthDate Height Children Birth Location

Rule-based Unigram CNN

0.4

0.6

0.8

1

Precision Recall Precision Recall Precision Recall Precision Recall Precision Recall

Movie Cast Release Date Rating Length Watch Online

12

Page 13: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

13

§  10,000 queries §  14 intent classes §  10-fold CV §  Random Forest

77

45

34

89.9

71

37

90.3

56

43

20

40

60

80

100

Accuracy Precision Recall

Jabba

Sum w2v

Unigram

CNN

Query intent detection with 14 high-level intent classes

Rule-based

Page 14: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Query Clustering

14

§  K-means §  K= 125 §  10,000 query vector representations

cast of annie 2014 big bang cast independence day cast and crew cast of hollow man lee rocker band members santana band members salif keita band members

lisa bonet pictures bleona qereti photos valeen montenegro images jennifer aniston wallpapers hd alma wade pics hitler images

george selvie wiki james brown biography bob toski bio

ninja turtles youtube if i stay full movie viooz watch homeland online the outlander streaming

john candy death ryan knight has died is bruce jenner dead

Page 15: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Conclusion

15

§  Use CNN to get query vector representations for ›  Query Classification

›  Query Clustering

§  Query vector representations ›  Can be used in all the tasks related to query

›  Can be combined with other methods

•  E.g. combining with rule-based outputs and N-grams to get more discriminative features

§  Future work: ›  Exploit query vectors using recurrent neural networks with LSTM cells

Page 16: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations
Page 17: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Detect Emerging intents

17

§  Advantage of CNN over n-grams: Embedding queries into a vector space §  Detect outliers as new set of queries

Page 18: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

Detect Emerging intents

18

§  Advantage of CNN over n-grams: Embedding queries into a vector space §  Detect outliers as new set of queries

Possible new query intent

Page 19: Query Intent Detection using Convolutional Neural Networkspeople.cs.pitt.edu/~hashemi/papers/QRUMS2016_slides.pdf · Convolutional Neural Networks Use pre-trained word vector representations

CNN hyperparameters

§  Window filter sizes = 2, 3, 4

§  Each filter has 100 feature maps

§  Mibi batch size = 50

§  Epochs = 40

§  Nonlinearity = relu

§  Regularization = Dropout ›  Dropout rate = 0.5

›  Randomly set some weights to zero to prevent overfitting

19