Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services
![Page 1: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/1.jpg)
Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services
Kai Wang, Zhao-Yan Ming, Xia Hu, Tat-Seng Chua
SIGIR’10
Speaker: Hsin-Lan, Wang
Date: 2011/03/07
![Page 2: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/2.jpg)
Outline
Introduction
Question Sentence Detection
  Sequential Pattern Mining
  Syntactic Shallow Pattern Mining
  Model Learning
Multi-Sentence Question Segmentation
  Building Graphs for Question Threads
  Propagating the Closeness Scores
  Segmentation-aided Retrieval
Experiment
Conclusion
![Page 3: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/3.jpg)
Introduction
cQA: Community-based Question Answering services
![Page 4: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/4.jpg)
Introduction
This paper introduces a new graph-based approach to segmenting multi-sentence questions.
Basic idea:
  Detect question sentences
  Measure the closeness score
  Model their relationships to form a graph
  Use the graph to propagate the closeness scores
  Group topically related sentences
![Page 5: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/5.jpg)
Question Sentence Detection
Human-generated content on the Web is usually informal.
Solution: use salient sequential and syntactic patterns as features to build a question detector.
![Page 6: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/6.jpg)
Question Sentence Detection
Sequential Pattern Mining
A sequential pattern is also referred to as a Labeled Sequential Pattern, S→C, where C is the class label that the sequence S is classified to.
A sequence is defined as a series of tokens from a sentence, and the class takes one of two values, {Q, NQ} (question, non-question).
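The definition above can be illustrated with a minimal sketch. It assumes that a pattern matches a sentence when its tokens appear in the same order with gaps allowed; the slides do not spell out the matching rule, so that is an assumption, and the function name `is_subsequence` and the sample pattern are hypothetical.

```python
def is_subsequence(pattern, tokens):
    """Return True if every item of `pattern` occurs in `tokens` in order.

    Gaps between matched items are allowed (assumed matching rule).
    """
    remaining = iter(tokens)
    # `item in remaining` advances the iterator, so order is enforced.
    return all(item in remaining for item in pattern)


# A labeled sequential pattern S -> C, stored as (sequence, class label).
pattern, label = ("any1", "what"), "Q"

sentence = ["does", "any1", "know", "what", "this", "is"]
matched = is_subsequence(pattern, sentence)           # tokens appear in order
reversed_match = is_subsequence(("what", "any1"), sentence)  # wrong order
```

A sentence that matches many patterns labeled Q would then be a stronger question candidate than one that matches none.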
![Page 7: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/7.jpg)
Question Sentence Detection
Sequential Pattern Mining
The purpose is to extract a set of frequent subsequences of words that are indicative of questions.
POS tags are applied to all tokens except some keywords, e.g. <any1, know, what> → <any1, VB, what>.
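The generalization step above might be sketched as follows. The keyword set and the tiny tag lookup are hypothetical stand-ins: the slides do not list the actual keywords, and a real system would call a trained POS tagger rather than a dictionary.

```python
# Hypothetical keyword list: tokens kept as-is because they signal questions.
KEYWORDS = {"any1", "what"}

# Toy stand-in for a real POS tagger (assumption, for illustration only).
TOY_POS = {"know": "VB", "is": "VBZ", "this": "DT"}


def generalize(tokens, keywords=KEYWORDS, pos_lookup=TOY_POS):
    """Replace every non-keyword token with its POS tag."""
    return [
        tok if tok in keywords else pos_lookup.get(tok, "UNK")
        for tok in tokens
    ]


generalized = generalize(["any1", "know", "what"])
```

Replacing content words with tags lets one mined pattern cover many surface variants, while the kept keywords preserve the question cue.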
![Page 8: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/8.jpg)
Question Sentence Detection
Syntactic Shallow Pattern Mining
![Page 9: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/9.jpg)
Question Sentence Detection
Model Learning
Patterns can be mined from questions, but it is unnatural to identify characteristics for non-questions.
Solution: one-class SVM
Training data: all questions ending with question marks are assumed to form an initial set of positive examples.
![Page 10: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/10.jpg)
Multi-Sentence Question Segmentation
Building Graphs for Question Threads
Vq: question sentence vertex set
Vc: context sentence vertex set
Model the question thread as a weighted graph (V, E).
![Page 11: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/11.jpg)
Multi-Sentence Question Segmentation
Building Graphs for Question Threads
Directed edge (u→v):
KL-divergence
Coherence
Coreference
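As an illustration of the first directed-edge signal, here is a minimal sketch of KL-divergence between unigram language models of two sentences. The add-one smoothing, the shared vocabulary, and the sample sentences are assumptions; the slides do not show how the paper estimates the models or turns the divergence into an edge weight.

```python
import math
from collections import Counter


def unigram_model(tokens, vocab, alpha=1.0):
    """Smoothed unigram distribution over `vocab` (add-alpha, assumed)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


u = "my laptop screen is broken".split()
v = "how can i fix the screen".split()
vocab = set(u) | set(v)

p_u = unigram_model(u, vocab)
p_v = unigram_model(v, vocab)

kl_uv = kl_divergence(p_u, p_v)
kl_vu = kl_divergence(p_v, p_u)
```

Because KL-divergence is not symmetric, `kl_uv` and `kl_vu` generally differ, which is consistent with using it on a directed edge.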
![Page 12: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/12.jpg)
Multi-Sentence Question Segmentation
Building Graphs for Question Threads
Undirected edge (u-v):
Cosine Similarity
Distance: proportional to the number of sentences between u and v.
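The two undirected signals above can be sketched in a few lines. This assumes raw term-frequency vectors for the cosine similarity and sentence positions given as indices within the thread; the function names are hypothetical, and how the paper weights or combines these values is not shown in the slides.

```python
import math
from collections import Counter


def cosine_similarity(tokens_u, tokens_v):
    """Cosine similarity between term-frequency vectors of two sentences."""
    a, b = Counter(tokens_u), Counter(tokens_v)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(
        sum(c * c for c in b.values())
    )
    return dot / norm if norm else 0.0


def sentences_between(i, j):
    """Number of sentences lying strictly between positions i and j."""
    return max(abs(i - j) - 1, 0)


same = cosine_similarity(["fix", "screen"], ["fix", "screen"])
disjoint = cosine_similarity(["fix", "screen"], ["buy", "phone"])
gap = sentences_between(0, 3)
```

Both measures are symmetric in u and v, which is why they fit an undirected edge.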
![Page 13: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/13.jpg)
Multi-Sentence Question Segmentation
Building Graphs for Question Threads
Undirected edge (u-v):
Coherence
Coreference
![Page 14: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/14.jpg)
Multi-Sentence Question Segmentation
Propagating the Closeness Scores
![Page 15: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/15.jpg)
Multi-Sentence Question Segmentation
Propagating the Closeness Scores
Sort the edges in Er by closeness score: <e1, e2, …, en>
The extraction process terminates at em when one of the following criteria is met:
![Page 16: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/16.jpg)
Multi-Sentence Question Segmentation
Propagating the Closeness Scores
Example: final edge set {(q1,c1), (q2,c2), (q1,c2)}
question segments (q1 – c1, c2), (q2 – c2)
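The example above can be reproduced with a short sketch that groups the final edge set by question vertex. The function name `build_segments` is hypothetical; the edge selection itself (sorting and termination) is taken as already done.

```python
from collections import defaultdict


def build_segments(edges):
    """Group context sentences under each question they are linked to."""
    segments = defaultdict(list)
    for question, context in edges:
        if context not in segments[question]:
            segments[question].append(context)
    return dict(segments)


final_edges = [("q1", "c1"), ("q2", "c2"), ("q1", "c2")]
segments = build_segments(final_edges)
```

Here `segments` maps q1 to c1 and c2, and q2 to c2, so a context sentence such as c2 can belong to more than one question segment.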
![Page 17: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/17.jpg)
Multi-Sentence Question Segmentation
Segmentation-aided Retrieval
![Page 18: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/18.jpg)
Experiments
Evaluation of Question Detection
Dataset: collected by issuing getByCategory API queries to Yahoo! Answers.
Three datasets are generated:
  Pattern Mining Set: 350k sentences extracted from 60k question threads.
  Training Set: 130k sentences from another 60k question threads.
  Testing Set: two annotators were asked to tag 2004 question sentences and 2039 non-question sentences.
![Page 19: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/19.jpg)
Experiments
Evaluation of Question Detection
![Page 20: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/20.jpg)
Experiments
Direct Assessment of Multi-Sentence Question Segmentation via User Study
![Page 21: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/21.jpg)
Experiments
Performance Evaluation on Question Retrieval with Segmentation Model
![Page 22: Segmentation of Multi-Sentence Questions: Towards Effective Question Retrieval in cQA Services](https://reader036.fdocuments.in/reader036/viewer/2022081520/568159d1550346895dc720cf/html5/thumbnails/22.jpg)
Conclusion
The paper presents a new approach for segmenting multi-sentence questions.
The approach separates question sentences from non-question sentences and aligns them according to their closeness scores.