DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task
DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task 1 / 52
DCU at the NTCIR-9 SpokenDoc Passage Retrieval Task
Maria Eskevich, Gareth J.F. Jones
Centre for Digital Video Processing
Centre for Next Generation Localisation
School of Computing, Dublin City University
Ireland
December 8, 2011
Outline
- Retrieval Methodology: Transcript Preprocessing, Text Segmentation, Retrieval Setup
- Results: Official Metrics, uMAP, pwMAP, fMAP
Conclusions
Retrieval Methodology
Transcript --(Segmentation)--> Topically Coherent Segments --(Indexing)--> Indexed Segments
Information Request --> Queries
Queries + Indexed Segments --(Retrieval)--> Retrieval Results
6 Retrieval Runs
Three transcript types, each segmented with two algorithms, give six runs:

Transcript types: 1-best ASR; 1-best ASR with stop words removed (NSW); Manual Transcript
Segmentation algorithms: C99; TextTiling (TT)

Runs: ASR C99, ASR TT, ASR NSW C99, ASR NSW TT, Manual C99, Manual TT
Transcript Preprocessing
- Recognize individual morphemes of the sentences: ChaSen 2.4.0, based on the Japanese morphological analyzer JUMAN 2.0 with ipadic grammar 2.7.0
- Form the text out of the base forms of the words in order to avoid stemming
- Remove the stop words (SpeedBlog Japanese Stop-words) for one of the runs
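The steps above can be sketched as follows. This is a minimal illustration, assuming a morphological analyzer (ChaSen/JUMAN in the original work) has already produced (surface form, base form) pairs; the morphemes and stop list in the example are hypothetical placeholders, not data from the task.

```python
def preprocess(morphemes, stop_words=None):
    """Rebuild text from base forms (avoiding stemming), optionally
    dropping stop words before further processing."""
    base_forms = [base for _surface, base in morphemes]
    if stop_words is not None:
        base_forms = [b for b in base_forms if b not in stop_words]
    return " ".join(base_forms)

# Hypothetical analyzer output: (surface form, base form)
morphemes = [("走った", "走る"), ("の", "の"), ("です", "です")]
print(preprocess(morphemes))                            # 走る の です
print(preprocess(morphemes, stop_words={"の", "です"}))  # 走る
```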
Text Segmentation
Algorithms originally developed for written text are applied; individual IPUs (inter-pausal units) are treated as sentences.

- TextTiling: cosine similarities between adjacent blocks of sentences
- C99: compute similarity between sentences using a cosine similarity measure to form a similarity matrix; cosine scores are replaced by the rank of the score in the local region; segmentation points are assigned using a clustering procedure
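The TextTiling idea of comparing adjacent blocks of sentences can be sketched as below. This is a simplified illustration of the similarity computation only (no smoothing or depth scoring, which the full algorithm uses); the example sentences are invented.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def boundary_scores(sentences, block_size=2):
    """Cosine similarity between adjacent blocks of sentences;
    low scores suggest topic boundaries (TextTiling-style)."""
    scores = []
    for gap in range(block_size, len(sentences) - block_size + 1):
        left = Counter(w for s in sentences[gap - block_size:gap] for w in s)
        right = Counter(w for s in sentences[gap:gap + block_size] for w in s)
        scores.append(cosine(left, right))
    return scores

# Invented sentences: topic shifts from pets to finance after sentence 3
sentences = [["cat", "pet"], ["cat", "fur"], ["cat", "dog"],
             ["stock", "market"], ["market", "price"]]
print(boundary_scores(sentences))  # lowest score at the pets/finance gap
```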
Retrieval Setup

The SMART information retrieval system, extended to use language modelling with a uniform document prior probability.

A query q is scored against a document d within the SMART framework in the following way:

P(q|d) = \prod_{i=1}^{n} \left( \lambda_i P(q_i|d) + (1 - \lambda_i) P(q_i) \right)

where
- q = (q_1, ..., q_n) is a query comprising n query terms,
- P(q_i|d) is the probability of generating the i-th query term from a given document d, estimated by maximum likelihood,
- P(q_i) is the probability of generating it from the collection, estimated by document frequency.

The retrieval model used \lambda_i = 0.3 for all q_i, this value being optimized on the TREC-8 ad hoc dataset.
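A minimal sketch of this mixture scoring follows. It computes the score in log space to avoid floating-point underflow (the formula itself is a product); the toy documents and query are illustrative, not from the task collection.

```python
import math
from collections import Counter

def lm_score(query_terms, doc_terms, collection_docs, lam=0.3):
    """log P(q|d) = sum_i log(lam * P(q_i|d) + (1 - lam) * P(q_i)),
    with P(q_i|d) by maximum likelihood over the document and
    P(q_i) estimated by document frequency over the collection."""
    tf = Counter(doc_terms)
    dlen = len(doc_terms)
    n_docs = len(collection_docs)
    log_p = 0.0
    for t in query_terms:
        p_doc = tf[t] / dlen                         # maximum-likelihood estimate
        df = sum(1 for d in collection_docs if t in d)
        p_coll = df / n_docs                         # document-frequency estimate
        log_p += math.log(lam * p_doc + (1 - lam) * p_coll)
    return log_p

docs = [["speech", "retrieval", "task"],
        ["speech", "recognition"],
        ["text", "segmentation"]]
query = ["speech", "retrieval"]
ranked = sorted(range(len(docs)),
                key=lambda i: lm_score(query, docs[i], docs), reverse=True)
print(ranked)  # document 0 matches both query terms and ranks first
```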
Results: Official Metrics
Transcript type | Segmentation type | uMAP   | pwMAP  | fMAP
BASELINE        |                   | 0.0670 | 0.0520 | 0.0536
manual          | tt                | 0.0859 | 0.0429 | 0.0500
manual          | C99               | 0.0713 | 0.0209 | 0.0168
ASR             | tt                | 0.0490 | 0.0329 | 0.0308
ASR             | C99               | 0.0469 | 0.0166 | 0.0123
ASR nsw         | tt                | 0.0312 | 0.0141 | 0.0174
ASR nsw         | C99               | 0.0316 | 0.0138 | 0.0120

- Only runs on the manual transcript scored higher than the baseline, and only on the uMAP metric
- TextTiling results are consistently higher than C99 on all metrics for the manual and ASR runs
Time-based Results Assessment Approach

For each run and each query, every retrieved passage containing relevant content is assigned a precision score:

Precision = Length of the Relevant Part / Length of the Whole Passage

The Average of Precision per query is then computed over the ranks whose passages contain relevant content.
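This time-based precision can be sketched as an interval-overlap computation, with lengths measured in seconds of speech. The passage boundaries and relevant spans in the example are hypothetical.

```python
def passage_precision(passage_start, passage_end, relevant_spans):
    """Precision = length of the relevant part / length of the whole passage,
    where the relevant part is the overlap between the passage and the
    (non-overlapping) relevant time spans, all in seconds."""
    relevant = 0.0
    for r_start, r_end in relevant_spans:
        overlap = min(passage_end, r_end) - max(passage_start, r_start)
        relevant += max(0.0, overlap)
    return relevant / (passage_end - passage_start)

# Hypothetical passage from 10.0s to 30.0s; relevant content at 15.0s-25.0s
print(passage_precision(10.0, 30.0, [(15.0, 25.0)]))  # 0.5
```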
Average of Precision for all passages with relevant content

- The TextTiling algorithm has a higher average of precision for all types of transcript, i.e. topically coherent segments are better located
Results: utterance-based MAP (uMAP)
Transcript type | Segmentation type | uMAP
BASELINE        |                   | 0.0670
manual          | tt                | 0.0859
manual          | C99               | 0.0713
ASR             | tt                | 0.0490
ASR             | C99               | 0.0469
ASR nsw         | tt                | 0.0312
ASR nsw         | C99               | 0.0316

- The trend 'manual > ASR > ASR nsw' for both C99 and TextTiling is not borne out by the averages of precision
- The higher average precision of TextTiling segmentation over C99 is not reflected in the uMAP scores
- For some queries, runs on C99 segmentation rank the segments with relevant content better
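The ranking effect noted above matters because MAP-style metrics reward placing relevant items early in the ranked list. A simplified sketch of such an aggregation (standard average precision over binary relevance, assuming all relevant items are retrieved; the task's uMAP operates at the utterance level, which this does not reproduce):

```python
def average_precision(ranked_relevance):
    """Mean of precision@k at the ranks where a relevant item appears.
    ranked_relevance is a list of 0/1 judgments in rank order."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0

def mean_average_precision(per_query_relevance):
    """MAP: mean of per-query average precision."""
    return sum(average_precision(r) for r in per_query_relevance) / len(per_query_relevance)

# Two hypothetical queries' ranked relevance judgments
print(mean_average_precision([[1, 0, 1], [0, 1]]))
```

The same set of relevant segments scores higher when they appear at earlier ranks, which is why a run can locate segments as well as another yet score lower.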
Relevance of the Central IPU Assessment

Two measurements: the number of ranks taken or not taken into account by pwMAP, and the Average of Precision for the passages at ranks that are taken or not taken into account by pwMAP.

- TextTiling has higher numbers of segments whose central IPU is relevant to the query
- Overall, the number of ranks at which a segment with relevant content is retrieved is approximately the same for both segmentation techniques
Results: pointwise MAP (pwMAP)
Transcript type | Segmentation type | pwMAP
BASELINE        |                   | 0.0520
manual          | tt                | 0.0429
manual          | C99               | 0.0209
ASR             | tt                | 0.0329
ASR             | C99               | 0.0166
ASR nsw         | tt                | 0.0141
ASR nsw         | C99               | 0.0138

- TextTiling segmentation places better topic boundaries around relevant content and has higher precision scores for the retrieved relevant passages
Average Length of Relevant Part and Segments (in seconds)

Two cases are compared: the center IPU is relevant vs. the center IPU is not relevant.

- Center IPU is relevant: the average length of the relevant content is of the same order for both segmentation schemes, slightly higher for TextTiling
- Center IPU is not relevant: the average length of the relevant content is higher for C99 segmentation; due to the poor segmentation it correlates with much longer segments
Results: fraction MAP (fMAP)
Transcript type | Segmentation type | fMAP
BASELINE        |                   | 0.0536
manual          | tt                | 0.0500
manual          | C99               | 0.0168
ASR             | tt                | 0.0308
ASR             | C99               | 0.0123
ASR nsw         | tt                | 0.0174
ASR nsw         | C99               | 0.0120

- The average number of ranks with segments having a non-relevant center IPU is more than 5 times higher
- The segmentation technique with longer, poorly segmented passages (C99) has much lower precision-based scores
Conclusions
- TextTiling segmentation shows better overall retrieval performance than C99:
  - higher numbers of segments with higher precision
  - higher precision even for the segments with a non-relevant center IPU
  - the high level of poor segmentation makes it harder to retrieve relevant content for C99 runs
- Removal of stop words before segmentation did not have any positive effect on the results
Thank you for your attention!
Questions?