CLEF 2008 Workshop, Aarhus, September 17, 2008
Overview of QAST 2008
- Question Answering on Speech Transcriptions -
J. Turmo, P. Comas (1), L. Lamel, S. Rosset (2), N. Moreau, D. Mostefa (3)
(1) UPC, Spain (2) LIMSI, France (3) ELDA, France
QAST Website : http://www.lsi.upc.edu/~qast/
Outline
1. Objectives
2. Description of the tasks
3. Participants
4. Results
5. Future work
Objectives of QAST 2008
- Development of robust QA for speech transcripts
- Measure loss due to ASR inaccuracies: manual transcriptions, automatic transcriptions
- Measure loss at different ASR word error rates
- Test with different kinds of speech: spontaneous speech, prepared speech
- Development of QA for languages other than English: English, French, Spanish
QAST 2008 Organization
Task jointly organized by:
- UPC, Spain (Coordinator): J. Turmo, P. Comas
- ELDA, France: N. Moreau, D. Mostefa
- LIMSI-CNRS, France: S. Rosset, L. Lamel
Evaluation Data
Corpus            Lang.    Description                          Tasks                          WER
CHIL (QAST 2007)  English  Lectures (~25h)                      T1(a): Manual transcriptions   -
                                                                T1(b): ASR transcriptions      20%
AMI (QAST 2007)   English  Meetings (~100h)                     T2(a): Manual transcriptions   -
                                                                T2(b): ASR transcriptions      38%
ESTER             French   Broadcast News (~10h)                T3(a): Manual transcriptions   -
                                                                T3(b): ASR transcriptions      11.9% / 23.9% / 35.4%
EPPS-EN           English  European Parliament sessions (~3h)   T4(a): Manual transcriptions   -
                                                                T4(b): ASR transcriptions      10.6% / 14.0% / 24.1%
EPPS-ES           Spanish  European Parliament sessions (~3h)   T5(a): Manual transcriptions   -
                                                                T5(b): ASR transcriptions      11.5% / 12.7% / 13.7%
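The WER column can be read with the usual definition of word error rate: the word-level edit distance (substitutions, deletions, insertions) between the reference transcription and the ASR output, divided by the number of reference words. Below is a minimal Python sketch of that definition, assuming whitespace tokenization. The function name and the example sentences are illustrative and are not taken from the QAST scoring tools.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one reference word ("in") is missing from the ASR output.
wer = word_error_rate(
    "the european parliament met in strasbourg",
    "the european parliament met strasbourg",
)
print(f"WER = {wer:.1%}")
```

Because insertions count as errors, this ratio can exceed 100% when the hypothesis is much longer than the reference.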
Questions
                     Development set            Evaluation set
Task                 Data          # questions  Data          # questions
T1 (CHIL, English)   10 seminars   50           15 seminars   100
T2 (AMI, English)    50 meetings   50           118 meetings  100
T3 (ESTER, French)   6 shows       50           12 shows      100
T4 (EPPS, English)   3 sessions    50           3 sessions    100
T5 (EPPS, Spanish)   1 session     50           5 sessions    100
• Factual questions: ~75%
  Expected answers = named entities (10 types: person, location, organization, language, system, measure, time, color, shape, material)
• Definition questions: ~25%
  4 types of answers: person, organization, object, other
• ‘NIL’ questions: ~10%
Submissions
• Participants could submit up to:
  – 2 submissions per task (and per WER)
  – 5 answers per question
• Answers for ‘manual transcriptions’ tasks: Answer_string + Doc_ID
• Answers for ‘automatic transcriptions’ tasks: Answer_string + Doc_ID + Time_start + Time_end
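The answer formats and the 5-answer limit can be pictured as a small record plus a check. This is only a sketch: the class, field names, and time unit (seconds) are assumptions made for illustration, not the official QAST submission format.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_ANSWERS_PER_QUESTION = 5  # limit stated on the slide


@dataclass
class Answer:
    """One returned answer; field names are hypothetical."""
    answer_string: str
    doc_id: str
    time_start: Optional[float] = None  # assumed to be seconds; ASR tasks only
    time_end: Optional[float] = None    # assumed to be seconds; ASR tasks only


def validate_answers(answers: List[Answer], automatic: bool) -> List[str]:
    """Return a list of problems found in the answers to one question."""
    problems = []
    if len(answers) > MAX_ANSWERS_PER_QUESTION:
        problems.append(
            f"{len(answers)} answers given, at most {MAX_ANSWERS_PER_QUESTION} allowed"
        )
    for position, answer in enumerate(answers, start=1):
        has_times = answer.time_start is not None and answer.time_end is not None
        if automatic and not has_times:
            problems.append(f"answer {position}: Time_start/Time_end missing")
        if has_times and answer.time_end < answer.time_start:
            problems.append(f"answer {position}: Time_end precedes Time_start")
    return problems


# Hypothetical answers to one question of an 'automatic transcriptions' task.
asr_answers = [
    Answer("Strasbourg", "doc_001", time_start=12.4, time_end=13.1),
    Answer("Brussels", "doc_002"),  # no time span: flagged for ASR tasks
]
asr_problems = validate_answers(asr_answers, automatic=True)
print(asr_problems)
```

The same answers with `automatic=False` pass the check, since the ‘manual transcriptions’ tasks only require Answer_string and Doc_ID.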
Assessments
• Four possible judgments (as in QA@CLEF): Correct / Incorrect / Inexact / Unsupported
• ‘Manual transcriptions’ tasks: manual assessment with the QASTLE interface
• ‘Automatic transcriptions’ tasks: automatic assessment (script) + manual check
• 2 metrics:
  – Mean Reciprocal Rank (MRR): measures how well right answers are ranked on average
  – Accuracy: fraction of questions whose first-ranked answer is correct
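The two metrics can be sketched in a few lines of Python. The sketch assumes that each question comes with a ranked list of judged answers and that only a "Correct" judgment counts as right (Inexact and Unsupported are treated as wrong); the function names and the sample judgments are illustrative, not the organizers' scoring script.

```python
from typing import List, Sequence


def reciprocal_rank(judgments: Sequence[bool]) -> float:
    """Return 1/rank of the first correct answer, or 0.0 if none is correct."""
    for rank, is_correct in enumerate(judgments, start=1):
        if is_correct:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(questions: List[Sequence[bool]]) -> float:
    """Average the reciprocal rank over all questions."""
    return sum(reciprocal_rank(j) for j in questions) / len(questions)


def accuracy(questions: List[Sequence[bool]]) -> float:
    """Fraction of questions whose first-ranked answer is correct."""
    return sum(1 for j in questions if j and j[0]) / len(questions)


# Hypothetical judgments for four questions (True = judged Correct),
# each list ordered by the rank the system assigned.
judged = [
    [True, False, False],                # correct at rank 1: RR = 1
    [False, True],                       # correct at rank 2: RR = 1/2
    [False, False, False, False, True],  # correct at rank 5: RR = 1/5
    [False],                             # no correct answer: RR = 0
]
mrr = mean_reciprocal_rank(judged)  # (1 + 0.5 + 0.2 + 0) / 4
acc = accuracy(judged)              # 1 of 4 questions correct at rank 1
print(f"MRR = {mrr:.3f}, accuracy = {acc:.1%}")
```

Since a correct first answer contributes 1 to the reciprocal-rank sum, MRR is never lower than accuracy expressed as a fraction.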
Participants
49 submissions from 5 participants:
Participant            T1a  T1b  T2a  T2b  T3a  T3b  T4a  T4b  T5a  T5b
Univ. Chemnitz (CUT)   2    -    -    -    -    -    2    -    -    -
INAOE                  -    -    -    -    -    -    1    2    -    -
LIMSI                  1    1    1    1    2    3    1    3    2    3
Univ. Alicante (UA)    -    -    -    -    -    -    1    3    -    -
UPC                    1    2    1    2    -    -    1    6    1    6
TOTAL                  4    3    2    3    2    3    6    14   3    9
Best results for manual transcriptions
        Factual          Definitional     All
Task    MRR   Acc(%)     MRR   Acc(%)     MRR   Acc(%)
T1a     0.53  47.4       0.18  18.2       0.45  41.0
T2a     0.47  37.8       0.22  19.2       0.40  33.0
T3a     0.50  45.3       0.47  44.0       0.49  45.0
T4a     0.44  40.0       0.16  16.0       0.37  34.0
T5a     0.32  29.3       0.44  36.0       0.35  31.0
Best results for ASR transcriptions
                All              All (manual)
Task   WER      MRR   Acc(%)     MRR   Acc(%)
T1b    20.0%    0.34  31.0       0.45  41.0
T2b    38.0%    0.20  18.0       0.40  33.0
T3b    11.9%    0.45  41.0       0.49  45.0
       23.9%    0.30  25.0
       35.4%    0.24  21.0
T4b    10.6%    0.33  30.0       0.37  34.0
       14.0%    0.24  20.0
       24.1%    0.23  19.0
T5b    11.5%    0.26  24.0       0.35  31.0
       12.7%    0.23  20.0
       13.7%    0.25  23.0
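The loss due to ASR can be made explicit by subtracting each ASR accuracy from the manual-transcription accuracy of the same task. The sketch below uses the best overall accuracies from this slide, assuming the ASR figures follow the order in which the tasks and WER levels are listed; the variable and function names are illustrative.

```python
# Best overall accuracy (%) on manual transcriptions, per task.
MANUAL_ACC = {"T1": 41.0, "T2": 33.0, "T3": 45.0, "T4": 34.0, "T5": 31.0}

# Best overall accuracy (%) on ASR transcriptions, keyed by (task, WER in %).
# Assumption: values are paired with WER levels in the order listed on the slide.
ASR_ACC = {
    ("T1", 20.0): 31.0,
    ("T2", 38.0): 18.0,
    ("T3", 11.9): 41.0, ("T3", 23.9): 25.0, ("T3", 35.4): 21.0,
    ("T4", 10.6): 30.0, ("T4", 14.0): 20.0, ("T4", 24.1): 19.0,
    ("T5", 11.5): 24.0, ("T5", 12.7): 20.0, ("T5", 13.7): 23.0,
}


def accuracy_loss(manual, asr):
    """Map (task, WER) to (absolute loss in points, relative loss in %)."""
    losses = {}
    for (task, wer), acc in asr.items():
        absolute = manual[task] - acc
        relative = 100.0 * absolute / manual[task]
        losses[(task, wer)] = (absolute, relative)
    return losses


LOSSES = accuracy_loss(MANUAL_ACC, ASR_ACC)
for (task, wer), (absolute, relative) in sorted(LOSSES.items()):
    print(f"{task}b  WER {wer:4.1f}%  loss {absolute:4.1f} points ({relative:4.1f}% relative)")
```

Under this reading, the loss grows with WER for T3b and T4b, while the three T5b levels (24.0, 20.0, 23.0) lie close together and do not decrease strictly. These are best-run figures, so the differences may also reflect which systems were submitted at each WER level.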
Conclusion
• 5 participants (as in 2007)
• 4 different countries (vs. 5 in 2007): Germany, Spain, France, Mexico
• 49 submitted runs (vs. 28 runs in 2007)
• Loss in accuracy with ASR transcribed speech (performance falls when WER rises)
• QAST 2009: Written & Oral Questions...