transLectures @Internet of Education 2013

Internet of Education 2013, 12 November 2013. Gonçal Garcés Díaz-Munío, Universitat Politècnica de València. EC FP7 ICT project #287755.

description

An overview of transLectures, plus results at the end of the second year of the project. Internet of Education 2013, 11-12 Nov 2013, Ljubljana, Slovenia. Video of the presentation available online at: http://videolectures.net/internetofeducation2013_diaz_munio_translectures/

Transcript of transLectures @Internet of Education 2013

Page 1: transLectures @Internet of Education 2013

Internet of Education 2013

12 November 2013

Gonçal Garcés Díaz-Munío

Universitat Politècnica de València

EC FP7 ICT project #287755

Page 2: transLectures @Internet of Education 2013

Outline

• transLectures (tL): motivation and goals

• Video demo

• tL technologies

• Progress and results

  • Scientific evaluations

  • User evaluations

  • Quality control (expert evaluations)

• Implementation and integration

• tL open source tools

Page 3: transLectures @Internet of Education 2013

Motivation

• Video lecture repositories and MOOCs

  • Thousands of hours of video lectures available

  • Hundreds of hours of video lectures recorded every week

• Most video lectures only available in their original language

  • No subtitles

Page 5: transLectures @Internet of Education 2013

Motivation

• Transcriptions and translations are needed

  • Accessibility for people with disabilities

  • Accessibility for speakers of different languages

  • Search and analysis functions

  • Automated topic finding

  • …

• How do we get there?

Page 6: transLectures @Internet of Education 2013

The transLectures approach

1. Automatic Speech Recognition (ASR) and Machine Translation (MT)

  • Adaptation: taking advantage of the characteristics of video lecture repositories

  • High-quality automatic transcriptions and translations

2. Interactive postediting: intelligent interaction for reduced effort

Page 7: transLectures @Internet of Education 2013

Goals

• Massive adaptation

• Intelligent interaction

• Implementation

  • Case studies: Videolectures.NET & Polimedia

  • Real-life evaluation

  • Integration into Opencast Matterhorn

http://opencast.org/matterhorn/

Page 8: transLectures @Internet of Education 2013

The transLectures partners

No.  Name                                  Country
1    Universitat Politècnica de València   Spain
2    Xerox SAS                             France
3    Institut Jožef Stefan                 Slovenia
3+   Knowledge for All Foundation          UK
4    RWTH Aachen University                Germany
5    EML – European Media Laboratory       Germany
6    DDS – Deluxe Digital Studios          UK

Page 9: transLectures @Internet of Education 2013

Languages

• Transcription (ASR)

  • EN

  • SL

  • ES

• Translation (MT)

  • EN>SL, SL>EN

  • EN>ES, ES>EN

  • EN>FR

  • EN>DE

Page 10: transLectures @Internet of Education 2013

transLectures: video demo

http://www.translectures.eu/new-demo-video/

Page 11: transLectures @Internet of Education 2013

Massive adaptation

• Characteristics of video lectures

  • Just one person

  • Known speaker

  • Clear talking

  • No interruptions

  • Focused on a topic

  • Slides

Page 12: transLectures @Internet of Education 2013

Massive adaptation

• Known speaker and topic

• Slides

• Related documents

Page 13: transLectures @Internet of Education 2013

Scientific evaluations (Y2)

• Transcription results

  • WER: Word Error Rate (%)

  • Goal: WER < 25%

  • EN, SL, ES

[Chart: transcription WER (%) for EN, SL and ES; lower is better]
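
The slide names WER without spelling out how it is computed. As a minimal sketch, assuming the standard definition (word-level edit distance between the reference and the automatic transcription, divided by the number of reference words), it could be calculated as below. The function name `wer` and the whitespace tokenisation are illustrative choices, not taken from the project's tools.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate in percent, assuming whitespace-separated words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1        # reference word missing in hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, the stated goal of WER < 25% corresponds to fewer than one word error per four reference words.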

Page 14: transLectures @Internet of Education 2013

Scientific evaluations (Y2)

• Translation results

  • BLEU

  • Goal: BLEU > 30

  • EN>SL, SL>EN

  • EN>ES, ES>EN

  • EN>FR

  • EN>DE

[Chart: translation BLEU for the language pairs above; higher is better]
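
BLEU is used on this slide as the translation quality score. The sketch below assumes the usual corpus-level definition (geometric mean of clipped 1- to 4-gram precisions times a brevity penalty), with one reference per sentence, no smoothing, and whitespace tokenisation. The names `ngrams` and `corpus_bleu` are illustrative, and this is not the evaluation script used in the project.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (single reference, unsmoothed)."""
    matches = [0] * max_n
    totals = [0] * max_n
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_toks, hyp_toks = ref.split(), hyp.split()
        ref_len += len(ref_toks)
        hyp_len += len(hyp_toks)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_toks, n)
            ref_counts = ngrams(ref_toks, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives BLEU 0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the"]))
```

Scores from a simplified sketch like this are not directly comparable with published figures, which depend on tokenisation and the exact BLEU variant.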

Page 15: transLectures @Internet of Education 2013

Y1 results and comparison

Page 16: transLectures @Internet of Education 2013

Y1 results and comparison

Page 17: transLectures @Internet of Education 2013

Y1 results and comparison

Page 18: transLectures @Internet of Education 2013

Intelligent interaction

• Postediting automatic transcriptions/translations

  • The user invests the least possible effort

  • The system learns the most from it

• Confidence measures

• Fast constrained search
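
One way to picture how confidence measures can reduce user effort is sketched below: given a per-word confidence score, only the least confident words, up to a fixed fraction of the transcription, are shown to the user for correction. This is an illustrative assumption about the general idea, not the transLectures implementation. The function `words_to_supervise`, the `budget` parameter and the example scores are all hypothetical.

```python
from typing import List, Tuple


def words_to_supervise(words: List[Tuple[str, float]], budget: float) -> List[int]:
    """Return positions of the least confident words.

    `words` holds (word, confidence) pairs with confidence in [0, 1];
    `budget` is the fraction of words the user is willing to review.
    """
    if not 0.0 <= budget <= 1.0:
        raise ValueError("budget must be between 0 and 1")
    limit = int(len(words) * budget)
    # Rank positions from least to most confident, keep the first `limit`.
    ranked = sorted(range(len(words)), key=lambda i: words[i][1])
    return sorted(ranked[:limit])


if __name__ == "__main__":
    transcription = [("video", 0.97), ("lectures", 0.42), ("are", 0.93), ("recorded", 0.30)]
    for position in words_to_supervise(transcription, budget=0.5):
        print(position, transcription[position][0])
```

In a full system, the user's corrections would then constrain a new recognition pass over the remaining words; that step is outside the scope of this sketch.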

Page 19: transLectures @Internet of Education 2013

Intelligent interaction

Page 20: transLectures @Internet of Education 2013

Intelligent interaction

Page 21: transLectures @Internet of Education 2013

Intelligent interaction

• Batch interaction vs. Intelligent interaction

Page 22: transLectures @Internet of Education 2013

Intelligent interaction

• User evaluations

  • UPV (Polimedia)

  • JSI (Videolectures.NET)

Page 23: transLectures @Internet of Education 2013

User evaluations

• User evaluations at UPV

  • Users: lecturers

  • Revising their own lectures

  • 3 different experiments

    1. Complete supervision

    2. Intelligent interaction

    3. Two-round supervision

Page 24: transLectures @Internet of Education 2013

User evaluations

1. Complete supervision

Page 25: transLectures @Internet of Education 2013

User evaluations

2. Intelligent interaction

Page 26: transLectures @Internet of Education 2013

User evaluations

3. Two-round supervision

Page 27: transLectures @Internet of Education 2013

User evaluations

• User evaluations at UPV: results

Page 28: transLectures @Internet of Education 2013

User evaluations

• User evaluations at UPV: results

Page 29: transLectures @Internet of Education 2013

User evaluations

• User evaluations at UPV: results

Page 31: transLectures @Internet of Education 2013

Quality control: expert evaluations

• Transcription quality (EN, ES, SL)

  • UPV: Representative set of Spanish transcriptions

  • Avg. WER: 23.2; Avg. RTF: 3.8

• Translation quality (EN<>SL, EN<>ES, EN>FR, EN>DE)

  • UPV: Representative set of Spanish>English translations

  • Avg. BLEU: 46.6; Avg. RTF: 14.8; Avg. Score: 3.6 out of 5
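
The slides do not define RTF. Assuming it is the usual real-time factor, that is, time spent on a task divided by the duration of the media, and that the task here is the expert's review of a lecture, the figure could be computed as in this sketch. The function name and the example durations are hypothetical.

```python
def real_time_factor(task_seconds: float, media_seconds: float) -> float:
    """Time spent on the task divided by the duration of the media."""
    if media_seconds <= 0:
        raise ValueError("media duration must be positive")
    return task_seconds / media_seconds


if __name__ == "__main__":
    # Hypothetical case: 38 minutes of review for a 10-minute lecture.
    print(real_time_factor(38 * 60, 10 * 60))
```

Under that reading, an average RTF of 3.8 would mean about 38 minutes of review per 10 minutes of video, and 14.8 would mean about 148 minutes.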

Page 32: transLectures @Internet of Education 2013

Implementation and integration

• Videolectures.NET

• Polimedia

• Opencast Matterhorn

Page 33: transLectures @Internet of Education 2013

• Polimedia

Page 34: transLectures @Internet of Education 2013

transLectures: Open source tools

• The tL player (& editor)

  • Coming soon (www.translectures.eu)

• The transLectures-UPV Toolkit (TLK) for ASR

  • www.translectures.eu/tlk

• RWTH Aachen: rASR, Jane (MT)

  • http://www-i6.informatik.rwth-aachen.de/web/Software/

Page 35: transLectures @Internet of Education 2013

Next steps for transLectures

• Keep improving ASR and MT results

• Keep improving tL open source tools (TLK, tL player)

• External user evaluations (VL.NET and Polimedia)

• External trials: implementation in other universities

Page 36: transLectures @Internet of Education 2013

www.translectures.eu

• About this presentation:

Gonçal Garcés [email protected]

• Project coordinator:

Alfons Juan-Ciscar [email protected]

EC FP7 ICT Programme – Project Number 287755
