TF-Media Porto - MediaMosa Transcription Technology - October 28 2011

MediaMosa @ 5th TF-Media Workshop, Porto, October 26, 2011 - SURFnet. We make innovation work
Frans Ward, Technical Product Manager, SURFnet Advanced Services
MediaMosa Transcripting Technology Scouting Project and Proof of Concept
Friday, October 28, 2011

description

MediaMosa Transcripting Technology Scouting Project and Proof of Concept
Presentation at TF-Media meeting in Porto, Portugal, 28 October 2011
Presenter: Frans Ward, SURFnet

Transcript of TF-Media Porto - MediaMosa Transcription Technology - October 28 2011

Page 1: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011

MediaMosa @ 5th TF-Media Workshop, Porto, October 26, 2011 - SURFnet. We make innovation work

Frans Ward
Technical Product Manager
SURFnet Advanced Services

MediaMosa Transcripting Technology Scouting Project and Proof of Concept

Friday, October 28, 2011


Page 10: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MEDIAMOSA TRANSCRIPTING TECHNOLOGY

Disclosure of audiovisual archives

• The number of AV archives on the Internet is increasing rapidly.

• Archiving is not enough: disclosure and reuse are required!

• Adding metadata is the key component here.

• The use of speech technology is needed (to reduce human effort).

UK National Film and Television Archive, Berkhamsted (http://www.flickr.com/people/footage/)


Page 11: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


Adding metadata, the traditional approach: manual annotation

Huge amount of work, and no time-coded relation to the video


Page 12: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


Adding metadata, the new approach: using speech-to-text technology for metadata generation

• Audio Extraction

• Speech Recognition (Speech-to-Text)

• Time-coded Transcript

• Indexing and Search: search on fragment level
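Searching on fragment level can be sketched as a small inverted index over a time-coded transcript. This is a minimal illustration, not MediaMosa's implementation: the transcript data, the (start, end, text) fragment layout, and the function names build_index and search are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical time-coded transcript: (start_seconds, end_seconds, text) per fragment.
TRANSCRIPT = [
    (0.0, 4.2, "welcome to this lecture on speech recognition"),
    (4.2, 9.8, "we first extract the audio from the video"),
    (9.8, 15.1, "the recognition step produces a time-coded transcript"),
]


def build_index(fragments):
    """Map each lowercased word to the sorted start times of fragments containing it."""
    index = defaultdict(set)
    for start, _end, text in fragments:
        for word in text.lower().split():
            index[word].add(start)
    return {word: sorted(starts) for word, starts in index.items()}


def search(index, query):
    """Return start times of fragments that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


INDEX = build_index(TRANSCRIPT)
```

A player could use the returned start times to jump straight to the matching fragment of the video instead of only returning the recording as a whole.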


Page 13: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MEDIAMOSA TRANSCRIPTING TECHNOLOGY

• Transcripting: conversion of speech into an electronic text document.

• Automatic Speech Recognition (ASR) seems to be the ideal technology for this.

• In combination with Optical Character Recognition (OCR) of slides.

• Goal: to provide additional metadata for searching in video / lecture recordings.
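Combining the two sources into one set of search metadata could look roughly like this. It is a sketch under stated assumptions: both the ASR segments and the OCR slide texts are taken to be (start_seconds, text) pairs already sorted by time, and the sample data and the name merge_metadata are hypothetical.

```python
import heapq

# Hypothetical inputs: (start_seconds, text) pairs, each list already sorted by time.
asr_segments = [
    (0.0, "today we look at speech recognition"),
    (42.5, "now the acoustic model"),
]
ocr_slides = [
    (0.0, "Lecture 1: Introduction"),
    (40.0, "Acoustic Models"),
]


def merge_metadata(asr, ocr):
    """Merge both sources into one time-ordered list of (time, source, text) records."""
    tagged_asr = ((t, "asr", text) for t, text in asr)
    tagged_ocr = ((t, "ocr", text) for t, text in ocr)
    # heapq.merge keeps the combined stream sorted without re-sorting everything.
    return list(heapq.merge(tagged_asr, tagged_ocr))


records = merge_metadata(asr_segments, ocr_slides)
```

Keeping the source tag on each record lets a search front end weight or display spoken words and slide text differently.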


Page 14: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MEDIAMOSA TRANSCRIPTING TECHNOLOGY

The Technology Scout Project. The process is complex...


Page 15: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011

MEDIAMOSA TRANSCRIPTING TECHNOLOGY SCOUTING PROJECT

Lecture Recording
• Recording of Teacher
• Recording of Slides
• Reference material

MediaMosa
• Transcode into audio
• Store all into an asset

Transcription by Spraak / CMU Sphinx
• Recognize the Speech
• Produce time-coded Transcript

End User Application: Multi-Source Player
• Enhanced Search
• Optional Subtitles
• Mashup info

Partners:
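The "transcode into audio" step could be done with a command along these lines. This is only a sketch: the slides do not say which tool or audio settings are used, so ffmpeg, the 16 kHz mono 16-bit PCM target (a common input format for speech recognizers), the file names, and the helper name audio_extraction_command are all assumptions.

```python
import shlex


def audio_extraction_command(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command that drops the video stream and writes mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-i", video_path,         # input lecture recording
        "-vn",                    # discard the video stream
        "-ac", "1",               # downmix to a single (mono) channel
        "-ar", str(sample_rate),  # resample the audio to the requested rate
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        wav_path,
    ]


cmd = audio_extraction_command("lecture.mp4", "lecture.wav")
print(shlex.join(cmd))
```

The command is only built and printed here; on a machine with ffmpeg installed it could be executed with subprocess.run(cmd, check=True).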


Page 16: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011

MEDIAMOSA TRANSCRIPTING PROJECT


Page 17: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MEDIAMOSA TRANSCRIPTING PROJECT


Page 18: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MEDIAMOSA TRANSCRIPTING PROJECT

Subtitles:
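Subtitles of this kind can be derived from a time-coded transcript. A minimal sketch, assuming the transcript is available as (start, end, text) fragments and that the target is the SubRip (SRT) subtitle format; neither is stated on the slide, and the function names are hypothetical.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(fragments):
    """Render (start_seconds, end_seconds, text) fragments as numbered SRT cues."""
    cues = []
    for number, (start, end, text) in enumerate(fragments, start=1):
        cues.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Calling to_srt on the fragments produced by the recognizer yields text that can be saved as a .srt file and loaded by players that support optional subtitles.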


Page 19: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MediaMosa 3.5

Focus on transcription technology (speech-2-text) and flexible workflows

• Development has started

• Beta release available: December 2011


Page 20: TF-Media Porto - MediaMosa Transcription Technology - October 28 2011


MediaMosa Directions

Q&A


Thanks for your attention!

WWW: http://mediamosa.org

Online Demo: http://demo.mediamosa.org

Forum: http://mediamosa.org/forum

Issue Tracker: http://mediamosa.org/trac

Source Code: https://github.com/mediamosa

Slideshare: http://www.slideshare.net/MediaMosa

Twitter: http://twitter.com/mediamosa
