Multilingual accessibility and audiovisual media



memad.eu

[email protected]

@memadproject

MeMAD Project

MeMAD project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 780069. This presentation has been produced by the MeMAD project. The content in this presentation represents the views of the authors, and the European Commission has no liability in respect of the content.

Multilingual accessibility and audiovisual media: Can technology help with crossing language barriers?

University of Jyväskylä, Nov 25, 2020

Maarit Koponen, University of Helsinki [email protected]

Accessibility


Audiovisual information and access services


Linguistic accessibility

● Digital information society increases the amount of information

● Global mobility increases the need for multilingual information

● Providing information only in official languages or a lingua franca is problematic for accessibility and inclusion

● Multilingual practices like translation and interpreting promote linguistic accessibility – but resources are often limited

● Can technology help?


Machine translation and accessibility

1. Machine translation as a translator’s tool – post-editing

○ Making translation faster, increasing amount of translated content

2. Unedited machine translation for information purposes

○ Gisting: raw machine translation can support access to information

○ Important to consider the risks: translation errors, misinformation


Machine translation and audiovisual content

● Use of machine translation and post-editing appears less common for audiovisual translation than e.g. localisation, technical translation

○ Features of AV content?

○ Tools for AV translation not optimal for integration of MT and PE

● Subtitle translation typically involves rephrasing and condensation

○ Subtitle length

○ Reading time

● Translating text-to-text (machine translating intralingual subtitles) vs speech-to-text (automatic speech recognition + machine translation)
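
The subtitle length and reading time constraints mentioned on this slide can be pictured with a small check like the one below. This is only a sketch under stated assumptions: the limits (37 characters per line, 14 characters per second) and the names `Subtitle` and `check_subtitle` are illustrative and do not come from the presentation; broadcasters set their own values.

```python
from dataclasses import dataclass
from typing import List

# Illustrative limits only (assumptions, not values from the talk).
MAX_CHARS_PER_LINE = 37
MAX_CHARS_PER_SECOND = 14.0


@dataclass
class Subtitle:
    start: float      # display start time in seconds
    end: float        # display end time in seconds
    lines: List[str]  # one string per subtitle line


def check_subtitle(sub: Subtitle) -> List[str]:
    """Return a list of human-readable problems found in one subtitle."""
    problems = []

    # Subtitle length: each line must fit the available width.
    for line in sub.lines:
        if len(line) > MAX_CHARS_PER_LINE:
            problems.append(f"line too long ({len(line)} chars)")

    # Reading time: total characters divided by time on screen.
    duration = sub.end - sub.start
    chars = sum(len(line) for line in sub.lines)
    if duration <= 0:
        problems.append("non-positive duration")
    elif chars / duration > MAX_CHARS_PER_SECOND:
        problems.append(f"reading speed too high ({chars / duration:.1f} chars/s)")

    return problems
```

A raw machine translation that renders everything said in the source will often fail a check of this kind, which is one reason subtitle translation involves condensation rather than full translation.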


MeMAD - Methods for Managing Audiovisual Data


Machine Translation in the MeMAD project

● WP4 Multimodal and multilingual machine translation

○ Development work at the University of Helsinki NLP group https://blogs.helsinki.fi/language-technology/

● Fully-automatic MT and MT as a translator’s tool

○ Main focus on MT for interlingual subtitling, also metadata and descriptions

○ Main interest news & current affairs programming; some tests with lifestyle/culture

● Main languages Finnish, Swedish, English (+ French, German)

MT as a translator’s tool


MT and post-editing for subtitling

● Experiments at Finnish public broadcasting company Yle

○ First round November/December 2019 – process data + user experience

○ Second round summer/fall 2020 – main focus user experience

● Language pairs: Finnish↔English, Finnish↔Swedish

● Professional subtitle translators as participants

● Task for participants was to subtitle 6 short video clips (~3 min)

○ Produce subtitles suitable for broadcasting

○ Participants used their preferred subtitling software, MT imported as SRT file
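
Since the MT output was handed to participants as an SRT file, a minimal reader for that format may help picture what the subtitling software receives. The sketch below assumes the common SubRip layout (cue number, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then text, with blank lines between cues). The names `Cue` and `parse_srt` and the sample text are hypothetical and are not taken from the MeMAD tooling.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    index: int
    start_ms: int
    end_ms: int
    text: str


# Timing line, e.g. "00:00:01,000 --> 00:00:03,500"
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    """Convert the parts of a timestamp to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues, skipping blocks that are malformed."""
    cues = []
    normalised = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalised):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue
        match = TIMING.search(lines[1])
        if not match:
            continue
        parts = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start_ms=_to_ms(*parts[:4]),
                end_ms=_to_ms(*parts[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


# Made-up sample, for illustration only.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Good evening.

2
00:00:03,600 --> 00:00:06,000
Here is
the news.
"""
```

Each cue carries its own timing and line breaks, so spotting and segmentation arrive together with the translated text.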


Screen capture of subtitling software (Wincaps Q4)


Effect of MT on translators’ processes

● Keylogging data in first round; intralingual subtitles as source text

● MT + post-editing on average slightly faster than translation from scratch

● Fewer keystrokes needed on average for MT + post-editing

● Considerable variation between participants and clips!

See Koponen et al. 2020. “MT for subtitling: User evaluation of post-editing productivity.” In Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, 115-124.


Assessment of user experience

● After each PE task, participants filled in a questionnaire with 14 adjective pairs describing their experience (modified version of User Experience Questionnaire, Laugwitz et al. 2008)


User experience (from 2nd round, 2020)

Average UEQ scores scaled between [−3, +3]

Values between [−0.8, +0.8] considered neutral evaluations

For further discussion of 1st round findings see Koponen et al. 2020. “MT for Subtitling: Investigating professional translators' user experience and feedback”. Proceedings of AMTA 2020 Workshop “Post-editing in Modern-Day Translation”
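
The scaling and the neutral band described on this slide can be sketched as follows. The code assumes that each adjective pair is answered on a 7-point scale coded 1 to 7, with the positive adjective at 7, so that subtracting 4 gives the −3 to +3 range. That coding, the item names in the usage example and the function names are assumptions for illustration; only the ±0.8 neutral band comes from the slide.

```python
from statistics import mean
from typing import Dict, List, Tuple

NEUTRAL_BAND = 0.8  # from the slide: values in [-0.8, +0.8] count as neutral


def rescale(response: int) -> int:
    """Map a 7-point answer (1..7, assumed coding) onto the -3..+3 range."""
    if not 1 <= response <= 7:
        raise ValueError(f"response out of range: {response}")
    return response - 4


def classify(score: float) -> str:
    """Label an average score as positive, neutral or negative."""
    if score > NEUTRAL_BAND:
        return "positive"
    if score < -NEUTRAL_BAND:
        return "negative"
    return "neutral"


def summarise(responses: Dict[str, List[int]]) -> Dict[str, Tuple[float, str]]:
    """Average the rescaled answers per adjective pair and classify the mean."""
    summary = {}
    for item, answers in responses.items():
        avg = mean(rescale(a) for a in answers)
        summary[item] = (avg, classify(avg))
    return summary
```

For example, `summarise({"pair_a": [7, 6, 6], "pair_b": [4, 5, 4]})` would label the hypothetical `pair_a` as positive and `pair_b` as neutral.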


Key takeaways from translators’ comments

“Sometimes surprisingly good, sometimes surprisingly bad”

● MT quality considered overall good; useful for some content

○ Issues with MT quality: mistranslations, unidiomaticity, “odd” word choices

● Main criticisms (especially 1st round) concerned subtitle spotting: segmentation and timing very important for translators!

● Concerns about the effect on processes and quality of final translation

● While participants were critical toward MT, most were willing to consider using it in some contexts – when intralingual subtitles were used as source

● Speech-to-text translation quality was deemed not useful for post-editing

Fully automatic subtitle translation


Viewer reception of MT subtitles

● Automatic speech recognition + subtitle generation + machine translation

○ No MT post-editing, some experiments involved editing of ASR output before MT

● News and documentary content

● Focus group interviews; June 2020 & October 2020

○ Finnish-speaking viewers: Swedish video subtitled into Finnish

○ English-speaking viewers: Finnish video subtitled into English

● Survey; October 2020

○ Finnish video clips subtitled into English; target audience with limited or no Finnish

MT subtitles – an example


Preliminary findings from focus groups

● First reaction often critical.

○ Comments on both machine translation quality and spotting of the subtitles

● English groups were overall more positive and saw MT as a support for accessing information; particularly what is happening locally.

● Finnish groups saw less need; usable for niche interests where information is not available in Finnish or e.g. English.

● Emphasis on quality and reliability of information – balanced against the need to obtain information quickly or at all.


Preliminary findings from the survey


In conclusion

● MT and post-editing may be a suitable tool also for subtitle translators – but there are still issues to be solved.

● The importance of usability, user experience and effects on the user.

● Fully automatic subtitle translation may offer support for accessibility – but quality is a critical issue.

● The importance of usability, acceptability and reliability: which content, when, how?



Kiitos! – Thank you!

[email protected]

In collaboration with:

Yle: Kaisa Vitikainen, Tiina Tuominen

UH: Umut Sulubacak, Jörg Tiedemann