ASR-based corrective feedback on pronunciation: does it really work?

Ambra Neri, Catia Cucchiarini, Helmer Strik

Centre for Speech and Language Technology

Dept. of Linguistics - Radboud University Nijmegen

The Netherlands

Outline

• Background & problem

• Goal of present study

• Experiment

• Conclusions

Background and problem

Computer Assisted Pronunciation Training (CAPT)

ASR-based CAPT:

can provide automatic, instantaneous, individual feedback on pronunciation in a private environment

But ASR-based CAPT suffers from limitations.

Is it effective in improving L2 pronunciation?

Very few studies, with differing results.

Goal of this study

To study the effectiveness and possible advantage of automatic feedback provided by an ASR-based CAPT system.

ASR-based CAPT system: Dutch CAPT

Target users: adult learners of Dutch with different L1's (e.g. immigrants)

Pedagogical goal: improving segmental quality in pronunciation

Dutch CAPT: feedback

Content: focus on problematic phonemes

11 ‘targeted phonemes’: 9 vowels and 2 consonants

Criteria

Error detection algorithm:

based on GOP method (Witt & Young 2000)
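As a rough illustration of the GOP (Goodness of Pronunciation) idea, the sketch below computes the frame-normalised score defined by Witt & Young, GOP(p) = |log(p(O|p) / max_q p(O|q))| / NF, and flags a phone when the score exceeds a threshold. The slides only state that error detection is "based on" GOP, so this is not Dutch CAPT's actual implementation: the function names, the log-likelihood values, and the threshold are all hypothetical, and in a real system the likelihoods would come from an acoustic model.

```python
def gop_score(loglik_target, logliks_all_phones, n_frames):
    """Frame-normalised GOP score for one phone segment.

    loglik_target      -- log p(O | p) for the phone the learner should have said
    logliks_all_phones -- dict mapping every phone q to log p(O | q), same segment
    n_frames           -- number of acoustic frames in the segment (NF)
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    best_loglik = max(logliks_all_phones.values())
    # |log(p(O|p) / max_q p(O|q))| / NF, computed in the log domain
    return abs(loglik_target - best_loglik) / n_frames


def is_mispronounced(score, threshold):
    """Flag the phone when its GOP score exceeds the (hypothetical) threshold."""
    return score > threshold


# Hypothetical log-likelihoods for one 20-frame segment; not real model output.
example_logliks = {"a": -120.0, "e": -110.0, "o": -150.0}
example_score = gop_score(example_logliks["a"], example_logliks, n_frames=20)
print(example_score, is_mispronounced(example_score, threshold=0.3))
```

A score of zero means the target phone was also the best-matching phone; larger scores mean some other phone fits the audio better, which is what the threshold test picks up.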

VideoVideo

Dutch CAPT

Gender-specific, Dutch & English version.

4 units, each containing:

1 video (from Nieuwe Buren) with real-life + amusing situations

+ ca. 30 exercises based on video: dialogues, question-answer, minimal pairs, word repetition

Sequential, constrained navigation: min. one attempt needed to proceed to next exercise, maximum 3
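The navigation rule can be pictured with a small state tracker. This is only a sketch under one reading of the slide, namely that "maximum 3" means at most three attempts per exercise; the class and method names are hypothetical and do not come from the Dutch CAPT software.

```python
class ExerciseSequence:
    """Sequential navigation: at least 1 and at most 3 attempts per exercise."""

    MIN_ATTEMPTS = 1
    MAX_ATTEMPTS = 3  # assumed reading of "maximum 3" on the slide

    def __init__(self, n_exercises):
        if n_exercises < 1:
            raise ValueError("need at least one exercise")
        self.n_exercises = n_exercises
        self.current = 0   # index of the exercise the learner is on
        self.attempts = 0  # attempts made on the current exercise

    def record_attempt(self):
        """Register one attempt; refuse once the maximum has been reached."""
        if self.attempts >= self.MAX_ATTEMPTS:
            raise RuntimeError("no attempts left for this exercise")
        self.attempts += 1

    def can_proceed(self):
        """The learner may move on only after the minimum number of attempts."""
        return self.attempts >= self.MIN_ATTEMPTS

    def next_exercise(self):
        """Advance to the following exercise and reset the attempt counter."""
        if not self.can_proceed():
            raise RuntimeError("at least one attempt is required first")
        if self.current + 1 >= self.n_exercises:
            raise RuntimeError("already at the last exercise")
        self.current += 1
        self.attempts = 0
```

For a unit of about 30 exercises, `ExerciseSequence(30)` would refuse `next_exercise()` until `record_attempt()` has been called at least once.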

Method: participants & training

Regular teacher-fronted lessons: 4-6 hrs per week

a) Experimental group (EXP): n=15 (10 F, 5 M) Dutch CAPT

b) Control group 1 (NiBu): n=10 (4 F, 6 M) reduced version of Nieuwe Buren

c) Control group 2 (noXT): n=5 (3 F, 2 M) no extra training

Extra training: 4 weeks x 1 session 30’ – 60’

1 class – 1 type of training

Method: testing

3 analyses:

1. Participants’ evaluations: questionnaires on system’s usability, accessibility, usefulness etc.

2. Global segmental quality: 6 experts rated stimuli on 10-point scale (pretest/posttest, phonetically balanced sentences)

3. In-depth analysis of segmental errors: expert annotations

Results: participants’ evaluations

Positive reactions

Enjoyed working with the system

Believed in the usefulness of the system

Results: reliability global ratings

Cronbach’s α:

Intrarater: .94 – 1.00

Interrater: .83 – .96
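For readers unfamiliar with the reliability coefficient, the sketch below shows the standard Cronbach's α formula, α = k/(k-1) · (1 - Σ var(rater) / var(sum of raters)), treating each rater as an "item". It is an illustration only: the function name and the toy ratings are hypothetical, and the study's own rating data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a ratings table.

    ratings -- list of rows, one row per rated stimulus;
               each row holds one score per rater (same rater order in every row).
    """
    n_raters = len(ratings[0])
    if n_raters < 2 or len(ratings) < 2:
        raise ValueError("need at least two raters and two stimuli")
    # Sample variance of each rater's scores across stimuli
    per_rater = list(zip(*ratings))
    sum_of_rater_variances = sum(variance(scores) for scores in per_rater)
    # Sample variance of the summed score per stimulus
    total_variance = variance([sum(row) for row in ratings])
    if total_variance == 0:
        raise ValueError("summed scores do not vary; alpha is undefined")
    return (n_raters / (n_raters - 1)) * (1 - sum_of_rater_variances / total_variance)


# Toy data (hypothetical): three stimuli scored by two raters on a 10-point scale.
toy_ratings = [[1, 2], [2, 3], [3, 5]]
print(round(cronbach_alpha(toy_ratings), 3))
```

Values close to 1 indicate that raters order the stimuli consistently, which is how the intrarater and interrater ranges above are meant to be read.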

Results: Global ratings

All 3 groups improve (mean improvement)

EXP improved most

[Figure: mean global ratings at pretest (pre) and posttest (post) for the EXP, NiBu and noXT groups; vertical axis from 3 to 6.5]

In-depth analysis segm. quality

[Figure: percentages at pretest and posttest for EXP and NiBu, shown separately for targeted and untargeted phonemes; vertical axis from 0% to 25%]

Conclusions

Participants enjoyed Dutch CAPT.

ASR-CAPT seems efficacious in improving pronunciation of targeted phonemes.

Global ratings are an appropriate measure because CAPT should ultimately improve overall pronunciation quality.

Fine-grained analyses also useful.

The end

Questions?

Possible improvements

• Give feedback on more phonemes

• More targeted systems for fixed L1-L2 pairs.

• Give feedback on suprasegmental

• Increase sample size

• Increase training intensity

• Match training groups: L1’s, proficiency, etc.
