Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL...
“Hey Siri, can I learn English by talking to you?”
Insights from a multilevel meta-analysis on the effectiveness of dialogue-based CALL
Serge Bibauw, Wim Van den Noortgate
Thomas François, Piet Desmet
CALL 2018 Conference, July 5, 2018
Dialogue-based CALL
bots
Apple Siri
Meta-analysis of effectiveness studies
[Diagram: effect sizes from individual studies (1 … k) combined in a random-effects multilevel model; example study data: Pre 56 (4.56), Post 61 (6.54)]
Insights from a multilevel meta-analysis on the effectiveness of dialogue-based CALL
Object: dialogue-based CALL
• Dialogue systems, chatbots, agents
Methods: meta-analysis
• Study collection and selection, effect size calculation and multilevel modeling
Results: effectiveness for L2 learning
• General effectiveness
• Relative effects per population, treatment characteristics and outcome variables
Dialogue-based CALL
Dialogue-based CALL refers to any application or system allowing learners
• to maintain a dialogue [immediate, synchronous interaction] [written or spoken]
• with an automated agent [tutorial CALL (≠ CMC)]
• for language learning purposes.
Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review
Dialogue-based CALL: Typology of systems (Bibauw et al., under review)
• Form-focused dialogue systems: explicit constraints on meaning, focus on form/forms. E.g., ICALL intelligent language tutors and computer-assisted pronunciation training (CAPT) systems
• Goal-oriented dialogue systems: contextual constraints (task, situated conversation...), mostly focus on meaning and interaction. E.g., conversational agents in virtual worlds
• Reactive dialogue systems: free, user-initiated, open-ended dialogue. E.g., chatbots and personal assistants
The typology shown here is simplified (narrative systems are left out).
Dialogue-based CALL: Recent evolutions
Rich history of studies & systems:
• First attempts in the 80s (Underwood 1982, 1984)
• Intelligent Language Tutors developed in the 90s (Holland et al., 1995)
• Efforts with speech and dialogue in the 2000s (Raux & Eskenazi, 2004; Seneff et al., 2007; Morton et al., 2012)
• Principled technological convergence more recently (Petersen, 2010; Wilske, 2015)
But nearly all systems remained internal, research-only prototypes, never made accessible to the public.
→ No comparability, no replicability
Recently, however, major advances towards publicly available tools (Duolingo Bots, Alelo Enskill, ETS HALEF) and joint efforts between industry and researchers to compare the systems and establish common ground (Sydorenko et al., 2018)
Meta-analysis of effectiveness studies
• Aggregate results from multiple experimental studies
• Treat each study as a subject
• Get a more powerful, generalizable, stable and precise idea of the effectiveness of dialogue-based CALL on language learning
• Analyze certain moderator variables to identify tendencies inside the data
Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, in prep.
Meta-analysis: Search & collection process
1. Database search in Web of Science, Scopus, ProQuest…
Search syntax: (chatbot / chat bot / chatterbot / conversational agent / conversational companion / conversational system / dialog* system / dialog* agent / dialog* game / pedagogical agent / human-computer dialog* / dialog*-based) + ((language / English) (learning / teaching / acquisition) / (second / foreign) language / L2 / EFL / ESL / ICALL)
2. Ancestry search: older publications cited by a retrieved reference
3. Forward citations: new publications citing a retrieved reference
Note on journal search: 40/250 publications from the 4 major CALL journals (19 CALL, 13 CALICO J., 4 ReCALL, 4 LL&T)
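The search syntax above can be read as two OR-groups joined by AND. The sketch below expands it into a single boolean string; the `OR`/`AND` operators and phrase quoting are assumptions about a generic database interface, since each database (Scopus, Web of Science, ProQuest…) has its own syntax for wildcards inside phrases. The names `or_group` and `build_query` are hypothetical helpers.

```python
from itertools import product

# Terms describing the system, copied from the slide's first group.
SYSTEM_TERMS = [
    "chatbot", "chat bot", "chatterbot", "conversational agent",
    "conversational companion", "conversational system", "dialog* system",
    "dialog* agent", "dialog* game", "pedagogical agent",
    "human-computer dialog*", "dialog*-based",
]

# Second group: (language / English) x (learning / teaching / acquisition),
# plus the stand-alone terms.
LEARNING_TERMS = [
    f"{a} {b}"
    for a, b in product(["language", "English"],
                        ["learning", "teaching", "acquisition"])
] + ["second language", "foreign language", "L2", "EFL", "ESL", "ICALL"]


def or_group(terms):
    """Join terms with OR, quoting multi-word phrases (assumed syntax)."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(system_terms, learning_terms):
    """Combine the two OR-groups with AND."""
    return or_group(system_terms) + " AND " + or_group(learning_terms)


QUERY = build_query(SYSTEM_TERMS, LEARNING_TERMS)
print(QUERY)
```

Keeping the term lists as data makes it easy to adapt the same query to a database with a different operator syntax.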
Meta-analysis: Inclusion/exclusion process
Records identified through database searching:
• 153 Scopus
• 75 Web of Science Core Collection
• 68 Inspec
• 38 PsycINFO
• 38 LLBA
• 36 ERIC
• 13 ProQuest Central
• 9 MLA International Bibliography
• 4 LISA
Additional records identified through forward and ancestry search: 193 records
419 records screened after removing duplicates
→ 33 excluded at screening level: 27 full-text unavailable, 3 republications, 3 publications in other languages
386 articles undergo full-text review
→ 136 excluded for not fitting “dialogue-based CALL” criteria: 64 no application to L2 learning, 20 interlocutor is not a system (or no interlocutor), 39 item-based interactions (no multi-turn dialogue), 13 dialogue only for scaffolding, not as task
250 articles relevant to dialogue-based CALL
→ 211 excluded as non (quasi-)experimental studies: 118 without empirical data, 40 with technical evaluation on datasets, 26 with observational or qualitative data, 27 with survey data
39 articles/studies included
k = 134 reported effect sizes
Isolated effect sizes per sample and per outcome variable
→ 36 excluded (also excluding 18 source articles). Excluded effect sizes: 13 not reporting precise central tendency (e.g., mean), 8 not reporting variance (e.g., standard deviation) or alternate metrics to compute d (e.g., t statistics), 6 lack of reference data (e.g., no pretest nor control), 11 effects on other outcomes (e.g., motivation)
k = 96 effect sizes included
17 articles/studies + 4 articles reporting on the same data
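The article-level counts in this flow can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes that each "excluded" box belongs to the stage whose reasons sum to it, and the names `STAGES` and `check_stage` are hypothetical.

```python
# Hits per database, as listed on the slide.
DATABASE_HITS = {
    "Scopus": 153,
    "Web of Science Core Collection": 75,
    "Inspec": 68,
    "PsycINFO": 38,
    "LLBA": 38,
    "ERIC": 36,
    "ProQuest Central": 13,
    "MLA International Bibliography": 9,
    "LISA": 4,
}

# (stage, records entering, exclusions by reason, records remaining)
STAGES = [
    ("screening", 419, [27, 3, 3], 386),
    ("dialogue-based CALL criteria", 386, [64, 20, 39, 13], 250),
    ("(quasi-)experimental design", 250, [118, 40, 26, 27], 39),
]


def check_stage(entering, exclusions, remaining):
    """True when entering minus all exclusions equals what remains."""
    return entering - sum(exclusions) == remaining


database_total = sum(DATABASE_HITS.values())
all_consistent = all(check_stage(e, x, r) for _, e, x, r in STAGES)

print("records from databases:", database_total)
for name, entering, exclusions, remaining in STAGES:
    print(f"{name}: {entering} - {sum(exclusions)} -> {remaining}")
print("article-level flow consistent:", all_consistent)
```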
Studies on dialogue-based CALL: 250 papers, 114 different systems
[Figure: timeline of studies from 1980 to 2010, by system type: Narrative, Form-focused, Goal-oriented, Reactive, NA]
Coding scheme
• Publication variables: author, year, publication type, source, sample...
• Population variables: context, age, L1, L2 proficiency level
• Treatment variables: experimental design, treatment duration (weeks), time on task (hours), number of sessions, treatment density (packed vs. spaced)
• System variables: system, target L2, system_type, dialogue_type, primary_modality, corrective_feedback, initiative, embodied_agent, gamified...
• Instruments/outcome variables: proficiency/complexity/accuracy/fluency/vocabulary, speaking/writing, specific test
• Quantitative results: n, mean, sd (pre/post, experimental/control)
Meta-analysis: Computable effect sizes
Effect size: standardized measure of the observed (here, learning) effect
Effect size (d) typically computed over:
• mean
• standard deviation
• n (subjects)
for each group/measurement point (or alternate: t-score, etc.)
Not available for all studies (especially older studies)
Asked the authors for raw data (worked for some – thanks to them!)
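When a study reports only a t statistic, d can still be recovered. The sketch below uses the standard conversion for an independent-samples t-test; the input values (t = 2.0, 20 learners per group) are hypothetical and do not come from any study in the deck.

```python
import math


def d_from_t(t, n1, n2):
    """Standardized mean difference from an independent-samples t statistic.

    Uses d = t * sqrt(1/n1 + 1/n2), which holds for two independent groups.
    """
    return t * math.sqrt(1 / n1 + 1 / n2)


# Hypothetical inputs, for illustration only.
d_example = d_from_t(t=2.0, n1=20, n2=20)
print(round(d_example, 3))
```

A paired (pre/post) t statistic needs a different conversion that involves the pre-post correlation, so this helper should not be applied to within-group designs.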
Meta-analysis: Inclusion of individual effect sizes
k = 96 effect sizes
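Meta-analysis: Effect size calculation
Effect size: standardized measure of the observed (here, learning) effect
Usually, in SLA/CALL:
• Standardized Mean Difference: Cohen's $d = (M_{\text{post}} - M_{\text{pre}}) / SD_{\text{pooled}}$, Hedges' $g$
• Standardized Mean Change
Example data, as M (sd), for the three experimental designs (EC, PP, ECPP):

ECPP (experimental and control groups, pretest and posttest)

| | Exp. Grp | Control |
| --- | --- | --- |
| Pre | 56 (4.3) | 54 (5.6) |
| Post | 61 (6.2) | 57 (7.4) |

PP (single group, pretest and posttest)

| | M (sd) |
| --- | --- |
| Pre | 56 (4.3) |
| Post | 61 (6.2) |

EC (experimental and control groups, posttest only)

| | Exp. Grp | Control |
| --- | --- | --- |
| Post | 61 (6.2) | 57 (7.4) |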
Meta-analysis: A comparable effect size metric
Morris & DeShon (2002) offer a solution: comparable metrics across experimental designs (EC / PP / ECPP)
• change metric (aligned on within-group effect)
• raw metric (aligned on between-groups effect)
We selected the raw metric formula:
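As a hedged illustration of raw-metric effect sizes for designs with a pretest, the sketch below assumes the usual raw-score formulas: the pre-post gain standardized by the pretest SD (PP), and the difference between the experimental and control standardized gains (ECPP). These formulas are an assumption and not necessarily the exact formula shown on the slide. The input values are the example means and standard deviations from the effect size calculation slide.

```python
def d_pp_raw(m_pre, m_post, sd_pre):
    """Pre-post gain in raw-score metric: standardized by the pretest SD."""
    return (m_post - m_pre) / sd_pre


def d_ecpp_raw(exp, ctrl):
    """Standardized gain of the experimental group minus that of the control.

    exp and ctrl are (m_pre, m_post, sd_pre) tuples.
    """
    return d_pp_raw(*exp) - d_pp_raw(*ctrl)


# Example data: experimental 56 (4.3) -> 61, control 54 (5.6) -> 57.
EXP = (56, 61, 4.3)
CTRL = (54, 57, 5.6)

d_pp = d_pp_raw(*EXP)
d_ecpp = d_ecpp_raw(EXP, CTRL)
print(round(d_pp, 3), round(d_ecpp, 3))
```

Standardizing by the pretest SD keeps the denominator unaffected by the treatment, which is what makes effect sizes from the different designs comparable.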
Meta-analysis: Summary effect
The model computes a summary effect by aggregating all the single-study effect sizes
Weighting according to sample size and precision
→ More powerful, more stable, more precise and generalizable than the individual study effect sizes
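The weighting idea can be sketched with plain inverse-variance weights. This is a simplification: it is a fixed-effect average, not the multilevel random-effects model used in the study, and it recovers standard errors from 95% confidence intervals under a normal approximation. The three (d, CI) pairs are of the kind reported in the forest plot and serve only as illustration.

```python
import math

Z95 = 1.96  # normal quantile for a 95% interval


def se_from_ci(lower, upper):
    """Standard error recovered from a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)


def summary_effect(effects):
    """Inverse-variance weighted mean of (d, ci_lower, ci_upper) triples."""
    weights = [1 / se_from_ci(lo, hi) ** 2 for _, lo, hi in effects]
    total = sum(weights)
    mean = sum(w * d for w, (d, _, _) in zip(weights, effects)) / total
    return mean, math.sqrt(1 / total)


EFFECTS = [(0.69, 0.24, 1.15), (0.09, -0.20, 0.38), (0.53, 0.24, 0.82)]
mean_d, se_d = summary_effect(EFFECTS)
print(round(mean_d, 3), round(se_d, 3))
```

More precise effect sizes (narrower intervals) pull the average toward themselves, and the summary standard error is smaller than that of any single input.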
Meta-analysis: Multilevel modeling
Publications report multiple outcome measures (e.g., vocabulary and morphology tests) or multiple sampling groups (e.g., proficiency levels)
Traditional meta-analysis techniques allow only one (independent) effect size per study, thus losing all the information on distinct implementations
⇒ Including all the variation without “fooling” the model with non-independent measures: multilevel modeling
Aggregates multiple effects per study by adding an intermediate level of within-study variation.
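One common way to write such a model is the three-level random-effects formulation below, given here as an assumption since the slides do not spell out the equations. The symbols are introduced for illustration: $d_{jk}$ is the $j$-th effect size in study $k$ and $\gamma_0$ is the summary effect.

```latex
\[
\begin{aligned}
d_{jk} &= \gamma_{0} + v_{k} + u_{jk} + e_{jk},\\
v_{k} &\sim \mathcal{N}(0, \sigma^{2}_{v}) && \text{(between-study variation)},\\
u_{jk} &\sim \mathcal{N}(0, \sigma^{2}_{u}) && \text{(within-study variation)},\\
e_{jk} &\sim \mathcal{N}(0, \sigma^{2}_{e_{jk}}) && \text{(sampling error, variance treated as known)}.
\end{aligned}
\]
```

The intermediate term $u_{jk}$ is what lets several effect sizes from the same study enter the model without being treated as independent.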
[Figure: forest plot of the multilevel random-effects model. x-axis: effect size d (−2 to 4). Columns: Reference, w, d [95% CI]. Rows: individual effect sizes per sample and outcome variable, grouped by study: Wolska & Wilske 2011; Wolska & Wilske 2010b; Wolska & Wilske 2010a; Wilske 2015; Wilske & Wolska 2011; Bouillon et al 2011; Rosenthal... et al 2016; Chiu et al 2007; Noh et al 2012; Lee et al 2014a; Lee et al 2011a; Hassani et al 2016; Harless et al 1999; Petersen 2010; Kim 2016; Taguchi et al 2017; Jia et al 2013. Summary effect: d = 0.61 [0.38, 0.83]]
Results: Summary effect
General effectiveness of dialogue-based CALL for L2 proficiency development (k = 96):
d = 0.605 ***
95% CI = [0.377, 0.833]
= Medium effect (Plonsky & Oswald, 2014)
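The reported summary estimate and its confidence interval are enough to recover an approximate standard error and test statistic. The sketch below assumes a symmetric interval built from a normal quantile of 1.96; the model may actually rely on a t distribution, so the values are approximate.

```python
import math


def wald_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p from an estimate and its 95% CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return se, z, p


# Summary effect reported on the slide: d = 0.605, 95% CI [0.377, 0.833].
se, z, p = wald_from_ci(0.605, 0.377, 0.833)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.1e}")
```

Under these assumptions the p-value falls well below .001, in line with the three asterisks attached to the estimate.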
Results & discussion: Summary effect compared to CALL/SLA
Global effect close to the median of meta-analyses in CALL/SLA (Plonsky & Oswald, 2014)
• ≳ game-based learning (d = .53, Chiu et al., 2012)
• ≲ CALL in general (d = .84, Plonsky & Ziegler, 2016)
Consistent with effect of face-to-face interaction (Mackey & Goo, 2007) and SCMC.
• ≲ F2F interaction (d = .75, Mackey & Goo, 2007)
• ≲ SCMC (Ziegler, 2015; Lin, 2015)
Slightly inferior, but logical:
• Human interlocutors remain the gold standard!
• Outcome variables often very ambitious (holistic proficiency…) and treatment duration often very short (≤ 3h)
Results & discussion: Limitations
[Figure: funnel plot. x-axis: standardized mean difference (−1 to 4); y-axis: standard error (0 to 0.634); one point per effect size, labeled by study]
High heterogeneity
Few studies with strong results
Publication bias and self-evaluation bias
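Heterogeneity of the kind noted above is commonly quantified with Cochran's Q and the I² statistic. The sketch below shows the computation on three illustrative (d, 95% CI) pairs. It is a single-level simplification, not the multilevel variance decomposition of the study, and the standard errors are recovered from the intervals under a normal approximation.

```python
Z95 = 1.96  # normal quantile for a 95% interval


def variance_from_ci(lower, upper):
    """Sampling variance recovered from a symmetric 95% confidence interval."""
    return ((upper - lower) / (2 * Z95)) ** 2


def heterogeneity(effects, variances):
    """Cochran's Q, its degrees of freedom, and I² (as a proportion)."""
    weights = [1 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


# Illustrative (d, ci_lower, ci_upper) triples.
DATA = [(0.69, 0.24, 1.15), (0.09, -0.20, 0.38), (0.53, 0.24, 0.82)]
ds = [d for d, _, _ in DATA]
vs = [variance_from_ci(lo, hi) for _, lo, hi in DATA]

q, df, i2 = heterogeneity(ds, vs)
print(f"Q = {q:.2f} (df = {df}), I2 = {i2:.0%}")
```

An I² well above 50% indicates that most of the observed spread reflects real differences between effects and not sampling error, which is the motivation for a moderator analysis.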
Results: Moderator analysis
Insights about the influence of some covariates/moderators
• Sample and context: context, age, L1, L2, proficiency level
• System (treatment) variables: system, system type, dialogue type, primary modality, corrective feedback, initiative, embodied agent, gamified...; treatment duration (in weeks), time on task (in hours)
• Instruments/outcome variables: proficiency/complexity/accuracy/fluency/vocabulary, speaking/writing, specific test
0.69 [ 0.24, 1.15] 0.09 [−0.20, 0.38] 0.53 [ 0.24, 0.82]
0.02 [−0.25, 0.29] 1.36 [ 0.93, 1.79] 0.59 [ 0.18, 1.00]−0.34 [−0.73, 0.04]
1.52 [ 0.48, 2.56] 1.21 [ 0.22, 2.20] 1.74 [ 0.66, 2.83] 1.14 [ 0.17, 2.11] 1.75 [ 0.65, 2.85] 1.62 [ 0.43, 2.82] 1.18 [ 0.27, 2.08] 1.24 [ 0.34, 2.13]
−0.77 [−1.50, −0.03] 0.29 [−0.51, 1.09] 0.43 [−0.26, 1.12] 0.05 [−0.59, 0.69] 0.30 [−0.36, 0.96] 0.11 [−0.53, 0.76] 1.81 [ 0.46, 3.15] 1.35 [ 0.25, 2.46]
0.60 [−0.18, 1.39] 0.96 [ 0.16, 1.76] 0.73 [ 0.00, 1.46]
0.10 [−0.53, 0.74] 1.25 [ 0.44, 2.07] 2.21 [ 0.96, 3.46] 1.10 [ 0.65, 1.55] 1.58 [ 1.03, 2.13] 1.84 [ 1.23, 2.44] 2.00 [ 1.36, 2.65]−0.11 [−0.48, 0.27]
1.02 [ 0.58, 1.47] 0.05 [−0.38, 0.49]
0.61 [ 0.38, 0.83]
Reference w d [95% CI]
![Page 31: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/31.jpg)
Moderator analysis
Evolution across time
[Scatter plot: effect size (y-axis, 0 to 4) by year of publication (x-axis, 2000 to 2015)]
Maturation of the field?
![Page 32: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/32.jpg)
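Moderator analysis
Experimental design
No major difference (many more PP studies, so a more confident result for PP)
[Plot: mean effect size d with 95% confidence interval, by design (ECPP, PP); y-axis from 0.00 to 1.25]
Legend: the marker shows the mean d and the bar its 95% confidence interval; 0 = no effect, values above 0 indicate learning gains and values below 0 indicate loss/decline; an interval that does not include 0 is significantly different from zero.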
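The reading rule for these plots (an interval that excludes zero is significantly different from zero) can be written as a small check. This is only an illustrative sketch in Python, which the slides do not use or mention; the function name `classify_effect` is hypothetical, and the three intervals are values reported in the forest plot of this talk.

```python
def classify_effect(lower, upper):
    """Classify an effect from the bounds of its 95% confidence interval."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if lower > 0:
        return "significant gain"
    if upper < 0:
        return "significant loss"
    # The interval includes zero: "no effect" cannot be ruled out.
    return "not significantly different from zero"


# (d, lower, upper) triples as reported in the forest plot of this talk.
examples = [
    (0.61, 0.38, 0.83),
    (-0.77, -1.50, -0.03),
    (0.39, -0.62, 1.41),
]

for d, lower, upper in examples:
    print(f"d = {d:+.2f} [{lower:+.2f}, {upper:+.2f}]: {classify_effect(lower, upper)}")
```

Because the reported bounds are rounded to two decimals, an interval printed with a bound of exactly 0.00 is treated here as including zero.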
![Page 33: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/33.jpg)
Moderator analysis
Participants: L2 proficiency
Beginners benefit more from these systems than advanced learners
Highly significant difference: proficiency level is a significant predictor (Q(df=3) = 10.8, p < .001)
[Plot: effect size by proficiency level (A1, A2, B1, B2); y-axis from −0.5 to 1.0]
![Page 34: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/34.jpg)
Moderator analysis
Context
No significant difference (p = .58)
Seems to be effective in both the school and the university context (external contexts, such as the military, are underrepresented).
[Plot: effect size by context (Military, School, University); y-axis from 0.0 to 2.0]
![Page 35: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/35.jpg)
Moderator analysis
Type of system
Goal-oriented systems seem to be more effective overall.
[Plot: effect size by type of system (Form-focused, Goal-oriented, Narrative, Reactive); y-axis from −0.5 to 1.5]
![Page 36: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/36.jpg)
Moderator analysis
System modality
Very similar effects in both modalities.
[Plot: effect size by system modality (Spoken, Written); y-axis from 0.00 to 1.00]
![Page 37: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/37.jpg)
Moderator analysis
System: Corrective feedback
Consistent with what we know about corrective feedback, systems providing feedback are much more effective.
[Plot: effect size by corrective feedback (Explicit, Implicit, No); y-axis from 0.00 to 1.00]
![Page 38: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/38.jpg)
Moderator analysis
Outcome modality
Higher effect on speaking
[Plot: effect size by outcome modality (Spoken, Written); y-axis from 0.00 to 1.00]
![Page 39: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/39.jpg)
Moderator analysis
Outcome variables
[Plot: effect size by outcome variable (Accuracy, Complexity, Fluency, Listening, Proficiency, Reading, Vocabulary); y-axis from −1 to 2]
More promising effects on fluency
![Page 40: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/40.jpg)
Dialogue-based CALL: meta-analysis
Summary
Medium effect of dialogue-based CALL on L2 proficiency development: d = .605 ***
Possibly differentiated effect depending on proficiency level, system modality and test modality
But these observations still need to be confirmed by other studies
Need for more comparable designs, sufficiently large samples and precise instruments
Future research should position itself within this emerging field and compare its results with those of other studies in the field
[Forest plot: standardized mean difference (g) with 95% CI, one row per effect size; columns: Reference, System, Outcome measure (with instrument), Prof., N, Effect size (g) [95% CI]; x-axis from −2 to 6. Multilevel RE model for all studies: g = 0.90 [0.51, 1.30]]
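As a rough illustration of how individual effect sizes are combined into one pooled estimate, the sketch below implements a single-level DerSimonian-Laird random-effects model in Python. This is a simplified stand-in, not the multilevel model used in the talk, which additionally accounts for several effect sizes being nested within the same study. The function names are hypothetical, the standard errors are recovered from the reported 95% intervals under a normal approximation (an assumption), and the four input rows are effect sizes taken from the effect-size column of the forest plot, without attribution to specific studies.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Approximate standard error from a 95% CI (normal approximation)."""
    return (upper - lower) / (2 * Z95)


def pool_random_effects(effects):
    """Pool (d, lower, upper) triples with the DerSimonian-Laird estimator.

    Returns (pooled mean, CI lower, CI upper, tau^2).
    """
    if len(effects) < 2:
        raise ValueError("need at least two effect sizes")
    d = [e[0] for e in effects]
    v = [se_from_ci(e[1], e[2]) ** 2 for e in effects]  # sampling variances

    # Fixed-effect step: inverse-variance weights and the Q statistic.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed_mean = sum(wi * di for wi, di in zip(w, d)) / sum_w
    q = sum(wi * (di - fixed_mean) ** 2 for wi, di in zip(w, d))

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(d) - 1)) / c)

    # Random-effects step: weights now include tau^2.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mean = sum(wi * di for wi, di in zip(w_re, d)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return mean, mean - Z95 * se, mean + Z95 * se, tau2


# Four (g, lower, upper) rows from the effect-size column of the forest plot.
sample = [
    (-0.21, -0.45, 0.03),
    (2.80, 1.56, 4.03),
    (1.10, 0.67, 1.52),
    (2.04, 1.58, 2.50),
]

mean, lower, upper, tau2 = pool_random_effects(sample)
print(f"pooled g = {mean:.2f} [{lower:.2f}, {upper:.2f}], tau^2 = {tau2:.2f}")
```

Since only four rows are pooled and the nesting of effect sizes within studies is ignored, the result is not expected to reproduce the overall estimate reported in the talk.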
![Page 41: Insights from a multilevel meta-analysis on the ... · Bibauw, François & Desmet, 2015 (EUROCALL Proceedings); Bibauw, François & Desmet, under review. Dialogue -based CALL Typology](https://reader033.fdocuments.in/reader033/viewer/2022042810/5f9ef72253e4451ac83efec7/html5/thumbnails/41.jpg)
Thank you! Merci! Dank u! ¡Gracias!
Serge Bibauw, serge.bibauw@kuleuven.be
Wim Van den Noortgate, wim.vandennoortgate@kuleuven.be
Thomas François, thomas.francois@uclouvain.be
Piet Desmet, piet.desmet@kuleuven.be