LEarning and TEaching Corpora: data-sharing and repository for research on multimodal interactions
Ciara R. Wigham & Thierry Chanier
Clermont Université
LRL: http://lrl.univ-bpclermont.fr/ Publications: http://hal.archives-ouvertes.fr/LRL
PPT: http://edutice.archives-ouvertes.fr/edutice-00778274
4th WorldCALL Conference , 10-13 July 2013, Glasgow
Simuligne (2001): UK-FR; fre
Copéas (2005): UK-FR; eng
Tridem (2005-06): UK-FR-USA; eng, fre
Ecofralin (2008): CO-FR; fre, spa
VMT-teamC (2006): UK-USA-SG; math
INFRAL (2009): DE-FR; deu, fra
FAVI (2006-08): FR; fra
ARCHI21 (2011): FR; eng, fra
SLIC (2013): USA-FR; fra
Data validity & reliability in CALL research?
• Problem in Social Sciences and CALL:
▫ visibility, accessibility and validity of research data
▫ data representative / anecdotal?
▫ no access to data when reading a publication
▫ links between data and publications
CALL data from online learning situations
• CALL data is often:
▫ not contextualised – pedagogical & technological situations (Kern et al., 2004)
▫ tangled in specific software using proprietary formats
• Replication for interaction analysis in online learning is near impossible:
▫ variables that are difficult to control
▫ replication does not imply that a phenomenon previously observed will reoccur (Reffay et al., 2012)
Open space for sharing research data concerning online multimodal interactions
Mulce project 2007-2010 & LETEC
Multimodal Corpora Exchange
Research data quality: Mulce project
• Interoperability:
▫ Structured and coherent data sets => analyses can be completed by researchers who did not participate in the course
• Sustainability:
▫ Independent from online platforms
▫ Stored in independent formalisms
• Open access to research data & appropriate licences
• Accessibility:
▫ Finding the research data through standard metadata – OLAC (Open Language Archives Community)
Learner Corpora / LETEC
• Learner Corpora (see Granger, 2002; Meunier et al., 2011)
▫ SLA research
▫ learners' productions
▫ test situations (Reffay et al., 2008)
▫ learner-native speaker comparative studies (Boulton et al., 2012)
• LEarning and TEaching Corpora
▫ all participants considered (learners, tutors, etc.)
▫ interaction data
▫ context
LETEC Components
Instantiation
Pedagogical scenario
Research protocol
Public licence
Private licence
Analyses
Context
Ethics & rights
"A LETEC corpus collects in a systematic and structured way all the data from interactions which occur during a course which is partially or entirely online. These data are enriched by technical, pedagogical and scientific information as well as information about the participants and are organized to allow contextualized analyses to be performed." (Mulce-documentation, 2013)
Methodology for building a LETEC
Staged process
[Diagram: stages of the process, through to data analyses]
Illustration of methodology
• European project KA2 Languages
• CLIL approach (Content and Language Integrated Learning)
▫ Architecture + French / English L2
• Hybrid course "Building Fragile Spaces": 5-day studio, Feb. 2011
• 17 students, 2 architecture tutors, 1 EFL tutor, 1 FFL tutor
Working with external partners: exchanges
Stage 1: Design
Elaboration of research areas
• Interplay between verbal and nonverbal modes
• Role of nonverbal in identity construction
• Interplay between textchat & voicechat modalities
Support for L2 verbal participation and production
Wigham (2012) – PhD Thesis: http://tel.archives-ouvertes.fr/tel-00762382
Pedagogical Design
• Macro-task – collaboratively elaborate a model in a synthetic world (Second Life) as a response to an architectural problem brief
• Architectural studio, hybrid CLIL approach
• 4 workgroups
Learning design
Online environments
Participants' roles
Learning & support activities
Learning & support activities
Activity | Architecture objectives | L2 objectives
Introduction to Second Life | Introduce students to multimodal nature of SL | Establish a communication protocol
Collaborative building activity | Introduce students to building techniques to help them develop their model | Develop L2 communication techniques concerning the referencing of objects
Group reflective session | Develop critical thinking by negotiation; distinguish pertinent information for overall problem identification in their design brief | Help students to skill-up their L2; acquire domain-specific vocabulary; develop a professional discourse
Detailed in: Rodrigues et al., in press; Wigham & Chanier, 2013
Research protocol
• Research protocol design
▫ Protocol for data collection
▫ Researchers' roles
▫ Timetable of research activities
Wigham & Chanier, 2013 ReCALL
Stage 2: Data Collection
Data collection & coverage
Data collected | Environment | Data type | Quantity & coverage of data | Phase
Pre-questionnaires | Kwiksurveys | Spreadsheet file | 17 student questionnaires | pre-course
Session data | Second Life | Video screen captures | 20 group sessions & 2 presentation sessions, 19h40m | during course
Session data | VoiceForum | Audio recordings | 64 forum messages | during course
Post-questionnaires | Kwiksurveys | Spreadsheet file | 16 student questionnaires | post-course
Semi-directive interviews | Skype | Audio recordings | 5 student interviews, 2h30 | post-course
Stage 3: Data Organisation, diffusion
LETEC global corpus: content packaging
Primary data (anonymised)
Each resource has an ID and a description
Manifest: structured data
Structured Interaction Data Model (Mce_sid, 2011)
XML information about each component of the corpus
General metadata (OLAC standards)
Environments used
Information on participants: language biographies and group organisation
Description of the environment, course length, participants, tools
Activities described in the pedagogical scenario
Corpus deposit
• Mulce corpus repository (Mulce-repository, 2013)
Corpus diffusion
• Description of corpus; interface to browse structure; zip file to download
Stage 4: Transcription, analyses, publications
Multimodal data transcription
[Diagram: verbal mode, nonverbal mode; audio, textchat; proxemic transmission, radio transmission; public, private]
Not detailed here, see Wigham & Chanier (2013), ReCALL 25(1)
Stage 4: Data transcription & diffusion
Elaboration of transcription methodology
• Characterized by communication modes & modalities
▫ Systematic approach to studying online environments
• New environments = new modalities
▫ Added to transcription methodology
Communication mode | Communication modality | Act type and transcription code | Explanation
verbal | audio | audio act (tpa) | verbal turn in the public audio channel
verbal | audio | silence (sil) | interval between two audio acts greater than three seconds
verbal | textchat | textchat act (tpc) | message entered in the textchat window
nonverbal | proxemics | movement (mvt) | avatar movement in the environment, e.g. avatar sits down, flies, walks backwards
nonverbal | proxemics | entrance into / exit from environment (es) | avatar enters or exits the synthetic world
nonverbal | kinesics | kinesic (kin) | avatar gestures and movements made by an avatar's body part, e.g. nod, point, clap
nonverbal | production | production (prod) | production or display of an object in the SL environment
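The silence (sil) definition above, an interval between two audio acts greater than three seconds, can be illustrated with a minimal Python sketch. The `Act` record, its field names and the sample timings are assumptions made for illustration; they are not taken from the Mulce data model or from the ARCHI21 transcriptions.

```python
from dataclasses import dataclass

SILENCE_THRESHOLD_S = 3.0  # "greater than three seconds", as in the sil definition


@dataclass
class Act:
    code: str     # transcription code, e.g. "tpa" or "sil"
    actor: str    # participant identifier (hypothetical field)
    start: float  # seconds from the start of the session
    end: float


def insert_silences(audio_acts, threshold=SILENCE_THRESHOLD_S):
    """Return the audio acts in time order, adding a 'sil' act for each
    gap between consecutive audio acts that is longer than `threshold`."""
    result = []
    latest_end = None
    for act in sorted(audio_acts, key=lambda a: a.start):
        if latest_end is not None and act.start - latest_end > threshold:
            result.append(Act("sil", "", latest_end, act.start))
        result.append(act)
        # Track the latest end time so overlapping turns do not create false gaps.
        latest_end = act.end if latest_end is None else max(latest_end, act.end)
    return result


# Invented sample timings, for illustration only.
acts = [
    Act("tpa", "tutor", 0.0, 4.2),
    Act("tpa", "student1", 5.0, 9.0),
    Act("tpa", "student2", 13.5, 15.0),
]
timeline = insert_silences(acts)
codes = [a.code for a in timeline]
print(codes)
```

Only the 4.5-second gap before the third turn exceeds the threshold, so a single sil act is added there. The 0.8-second gap between the first two turns is left unmarked.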
Multimodal transcription using ELAN
video screen capture
multimodal transcription aligned using timeline
participants & modality
view of annotations for one participant in one modality
Max Planck Institute for Psycholinguistics (2001). ELAN [software]. The Netherlands: Max Planck Institute for Psycholinguistics. [http://www.lat-mpi.eu/tools/elan/]
Stage 4: Data Analyses
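ELAN stores its annotations in an XML file (.eaf), so a time-aligned transcription can be read back with standard tools. The sketch below uses a heavily simplified, hand-written stand-in for an .eaf document; the tier names (one tier per participant and modality) and the annotation text are invented for illustration, and a real file carries more elements and attributes than shown here.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for an .eaf file. Tier names and text are invented.
EAF_SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="4200"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="5000"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="6500"/>
  </TIME_ORDER>
  <TIER TIER_ID="tutor_tpa">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>sample audio turn</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
  <TIER TIER_ID="student1_tpc">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>sample textchat message</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tiers(eaf_xml):
    """Map each tier ID to a list of (start_ms, end_ms, text) tuples."""
    root = ET.fromstring(eaf_xml)
    # Time slots hold the millisecond values that annotations refer to.
    slots = {s.get("TIME_SLOT_ID"): int(s.get("TIME_VALUE"))
             for s in root.iter("TIME_SLOT")}
    tiers = {}
    for tier in root.iter("TIER"):
        rows = []
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots[ann.get("TIME_SLOT_REF1")]
            end = slots[ann.get("TIME_SLOT_REF2")]
            rows.append((start, end, ann.findtext("ANNOTATION_VALUE", default="")))
        tiers[tier.get("TIER_ID")] = rows
    return tiers


tiers = read_tiers(EAF_SAMPLE)
for tier_id, rows in tiers.items():
    print(tier_id, rows)
```

Each tier here plays the role of "one participant in one modality". The sketch assumes every time slot has a TIME_VALUE, which may not hold for unaligned slots in real files.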
Production & deposit of LETEC distinguished corpus
• Particular analysis of a selected part of the global LETEC corpus
Chanier, T., Saddour, I. & Wigham, C.R. (2012). (dir.) Distinguished Corpus: Transcription of Verbal and Nonverbal Interactions of the Second Life Reflection archi21-slrefl-av-j2. Mulce.org : Clermont Université. [oai : mulce.org:mce-archi21-slrefl-av-j2 ; http://repository.mulce.org]
• Only contains transformed data (= the transcriptions)
• Refers to a selection of the original data in global corpus (= videos)
• Software used for transcription cited (= ELAN)
Why does structuring a corpus help analysis?
• Common technical structures to hold interaction data
▫ Data linked
▫ Analyses at different levels, in context whilst maintaining a global view of the course
• XML structure allows standard forms of annotation / coding & different analysis software to be used
▫ Tatiana (2008)
▫ Calico (2009)
Stage 4: Research Analyses
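As a rough illustration of "analyses at different levels" over structured XML, the following Python sketch counts act types per session and then across the whole course from the same document. The element and attribute names (`course`, `session`, `act`, `type`, `actor`) are hypothetical and do not reproduce the Structured Interaction Data Model; the sample acts are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical structure and invented acts, for illustration only.
SAMPLE = """<course>
  <session id="s1">
    <act type="tpa" actor="tutor"/>
    <act type="tpc" actor="learner1"/>
    <act type="tpa" actor="learner1"/>
  </session>
  <session id="s2">
    <act type="tpc" actor="learner2"/>
    <act type="kin" actor="learner2"/>
  </session>
</course>"""

root = ET.fromstring(SAMPLE)

# Session level: one counter of act types per session.
per_session = {
    session.get("id"): Counter(act.get("type") for act in session.iter("act"))
    for session in root.iter("session")
}

# Course level: the same data, aggregated into a global view.
course_wide = sum(per_session.values(), Counter())

print(per_session)
print(course_wide)
```

Because both views come from one structured source, a session-level observation can always be placed back in the context of the whole course.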
An analysis example
• Interplay between textchat & voicechat
• Textchat modality acts as an adjunct to the audio modality
▫ e.g. when technical problems exist, in opening & closing sequences of sessions (Liddicoat, 2011; Palomeque, 2011)
• Monomodal textchat environments – auto-correction, negotiation of meaning and corrective feedback
• Learner overload (Deutschmann & Panichi, 2009)
Multimodal environments? (Hampel & Stickler, 2012)
Can the textchat serve for L2 feedback provision?
Wigham & Chanier (in print) CALL Journal
An example of modality interplay
Characterisation of textchat functions
• Data coding facilitated by XML schemas
Wigham & Chanier (in print) CALL Journal
Feedback in textchat
• 17% of acts contain feedback (49 acts)
• Primarily concerns lexical and grammatical non target-like forms (cf. Tudini, 2003)
• Predominant use of recasts (32/49 instances)
EFL Session | Technical | Socialisation | Conversation management | Task | Form
Es-j3 | 3 | 7 | 9 | 41 | 17
Sc-j2 | 26 | 5 | 7 | 76 | 16
Sc-j3 | 2 | 9 | 4 | 36 | 16
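The figures on this slide can be cross-checked with a few lines of Python. The sketch assumes that the five columns are mutually exclusive function categories and that the "Form" column counts the acts containing feedback; neither assumption is stated on the slide. Under those assumptions the Form column sums to the 49 acts cited, and its share of all coded acts comes out near the reported 17% (the slide does not say how the percentage was rounded or which denominator was used).

```python
COLUMNS = ["Technical", "Socialisation", "Conversation management", "Task", "Form"]

# Counts copied from the table for the three EFL sessions.
COUNTS = {
    "Es-j3": [3, 7, 9, 41, 17],
    "Sc-j2": [26, 5, 7, 76, 16],
    "Sc-j3": [2, 9, 4, 36, 16],
}

# Column totals across the three sessions.
totals = {
    col: sum(row[i] for row in COUNTS.values())
    for i, col in enumerate(COLUMNS)
}
all_acts = sum(totals.values())

# Assumption: "Form" acts are the acts containing feedback.
form_share = totals["Form"] / all_acts

# Recasts as a share of feedback instances (32/49, from the slide).
recast_share = 32 / 49

print(f"Form acts: {totals['Form']} of {all_acts} ({form_share:.1%})")
print(f"Recasts: {recast_share:.1%} of feedback instances")
```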
Results of textchat feedback study
• EFL tutor's strategic choice to use textchat - reduces cognitive load
▫ Non expertise in content matter
• Language form vs communicative meaning
▫ Recasts, as they remain in the textchat window
▫ Recasts, so as not to interrupt content communication
• Students' management of multiple modalities
Publication of analyses & deposit of associated distinguished corpus
• Production of distinguished corpus:
▫ Wigham, C.R. (2013). (dir.) Distinguished Corpus: Interplay between textchat and audio modalities during the Second Life Reflective Sessions. Mulce.org : Clermont Université. [oai : mulce.org:mce-archi21-modality-textchat ; http://repository.mulce.org]
• Analysed data presented in parallel with results
▫ Wigham, C.R. & Chanier, T. (in print). Interactions between text chat and audio modalities for L2 communication and feedback in the synthetic world Second Life. CALL Journal
• Distinguished corpora can be cited in articles
• Explicit connections between data and publications enhance the quality of CALL research
Stage 4: Publication
Conclusion: Sustaining CALL research
• Reuse of data for cumulative or contrastive analyses
▫ Rodrigues & Wigham (in print) – text chat & problematic vocabulary points
▫ Natural language processing techniques
• Facilitated by:
▫ structured XML formalisms render online interaction data autonomous from any platform, in tool-agnostic form
▫ interactions described by modes & modalities -> not specific to an online environment
• Reuse of LETEC in corpus linguistics (TEI-CMC)
Perspectives
• Documented and selected materials in their original context – basis for reflection in pedagogical corpora
• Integration of pedagogical corpora into teacher-training classrooms
Contact: [email protected]
Website: http://lrl.univ-bpclermont.fr/
Mulce-documentation: http://mulce.org
Mulce-repository: http://repository.mulce.org
Thank you!
Corpus metadata
• Inform researchers about:
▫ conditions under which the corpus was built
▫ how to use the corpus
▫ the corpus' content
▫ licences for re-using the corpus
• Used for web harvesting
▫ corpus becomes visible to whole community (OLAC, Clarin)
▫ corpus can be cited
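To give a feel for harvestable metadata, here is a minimal Python sketch that serialises a few descriptive fields as Dublin Core elements, the element set that OLAC metadata builds on. The wrapper element `record` and the helper `build_record` are hypothetical, the field values are copied from the distinguished-corpus citation in this presentation, and the result is not a complete or validated OLAC record.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace
ET.register_namespace("dc", DC_NS)


def build_record(fields):
    """Build a small XML record with one Dublin Core element per field.
    'record' is a hypothetical wrapper element, not an OLAC element."""
    record = ET.Element("record")
    for name, value in fields:
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record


record = build_record([
    ("title", "Distinguished Corpus: Interplay between textchat and audio "
              "modalities during the Second Life Reflective Sessions"),
    ("date", "2013"),
    ("publisher", "Mulce.org : Clermont Université"),
    ("identifier", "oai:mulce.org:mce-archi21-modality-textchat"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A harvester that understands Dublin Core can index such fields without knowing anything about the corpus's internal structure, which is what makes the corpus findable and citable.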
Data coverage
• 6 sessions (3 FFL, 3 EFL)
• 4h30m of screen recordings
Groups analysed | Audio acts | Textchat acts
EFL | 450 | 423
FFL | 386 | 64
[Chart: number of tpa acts and number of tpc acts per session (GS-j2, GS-j3, GE-j3, GL-j3, GA-j2, GA-j3); axis from 0 to 300]
Perspectives
• Documented and selected materials in their original context – basis for reflection
• Inter-disciplinary project