A new corpus for Spanish Second Language Acquisition Research

L. Dominguez, R. Mitchell, M. J. Arche (U. of Southampton), E. Marsden (U. of York), F. Myles (Newcastle U.)


Transcript of A new corpus for Spanish Second Language Acquisition Research

Page 1: A new corpus for Spanish Second Language Acquisition Research

A new corpus for Spanish Second Language Acquisition Research

L. Dominguez, R. Mitchell, M. J. Arche (U. of Southampton), E. Marsden (U. of York), F. Myles (Newcastle U.)

Page 2: A new corpus for Spanish Second Language Acquisition Research

A corpus for L2 Acquisition

SLA theory aims to understand the complex mechanisms and conditions behind learner grammars

Access to good quality data is crucial: learner production data + focused comprehension tasks

Increasing interest in the creation of electronic learner corpora:
– sharing data more easily
– automatising some aspects of data analysis through the use of software such as concordancers, part-of-speech taggers, etc.

Page 3: A new corpus for Spanish Second Language Acquisition Research

Some Existing Learner Corpora

– CHILDES: http://childes.psy.cmu.edu/
– TALKBANK: http://talkbank.org/
– English Corpus Linguistics: http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm
– L2 FRENCH FLLOC: www.flloc.soton.ac.uk/
– L2 (Written) SPANISH CEDEL 2: www.ugr.es/~cristoballozano/cedel2.htm

Page 4: A new corpus for Spanish Second Language Acquisition Research

SPLLOC “Spanish Language Learner Oral Corpus”

2-year ESRC-funded corpus project investigating the development of L2 Spanish

Aims:
– a small-scale, high-quality cross-sectional database of spoken learner Spanish
– topics being investigated lie at the syntax/discourse interface

Data collected:
- c. 40 hours of audio recordings (native/non-native)
- 80 written focused tests on word order
- 60 computer-based tests on clitic comprehension

95% transcribed to date!

Page 5: A new corpus for Spanish Second Language Acquisition Research

Immediate Research Agenda

Syntax/discourse interface as conceptualised in generative linguistics, including:
– The acquisition of Spanish word order
– Clitic pronouns

Verbal morphology

Development of the L2 lexicon

Page 6: A new corpus for Spanish Second Language Acquisition Research

Corpus Design

Balance of spontaneous and focused data (semi-spontaneous oral tasks are complemented by focused judgement and production tasks)

Balance of genres (semi-spontaneous oral tasks include interview, narrative and discussion)

Balance of participants (20 L2 speakers from each of beginner, intermediate and advanced levels + native speakers)

Flexibility of computer-aided analysis (use of the CHILDES system, plus an XML version)

Free web access to all materials (anonymised sound files, transcripts, analysis files) for all bona fide research users.

Page 7: A new corpus for Spanish Second Language Acquisition Research

Summary of tasks by type, elicitation method and genre

Task columns: Modern Times, Loch Ness, Photos, Paired Discussion, Clitic Production, Clitic Comprehension, Word Order

Task Type | Elicitation | Genre | Tasks ticked
SEMI-SPONTANEOUS | Oral | Narrative | √ √ √
SEMI-SPONTANEOUS | Oral | Interview | √
SEMI-SPONTANEOUS | Oral | Discussion | √
FOCUSED TASKS | Oral | Production | √
FOCUSED TASKS | Computer based | Comprehension | √
FOCUSED TASKS | Paper | Written | √

Page 8: A new corpus for Spanish Second Language Acquisition Research

Some task samples

Page 9: A new corpus for Spanish Second Language Acquisition Research

Illustrations by Alex Brychta for “A Monster Mistake” by Roderick Hunt (Oxford Reading Tree, 2003) used by permission of Oxford University Press.

Loch Ness

Page 10: A new corpus for Spanish Second Language Acquisition Research
Page 11: A new corpus for Spanish Second Language Acquisition Research

Modern Times

Page 12: A new corpus for Spanish Second Language Acquisition Research

Photos task

Description of states

And

Description of events

Page 13: A new corpus for Spanish Second Language Acquisition Research

Clitic Comprehension (computer based)

The learner hears a sentence with a clitic pronoun and has to click on the object it refers to.

32 screens: combination of number and gender (canonical and non-canonical) plus syntactic collocation.
• Canonical feminine: -a ending (e.g. calculadora ‘calculator’)
• Canonical masculine: -o ending (e.g. teléfono ‘phone’)
• Non-canonical: no -a/-o ending (e.g. lápiz ‘pencil’)
• Collocation: proclitic (as in conjugated verbs) vs. enclitic (as in infinitives).
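One way to read the design above is as a full crossing of number, gender class and clitic position. The sketch below enumerates such a crossing. The slide does not say how the 32 screens are distributed across conditions: splitting the non-canonical nouns by gender and using two items per cell are assumptions, chosen only because they happen to give 32.

```python
from itertools import product

# Factor levels named on the slide. Splitting the non-canonical nouns by
# gender is an assumption made here, not something the slide states.
NUMBERS = ["singular", "plural"]
GENDER_CLASSES = [
    "canonical feminine (-a)",
    "canonical masculine (-o)",
    "non-canonical feminine",   # assumed split
    "non-canonical masculine",  # assumed split
]
POSITIONS = ["proclitic", "enclitic"]

# Hypothetical value, chosen only so that 16 cells give 32 screens.
ITEMS_PER_CELL = 2


def design_cells():
    """Return every (number, gender class, clitic position) combination."""
    return list(product(NUMBERS, GENDER_CLASSES, POSITIONS))


cells = design_cells()
print(len(cells), "cells;", len(cells) * ITEMS_PER_CELL, "screens")
```

Under these assumptions the crossing has 16 cells, so two items per cell would account for the 32 screens; other distributions are equally possible.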

Page 14: A new corpus for Spanish Second Language Acquisition Research

Clitic Production (computer based)

The learner is asked a question referring to an object based on the sequence of pictures shown.

32 slides; combination of number and gender (canonical and non-canonical) plus syntactic collocation.

Page 15: A new corpus for Spanish Second Language Acquisition Research

Word Order Task (paper & pencil)

Context-dependent word order preference test

• The learner is presented with 28 situations, each followed by a question
• Two types of questions: What happened? (Broad focus) Who did x? (Narrow focus)
• 4 items by 7 syntactic contexts: 4xSVO, 4xVOS, 4xCLLD, 4xUnerg/Narrow, 4xUnerg/Broad, 4xUnacc/Narrow and 4xUnacc/Broad
• Three options: Inverted (VS), non-inverted (SV) and both.

1. You get home and your brother just tells you that he has got an email from your friend Sue and that he has very good news to tell you. You ask your brother “¿Qué ha pasado?” (What happened?)

What could he say?
a. Se ha comprado un coche Sue (Sue has bought a car)
b. Sue se ha comprado un coche (Sue has bought a car)
c. Both sentences

2. Your brother is having some friends over for a get-together at home. When your mother comes she sees some smoke coming out of the bathroom and she asks your brother: “¿Quién está fumando?” (Who’s smoking?)

What could your brother say?
a. Oscar está fumando (Oscar is smoking)
b. Está fumando Oscar (Oscar is smoking)
c. Both sentences
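Responses to a preference test of this kind could be summarised per syntactic context. The following is a minimal sketch, not the project's own scoring procedure: the (context, option) data layout and the sample answers are invented for illustration, while the context labels and the three options come from the slide.

```python
from collections import Counter, defaultdict

# The seven syntactic contexts and three response options listed on the slide.
CONTEXTS = ["SVO", "VOS", "CLLD", "Unerg/Narrow", "Unerg/Broad",
            "Unacc/Narrow", "Unacc/Broad"]
OPTIONS = ("VS", "SV", "both")
ITEMS_PER_CONTEXT = 4


def tally(responses):
    """Count chosen options per context.

    `responses` is an iterable of (context, option) pairs, one per item
    answered by a learner (a hypothetical data layout).
    """
    counts = defaultdict(Counter)
    for context, option in responses:
        if context not in CONTEXTS or option not in OPTIONS:
            raise ValueError(f"unexpected response: {context!r}, {option!r}")
        counts[context][option] += 1
    return counts


def proportions(counts):
    """Convert per-context counts into the proportion of each option."""
    result = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        result[context] = {opt: counter[opt] / total for opt in OPTIONS}
    return result


# Made-up answers for one learner on two contexts, for illustration only.
sample = ([("Unacc/Broad", "VS")] * 3 + [("Unacc/Broad", "SV")]
          + [("Unerg/Broad", "SV")] * 4)

print(len(CONTEXTS) * ITEMS_PER_CONTEXT)  # 28 situations in total
print(proportions(tally(sample))["Unacc/Broad"])
```

Keeping "both" as its own category, rather than folding it into VS or SV, preserves the optionality information the task is designed to elicit.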

Page 16: A new corpus for Spanish Second Language Acquisition Research

Summary of subjects by task (to date)

Columns: Task Type | Task Name | University (Final Year) | Sixth Form College (Year 13) | Lower Secondary School (Year 9) | Natives (all ages)

Open-ended
Modern Times: 20, 5
Loch Ness: 20, 20, 20, 15
Photos: 20, 20, 20, 15
Paired Discussion: 20, 20, 5

Focused
Clitic Comprehension: 20, 20, 20, 3
Picture Sequence: 20, 20, 20, 10
Word Order: 20, 20, 19, 20

Page 17: A new corpus for Spanish Second Language Acquisition Research

Tools for Data Analysis

CHILDES (The Child Language Data Exchange System)
– CLAN = Computerised Language Analysis
  Computer program suite for transcribing, searching and analysing language data
– CHAT = Codes for the Human Analysis of Transcripts
  A format for notation and transcription

Types of Analyses:
– FREQ, MLU, COMBO, KWAL
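To give a feel for what these analyses compute, here is a rough Python sketch over a tiny invented transcript in a simplified CHAT-like layout. It is not CLAN and does not reproduce its options: the function names are hypothetical, the MLU here counts words rather than morphemes, and COMBO (pattern-combination search) is not covered.

```python
import re
from collections import Counter

# A tiny invented transcript in simplified CHAT layout: "@" header lines,
# "*SPK:" main tiers (tab after the colon) and "%" dependent tiers.
SAMPLE = """@Begin
@Participants:\tL01 Learner, INV Investigator
*INV:\t¿qué pasa en la historia ?
*L01:\tla niña ve el monstruo .
%com:\tpoints at picture
*L01:\tel monstruo es grande .
@End"""

MAIN_TIER = re.compile(r"^\*(\w+):\t(.*)$")


def utterances(chat_text, speaker):
    """Yield each utterance of `speaker` as a list of lower-cased words."""
    for line in chat_text.splitlines():
        match = MAIN_TIER.match(line)
        if match and match.group(1) == speaker:
            # Keep letter-only tokens; this drops utterance terminators.
            words = re.findall(r"[^\W\d_]+", match.group(2).lower())
            if words:
                yield words


def freq(chat_text, speaker):
    """Word frequency counts, in the spirit of FREQ."""
    return Counter(w for utt in utterances(chat_text, speaker) for w in utt)


def mlu_words(chat_text, speaker):
    """Mean length of utterance, counted in words rather than morphemes."""
    utts = list(utterances(chat_text, speaker))
    return sum(len(u) for u in utts) / len(utts) if utts else 0.0


def kwal(chat_text, speaker, keyword):
    """Utterances containing `keyword`, in the spirit of KWAL."""
    return [" ".join(u) for u in utterances(chat_text, speaker) if keyword in u]


print(freq(SAMPLE, "L01").most_common(2))
print(mlu_words(SAMPLE, "L01"))
print(kwal(SAMPLE, "L01", "monstruo"))
```

Restricting each function to one speaker tier mirrors the usual need to analyse learner and investigator speech separately.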

Page 18: A new corpus for Spanish Second Language Acquisition Research

Next Steps

Database will be available for use by the research community via www.splloc.soton.ac.uk (in spring 2008)

Articles & conference papers (in 2007):
– BAAL LLT SIG
– GALA
– BUCLD
– HLS
– SLRF

CHILDES training workshop:
– 25 January 2008, University of Southampton.

Page 19: A new corpus for Spanish Second Language Acquisition Research

Acknowledgments

The SPLLOC project is supported by an ESRC research grant (RES 000231609)

We would like to thank all the participants in the project, including subjects, transcribers and fieldworkers

Page 20: A new corpus for Spanish Second Language Acquisition Research

References

Domínguez, L., Arche, M.J. 2007a. “Deviant optional forms in L2 Spanish: the case of word order variation”. Poster presentation at GALA, Barcelona, 6-8 September.

Domínguez, L., Arche, M.J. 2007b. “Optionality in L2 grammars: the acquisition of SV/VS contrast in Spanish”. To be presented at BUCLD 32, Boston, 1-4 November.

Domínguez, L., Arche, M.J. 2007c. “The L2 Acquisition of SV/VS contrast in Spanish”. To be presented at the Hispanic Linguistic Symposium, Texas, 1-4 November.

Domínguez, L., Arche, M.J., Mitchell, R., Marsden, E. and Myles, F. 2007. “Innovations in Spanish SLA research methodology: introducing the ‘Spanish Learner Language Oral Corpus’”. To be presented at the Hispanic Linguistic Symposium, Texas, 1-4 November.

Granger, S., J. Hung and S. Petch-Tyson (eds.). 2002. Computer Learner Corpora, second language acquisition and foreign language teaching. Amsterdam: John Benjamins.

Lozano, C. & Mendikoetxea, A. (in press). Verb-Subject order in L2 English: new evidence from the ICLE corpus. In: Actas del XXV Congreso Internacional de AESLA. Universidad de Murcia.

Lozano, C. & Mendikoetxea, A. (forthcoming 2007). Postverbal subjects at the interfaces in Spanish and Italian learners of L2 English: a corpus analysis. In: Papp, S., Díez, B. and Gilquin, G. (eds). Linking up contrastive and corpus learner research. Rodopi.

Mitchell, R., Marsden, E., Domínguez, L., Arche, M. J. and Myles, F. 2007. “Creation and analysis of a Spanish language learner oral corpus (SPLLOC)”. Poster presentation at BAAL LLT SIG Conference “Towards a Researched Pedagogy”, University of Lancaster, 2-3 July.

Mitchell, R., Dominguez, L., Arche, M.J., Myles, F. and Marsden, E. “Developing a CHILDES-based corpus of L2 oral Spanish”. To be presented at Second Language Research Forum, Urbana-Champaign, 11-14 October.

Myles, F. 2002. Linguistic development in classroom learners of French: a cross-sectional study (No. End of ESRC award report R000223421). Southampton: University of Southampton.

Myles, F. 2005. Interlanguage corpora and second language acquisition research. Second Language Research, 21,4: 373-391.