E-1 Ai Haiyang
7/31/2019 E-1 Ai Haiyang
1/28
A Corpus-based Study of Connectors: Research
from the CAS Learner Corpus of English Essays
Haiyang Ai, Gong Peng
Graduate University, Chinese Academy of Sciences
-
Outline of the talk
Introduction
Previous Studies
Methodology and Corpus Building
Results and Discussion
Conclusion and Pedagogical Implication
-
Definition of connectors
Connectors are devices used to state the
relationship between units of discourse
(Biber et al., 1999)
They include conjunctions, some adverbs (e.g.
firstly, namely, alternatively), and some
prepositional phrases (e.g. in brief, in fact,
of course)
-
Classification of connectors
Quirk et al.'s (1985) framework
A Comprehensive Grammar of the
English Language
Addition of a corroborative category
- (Granger & Tyson, 1996)
- (Altenberg & Tapper, 1998)
-
Quirk et al.'s (1985) framework
listing
  enumerative e.g. for a start, finally
  additive
    equative e.g. in the same way, likewise
    reinforcing e.g. moreover, further
summative e.g. in sum, altogether
appositive e.g. for example, namely
resultive e.g. as a result, consequently
inferential e.g. therefore, in that case, otherwise
contrastive
  reformulatory e.g. more precisely, rather
  replacive e.g. better, again
  antithetic e.g. by contrast, instead
  concessive e.g. in any case, however
transitional
  discoursal e.g. by the way, incidentally
  temporal e.g. in the meantime, meanwhile
-
Connectors investigated (68 items)
Listing: first, second, third, firstly, secondly, thirdly, finally, furthermore, in addition, moreover, lastly, last but not least, to begin with, for another, in the first place, in the second place, similarly, for one thing, for another
Summative: to sum up, to conclude, in summary, in short, in brief, in conclusion, overall, all in all, altogether
Appositive: that is, that is to say, in other words, for instance, for example, namely, e.g. (eg), i.e. (ie)
-
Connectors investigated (68 items)
Resultive: consequently, hence, therefore, thus, as a result, as a consequence, in consequence
Inferential: otherwise, in that case
Contrastive: however, although, (even) though, on the other hand, instead, after all, on the contrary, in contrast, besides, nevertheless, anyway, still, by contrast, nonetheless, alternatively
Transitional: meanwhile, eventually, subsequently, originally
Corroborative: actually, in fact, of course, indeed, apparently
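As a rough illustration of how items from such a list can be counted automatically, here is a minimal Python sketch. It is not the procedure used in the study, which relied on WordSmith Tools plus manual extraction. The dictionary holds only a small subset of the 68 items, and the names `CONNECTORS` and `count_connectors` are hypothetical.

```python
import re
from collections import Counter

# Illustrative subset of the connectors listed on the slides, by category.
CONNECTORS = {
    "listing": ["firstly", "furthermore", "in addition", "last but not least"],
    "resultive": ["therefore", "as a result"],
    "contrastive": ["however", "on the other hand"],
}

def count_connectors(text):
    """Count case-insensitive whole-word/phrase matches for each connector."""
    counts = Counter()
    for category, items in CONNECTORS.items():
        for item in items:
            # \b keeps "however" from matching inside longer words.
            pattern = r"\b" + re.escape(item) + r"\b"
            hits = len(re.findall(pattern, text, flags=re.IGNORECASE))
            if hits:
                counts[(category, item)] = hits
    return counts

sample = (
    "Firstly, costs are high. However, demand is strong. "
    "Therefore, prices rise. However, this may change."
)
counts = count_connectors(sample)
for (category, item), n in sorted(counts.items()):
    print(category, item, n)
```

String matching alone cannot tell a connector from a homograph (for example "still" or "first" used non-connectively), which is one reason a manual check of the concordance lines remains necessary.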
-
Rationales to use corpus data
Corpus data are real and authentic => empirical study
Combines the intuitions of many, more
objective (McEnery & Wilson, 2001)
Corpora are precious resources for testing out linguistic hypotheses (Meyer, 2002)
Learner corpora serve as the meeting point of corpus linguistics and SLA (Granger 1998) => pioneer: Sylviane Granger, ICLE
-
Research questions
What is the semantic distribution?
What are the top 10 most frequently used
connectors?
Which connectors are overused?
What are the differences and similarities
compared with related studies, and why (universal features vs. transfer-related)?
-
Hypothesis
Hypothesis: PhD students at GUCAS would overuse
connectors in their English writing
Formulated based on
Previous studies from HK and Taiwan (Crewe 1990, Field & Yip 1992, Milton & Tsang 1993, Bolton et al. 2002, Chen 2006)
The authors' own observation
-
Significance
Systematic, corpus-based connector studies of PhD students' writing at GUCAS => shed some light on the everlasting cohesion & coherence
problems in ESL/EFL writing
Quantitative analysis can give teachers (esp. at GUCAS) a better idea of what needs to be done
The construction of the CASCLEE computer learner corpus itself (Resources)
-
Outline again
Approaching Connectors
Previous Studies
Methodology and Corpus Building
Results and Discussion
Conclusion and Pedagogical Implication
-
Previous corpus-based studies
Milton & Tsang (1993)
high ratio of overuse of the entire range of connectors (HKUST vs. Brown, LOB)
Granger & Tyson (1996)
108 connectors, CIA method
overuse
-
Previous corpus-based studies
Bolton et al. (2002)
Overuse exists in both groups, ICE-HK vs. ICE-GB
Raised 3 methodological issues
Chen (2006)
Most recent study, published in IJCL, on Taiwanese EFL learners
Slightly overused connectors
Increase learners' register differences
-
Outline
Introduction
Previous Studies
Methodology and Corpus Building
Results and Discussion
Conclusion and Pedagogical Implication
-
Corpus building
Corpus name: CASCLEE - CAS Corpus of Learner English Essays
Corpus size: 494 essays, 120,836 words, covering timed and untimed writing
Data analysis: WordSmith Tools 4.0 + manual extraction
Sampling & Representativeness
Learner Background & Register of text
-
Method: CIA
Contrastive interlanguage analysis (Granger 1996)
L2 vs. L1
L2 vs. L2
Reference corpora
Informative Writings of BNC Sampler Corpus (L1)
The ICLE French Subcorpus (L2)
-
Outline
Introduction
Previous Studies
Methodology and Corpus Building
Results and Discussion
Conclusion and Pedagogical Implication
-
Overall frequencies (normalised)
[Figure: Overall Connector Usage in the Three Corpora, per 10,000 words. Bar values in label order: CASCLEE 131.9; BNC Sampler-Informative 46.7; ICLE-French 99.5]
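The normalisation behind a "per 10,000 words" figure is raw count / corpus size x 10,000. The sketch below shows that arithmetic under stated assumptions: the raw connector counts are not given in the talk, so the count is only back-calculated from the rounded CASCLEE value (taken here to be 131.9) and the reported corpus size of 120,836 words. The function names are hypothetical.

```python
def normalised_frequency(raw_count, corpus_words, base=10_000):
    """Occurrences per `base` running words."""
    return raw_count / corpus_words * base

def implied_raw_count(nf, corpus_words, base=10_000):
    """Approximate raw count implied by a (rounded) normalised frequency."""
    return nf * corpus_words / base

CASCLEE_WORDS = 120_836  # corpus size reported on the corpus-building slide
CASCLEE_NF = 131.9       # connectors per 10,000 words (assumed to be the CASCLEE bar)

# Back-calculated estimate only; the talk does not report raw counts.
approx_raw = implied_raw_count(CASCLEE_NF, CASCLEE_WORDS)
round_trip = normalised_frequency(round(approx_raw), CASCLEE_WORDS)

print(round(approx_raw), round(round_trip, 1))
```

Normalising in this way is what makes corpora of very different sizes comparable on the same axis.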
-
Semantic distribution
Semantic Distribution of Connectors in the Three Corpora
(normalised frequency, NF, per 100,000 words)

Category        CASCLEE   BNC Sampler-Informative   ICLE-French
listing         577.6     44.7                      116.1
summative       77.8      4.7                       28.4
appositional    116.7     53.6                      196.5
resultive       84.4      75.0                      137.2
inferential     18.2      8.1                       12.5
contrastive     322.8     192.8                     264.7
transitional    11.6      25.0                      14.2
corroborative   110.1     62.7                      225.9
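A quick consistency check on the chart values can be sketched as follows. It rests on one assumption: the 24 data labels are read as three runs of eight (CASCLEE, then BNC Sampler-Informative, then ICLE-French), in the category order of the axis. Under that reading, each corpus's category values sum to its overall frequency (131.9, 46.7 and 99.5 per 10,000 words) to within rounding. The ratio computed here is only a descriptive comparison, not a significance test and not necessarily the overuse measure used in the study.

```python
CATEGORIES = [
    "listing", "summative", "appositional", "resultive",
    "inferential", "contrastive", "transitional", "corroborative",
]

# Normalised frequencies per 100,000 words; the assignment of values to
# corpora is an assumption based on the order of the chart's data labels.
NF = {
    "CASCLEE": [577.6, 77.8, 116.7, 84.4, 18.2, 322.8, 11.6, 110.1],
    "BNC Sampler-Informative": [44.7, 4.7, 53.6, 75.0, 8.1, 192.8, 25.0, 62.7],
    "ICLE-French": [116.1, 28.4, 196.5, 137.2, 12.5, 264.7, 14.2, 225.9],
}

# Category values are per 100,000 words; divide by 10 for per 10,000.
totals_per_10k = {corpus: sum(values) / 10 for corpus, values in NF.items()}

# Learner-to-native ratio per category (above 1 = more frequent in CASCLEE).
ratio_vs_bnc = {
    cat: learner / native
    for cat, learner, native in zip(
        CATEGORIES, NF["CASCLEE"], NF["BNC Sampler-Informative"]
    )
}

for corpus, total in totals_per_10k.items():
    print(f"{corpus}: {total:.2f} per 10,000 words")
for cat, r in sorted(ratio_vs_bnc.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{cat}: {r:.2f}")
```

On this reading, summative and listing connectors show the largest ratios against the native reference corpus, and transitional connectors are the only category with a ratio below 1.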
-
Quantitative difference: Overuse
Overused connectors
Group A (see Table 4)
Group B (see Table 5)
-
Comparing with related studies
Altenberg & Tapper (1998)
Overuse of furthermore, for instance, still, of course (CASCLEE also)
Bolton et al. (2002)
Overuse exists in both ICE-HK & ICE-GB
Chen (2006)
Slightly overused
-
Major findings
PhD students overused a whole range of connectors (hypothesis supported)
They significantly overused listing and summative connectors
Overuse of connectors exists in both CASCLEE and the ICLE French subcorpus
-
Outline
Introduction
Previous Studies
Methodology and Corpus Building
Results and Discussion
Conclusion and Pedagogical Implication
-
Conclusion
Objectives and contributions
Built the CASCLEE learner corpus
Analyzed connectors based on Quirk et al.'s (1985) framework
Methodology: contrastive interlanguage analysis
L2 vs. L1 (CASCLEE vs. BNC Sampler-Informative)
L2 vs. L2 (CASCLEE vs. ICLE-French)
-
Pedagogical Implication
Pedagogical implication
Focus on contrastive, resultive and appositional connectors, over 70%
Listing connectors should be addressed
Correct forms of connectors
Looking forward
More large-scale, corpus-based studies on EFL
learners' connector usage
Probe into the possible causes of certain connector usage patterns
-
The End !