Japan Society of English Language Education
NII-Electronic Library Service
Foreign Language Grammatical Carefulness Scale:
Scale Development and Its Initial Validation
Kunihiro KUSANAGI
Junya FUKUTA
Graduate School, Nagoya University; Japan Society for the Promotion of Science
Yusaku KAWAGUCHI
Yu TAMURA
Aki GOTO
Akari KURITA
Daisuke MUROTA
Graduate School, Nagoya University
Abstract
This study aimed to develop and validate a scale to measure the Grammatical Carefulness
(GC) of foreign language learners. GC, by its definition, refers to psychological, behavioral, and
meta-cognitive traits of a learner, and it entails highly controlled, cautious, analytical, and
time-consuming language use. By conducting a set of questionnaire surveys targeting Japanese
junior high school, high school, and university students (N = 2,288), a Foreign Language
Grammatical Carefulness Scale (FLGCS) with 14 items, written in Japanese, was developed and
tested for its factorial structure, reliability, and convergent, content, and criterion validity. The results
demonstrated that FLGCS yields three factors: (a) phonological, (b) lexical-syntactic, and (c)
pragmatic carefulness, with a high reliability for each. The factorial validity was also supported by
using both exploratory and confirmatory factor analyses. Further, a set of analyses confirmed
various types of validity. The evidence for the validity is as follows: (a) the linguistic experts (n =
10) consistently judged that all the items properly referred to each factor in an appropriate
linguistic sense, (b) FLGCS showed correlations with learner beliefs, consistent with theoretical
expectations, and (c) FLGCS correlated with the scores of a C-test, and with the time to finish the
C-test. The applicability of FLGCS in EFL teaching and research will also be discussed.
1. Background
Undoubtedly, grammatical performance of a second/foreign language shows a relatively large
variance among learners in comparison with that of their first language. Researchers in
second/foreign language learning and teaching have thus attempted for a long time to discover
which factors explain this large variance in grammatical performance among individuals. Second
language acquisition (SLA) theories, for instance, have offered various frameworks relating to the
development of grammatical performance (e.g., Segalowitz & Segalowitz, 1993). Task-related
factors serve as another important contributor to the variance (e.g., Tarone, 1985). Also,
behavioral and psychological traits within individuals, such as aptitudes, attitudes, motivation,
beliefs, and anxiety, are yet more important factors. It is self-evident that such factors interact in a
complex manner to predict one's grammatical performance, and they may jointly affect the
acquisition of a second/foreign language.
It is a considerable challenge to cover all of these factors using just a single framework.
However, a concept which has been commonly adopted in the field of cognitive psychology and
psychological measurement may capture the inter-learner variance of
grammatical performance; that is, the Speed-Accuracy Tradeoff (SAT). SAT generally proposes that
task performance, with regard to various aspects, shows a very similar pattern, whereby faster
actions result in lower accuracy, while slower actions have a higher accuracy (e.g., Dennis &
Evans, 1996; Goldhammer & Kroehne, 2014; van der Linden, 2007). See plot (a) in Figure 1,
which graphically represents this concept. Taking grammatical
performance as an example, if a task is speeded up or timed, or the test taker is in a hurry, they
may exhibit reduced accuracy. On the other hand, if the person doing the task can take enough
time to accomplish the task, he/she can plan and monitor his/her language use deliberately. This
basically leads to higher accuracy. This tendency of SAT may be common in many aspects of
language use and language assessment.
Theoretically, the SAT pattern exhibits functions as in (a) of Figure 1. However, it is
hypothesized that, in parallel to skill development, the functions for SAT will also change, as in
(b) in Figure 1. The changes of the functions may correspond to some of the SLA theories, such as
skill-acquisition theory or automatization (e.g., Segalowitz & Segalowitz, 1993).
[Figure 1: three schematic panels, (a), (b), and (c), each with Speed on the horizontal axis.]
Figure 1. Schematic plots of the concepts in SAT. Plot (a) shows the basic tradeoff
pattern. Plot (b) explains the changes of the functions caused by development. Plot (c)
shows the inter-learner variance of the compromise points.
There is yet another important viewpoint in this framework. Irrespective of development,
inter-learner variance still remains. This can be captured by considering the compromise point of
the SAT functions of individuals. Assume that one person, at least in a specific situation, tends to
prioritize accuracy, and another opts not for accuracy but for speed. A part of such variance
in compromise points among individuals can be determined by their psychological and
behavioral traits. This may be responsible for the rest of the variance, as in (c) in Figure 1.
The present study calls this hypothetical trait Grammatical Carefulness (GC), which we will
view as a construct in a psychometric sense.¹ The next section will introduce the concept of GC
and review some of the relevant studies in the literature. This article will report the development
and initial validation of a scale to measure this new construct, GC, in the later sections.
2. Grammatical Carefulness in Foreign Language
2.1 Definition of Grammatical Carefulness
GC in foreign language refers to a behavioral, psychological, and meta-cognitive trait of
individuals which is characterized by the following: (a) it entails highly cautious, careful,
deliberate, intentional, and analytical language use, (b) it promotes relatively slow,
time-consuming, and cognitively demanding language use and learning, which leads to a higher
accuracy of learners with some grammatical tasks, and (c) it links in complex ways to other inter-learner
variables, such as aptitudes, attitudes, motivation, beliefs, and anxiety.
The SAT framework regards GC as a moderator of the compromise points. In other words,
it is hypothesized that someone with a higher GC tends to achieve higher accuracy at the expense
of speed, and another person with a lower GC tends to perform speedily and less accurately.
2.2 Grammatical Performance and Inter-Learner Variables in the Literature
A couple of previous studies attempted to reveal the relationships between inter-learner
variables and grammatical performance. For instance, Krashen (1978), in his early theoretical
work, suggested that there are two types of second language learners: monitor under-users and
monitor over-users (see also Seliger, 1980). Kormos (1999) extended this idea, and empirically
investigated the effects of the two different speaking styles of individuals (accuracy-centered and
fluency-centered) on their self-correction behaviors by observing L1-Hungarian English learners'
speech production and questionnaire answers. Kormos looked at the interplay between the
speaking styles and the frequency of self-correction behavior; the accuracy-centered participants
showed higher frequencies of self-correction behavior than those with a fluency-centered style.
One other case is a recent classroom-based study conducted by Kartchava and Ammar
(2014), which investigated the effect of learner beliefs about corrective feedback on noticing
behaviors and learning outcomes. They reported that some beliefs mediated the frequencies of
noticing behaviors, but not the learning outcomes.
These studies focused only on very specific behaviors and situations of learners'
performance; for instance, Kormos (1999) is concerned with speaking, especially self-correction, while
Kartchava and Ammar (2014) are concerned with noticing. The studies picked up
very limited learner traits (speaking styles and beliefs about corrective feedback). More critically,
some studies did not treat what they measured as constructs. For instance, Kormos' study only
used five questionnaire items for determining the learners' speaking styles, the reliability
and validity of which remained unclear. Kartchava and Ammar's study, on the other hand, reported the
reliability and factorial structures, but further validation would of course be desirable.
The present study takes a broader view, with some methodological sophistication, regarding
the relationships between grammatical performance and inter-learner variables. GC is a trait of
individuals which directly links to grammatical performance in general, unlike beliefs regarding
some specific behaviors. GC also has value with regard to its relationships with other types of
inter-learner variables, such as beliefs. In a later section, for the initial validation of the developed
scale, we will report that the scales of GC are actually correlated with a part of grammatical
performance, and we show the theoretically plausible relationships with certain types of beliefs.
2.3 Significance of GC in Theory and Practice
Establishing GC as a psychological construct and developing its reliable measures would be
a promising way to shed light on various fields of study in the future. For instance, in
psycholinguistic experiments, controlling and establishing the variables
of SAT-related inter-learner variance, such as GC, will be both theoretically and methodologically
important.
In classroom-based studies, GC can be applied to measure moderators of the outcome of
students' learning directly. Also, the impacts on GC of certain teaching methods or treatments
would be interesting research topics. More specifically in practice, understanding students' traits,
such as GC, can provide much information to teachers in the context of curriculum design, choice
of teaching materials, and everyday teaching practice. A reliable, validated, and also easy-to-use
psychological scale is hence strongly desired.
3. Scale Development
3.1 The Preliminary Survey
In order to develop a psychological scale to measure GC, we conducted two questionnaire
surveys. The main purpose of the first (preliminary) survey was the initial item selection for the
scale. In total, 169 students in two private universities participated. All of the participants were
first year students who took English classes. Their academic majors included economics and
education. The survey was carried out at the beginning of April, 2014. The participants answered
the questionnaire during their English classes. The questionnaire consisted of (a) a face sheet, (b)
GC items written in Japanese (k = 40), as detailed below, and (c) anchor scales (k = 14, Tanaka &
Ellis, 2003). All items were answered on a seven-point Likert scale, from one to seven points.
The initial item pool (k = 40) was created by the authors. By referring to learners'
retrospective data about language use in the literature, the authors composed the items in
consultation with each other. All the items in the initial item pool are available online at the first
author's website (see Appendix).
All the data were typed and were verified twice by the authors. The response of one
participant was excluded because of some missing values; the number of valid responses was 168.
Before the initial item selection by factor analysis, we excluded 18 items which obviously violated
the normal distribution. The remaining 22 items were submitted to an initial exploratory factor
analysis, which extracted three factors. Since its goodness of fit indices were unfavorable, 7 items which caused
a misfit were excluded, using a step-wise exploratory factor analysis (SEFA). Hence, 15 items out
of 40 were selected for the secondary study. These 15 items can be seen in the Appendix.
3.2 The Secondary Survey
The secondary survey was undertaken from May to June, 2014, using the selected items (k
= 15). In total, 2,288 participants took part in the survey, and 2,098 answers, with no missing
values or extraordinary responses, were analyzed. The participants consisted of junior high school
students sampled from two public schools (n = 216), high school students (n = 1,078) from two
normal public schools, and university students from 11 national, public, and private universities (n
= 804). Almost all of the university students were first year students and had various academic
majors. Junior high and high school students, on the other hand, were sampled in a well-balanced
way in terms of their academic years.
As in the preliminary survey, the participants answered the questionnaire in their English
classes. The questionnaire consisted of the face sheet and the 15 items related to GC. The
secondary survey used a computer-readable questionnaire. The data were automatically processed
using scanners and computers. Then, the authors validated the responses twice by hand.
Firstly, descriptive statistics of all the valid answers (n = 2,098) were calculated. Before
conducting factor analyses, we checked the distributions of all the responses (k = 15). Item No.
7 showed a strongly biased distribution, which may negatively affect the factorial structure. Hence,
the item was excluded. Then, we conducted an exploratory factor analysis to determine the
constructs of GC. This study also performed confirmatory factor analyses for the model in order to
confirm its factorial validity.
The distributions of the responses are graphically represented in the multiple histograms in
Figure 2. Table 1 summarizes correlation coefficients and the variance/covariance matrix of the
item responses (k = 14, excluding item 7).
[Figure 2: one histogram panel per item (Items 1 to 6 and 8 to 15), with Rate (1 to 7) on the horizontal axis.]
Figure 2. Histograms representing the distributions of the responses.
Table 1.
Correlation Coefficients and Variance/Covariance Matrix of the Item Responses
Item No.     1     2     3     4     5     6     8     9    10    11    12    13    14    15
  1       2.01  1.06  1.01   .99   .97   .79   .83   .82   .82   .81   .79   .81   .84   .76
  2        .49  2.35   .81  1.09   .73   .74   .83   .67   .64   .95   .84   .87   .92   .86
  3        .55   .41  1.66  1.06  1.00   .92   .81  1.07  1.06   .59   .69   .81   .81   .62
  4        .50   .51   .59  1.93   .79   .72   .75   .87   .80   .57   .72   .90   .86   .65
  5        .47   .32   .53   .39  2.14   .70   .75   .85   .93   .66   .81   .82   .80   .67
  6        .43   .37   .55   .40   .37  1.72   .86   .99   .99   .74   .60   .61   .69   .55
  8        .40   .37   .43   .37   .35   .45  2.13  1.05  1.08   .82   .78   .80   .92   .86
  9        .44   .33   .63   .48   .44   .57   .55  1.73  1.22   .68   .69   .76   .79   .72
 10        .44   .32   .63   .44   .48   .58   .56   .71  1.72   .61   .74   .72   .77   .63
 11        .36   .39   .29   .26   .29   .36   .36   .33   .30  2.45  1.50  1.21  1.33  1.59
 12        .36   .36   .35   .34   .36   .30   .35   .34   .37   .63  2.32  1.46  1.56  1.54
 13        .40   .39   .44   .45   .39   .32   .38   .40   .38   .54   .67  2.06  1.55  1.34
 14        .40   .40   .42   .41   .37   .35   .42   .40   .39   .57   .68   .72  2.25  1.40
 15        .35   .37   .32   .31   .30   .28   .39   .36   .31   .66   .66   .61   .61  2.32
Note. Values on the left side of the diagonal represent correlation coefficients; values on the right side represent covariances.
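The two halves of Table 1 are linked by a simple identity: a correlation coefficient is the covariance divided by the product of the two standard deviations. The following sketch is an illustration only and was not part of the original analysis. The function name `cov_to_corr` is hypothetical, and the sketch assumes that the diagonal of Table 1 holds the item variances.

```python
import math


def cov_to_corr(cov_xy: float, var_x: float, var_y: float) -> float:
    """Convert a covariance into a Pearson correlation coefficient."""
    return cov_xy / math.sqrt(var_x * var_y)


# Items 1 and 2 in Table 1: covariance 1.06, assumed variances 2.01 and 2.35.
r_12 = cov_to_corr(1.06, 2.01, 2.35)
print(round(r_12, 2))  # 0.49
```

The result agrees with the correlation reported for Items 1 and 2 on the left side of the table.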
Table 2 presents the descriptive statistics, the summary of the factor analysis, and the
reliability for each factor. The factor analysis extracted three factors. The goodness of fit indices
were demonstrated to be not favorable, but at an acceptable level, χ²(52) = 285.44, p < .01, TLI
= .93, RMSEA = .08, with a 90% confidence interval (CI) of [.07, .08]. Items No. 11 to 15
showed higher loadings for Factor 1, and these items were all related to the carefulness towards
the phonological aspects of grammatical performance. Items No. 6 to 10 loaded heavily on Factor 2,
and these corresponded to the lexical-syntactic aspects. The rest of the items (from Item No. 1 to
5) showed relevance for the pragmatic aspects. Hence, this study named the factors phonological
carefulness, lexical-syntactic carefulness, and pragmatic carefulness, respectively.
Table 2.
Descriptive Statistics and the Results of the Exploratory Factor Analysis
        Descriptive statistics                 Pattern/Structure matrix
Item     M     SD   Skewness  Kurtosis    Factor 1    Factor 2    Factor 3   Communality
 13    3.57   1.44     0.24    -0.40      .87/.80    -.01/.44    -.05/.49       .70
 14    3.82   1.50     0.15    -0.53      .86/.81     .01/.49    -.10/.58       .65
 12    3.84   1.52     0.18    -0.59      .80/.80     .04/.41    -.11/.44       .57
 15    4.08   1.52     0.00    -0.63      .75/.83    -.02/.44     .12/.49       .67
 11    4.10   1.56    -0.02    -0.78      .70/.75    -.08/.40     .22/.42       .66
 10    3.26   1.31     0.30    -0.09     -.03/.44     .92/.86    -.07/.59       .74
  9    3.26   1.31     0.30    -0.11     -.01/.46     .82/.83     .02/.62       .69
  6    3.30   1.31     0.24    -0.16     -.01/.40     .60/.68     .11/.55       .47
  8    3.96   1.46     0.06    -0.45      .18/.48     .57/.65    -.04/.50       .44
  4    3.46   1.39     0.22    -0.31     -.07/.44    -.04/.64     .84/.77       .60
  2    4.09   1.53    -0.06    -0.65      .16/.48    -.13/.42     .62/.62       .41
  1    3.55   1.42     0.27    -0.32      .04/.47     .09/.55     .59/.77       .48
  3    3.20   1.29     0.25    -0.08     -.12/.45     .42/.75     .54/.77       .68
  5    3.21   1.46     0.40    -0.28      .06/.43     .29/.56     .32/.57       .37
                                          Factor 1    Factor 2    Factor 3
Factor correlations
  Factor 2                                  .55
  Factor 3                                  .63         .73
Reliability
  α coefficients                            .90         .84         .82
  Average correlation coefficients          .64         .57         .48
Sums of squares of loadings                3.28        2.68        2.16
Proportion of variance                      .23         .19         .15
Cumulative proportion of variance           .23         .43         .58
Note. The factor analysis was conducted using the maximum likelihood estimation method and
Promax rotation, with the number of factors set to three, as suggested by the parallel analysis. We
judged that this model was also theoretically the most plausible.
The reliability of each factor was calculated with Cronbach's α coefficients and average
inter-correlation coefficients. The reliability coefficients of all the factors were sufficient, as can
be seen in Table 2.
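As a rough illustration of the two reliability indices, the sketch below computes Cronbach's α from raw responses and a standardized α from the number of items and the average inter-item correlation. It is a minimal sketch under stated assumptions and not the procedure used in this study: the function names are hypothetical, the toy responses are made up, and the standardized α (Spearman-Brown form) is in general only an approximation of the raw α.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from raw responses (one list of item scores per respondent)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def standardized_alpha(k, mean_r):
    """Standardized alpha from the item count and the average inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)


# Made-up responses of three respondents to two items (not data from this study).
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 2))  # 0.67

# Item counts and average correlations per factor, as reported in Tables 2 and 5.
for name, k, mean_r in [("phonological", 5, 0.64),
                        ("lexical-syntactic", 4, 0.57),
                        ("pragmatic", 5, 0.48)]:
    print(name, round(standardized_alpha(k, mean_r), 2))
```

With the average correlations in Table 2, the standardized values come out at .90, .84, and .82, which agree with the reported α coefficients.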
The model was then submitted to confirmatory factor analyses. All the paths to the observed
variables were statistically significant at p < .01, and the goodness of fit indices showed acceptable
levels. In order to observe the differences in this factorial structure among the three groups of
participants, confirmatory factor analyses, using the same model, were conducted separately for the
three groups (see Figure 3 for its path diagram). Table 3 summarizes the comparison of the
goodness of fit indices among the groups. The indices showed almost equal goodness of fit among
the groups. Also, multiple sample structural equation modeling was used to detect the differences
among the groups. We tested four models: (a) a "configural" model, in which the paths are equal among
the groups, (b) a "weak measurement invariance" model, in which loadings are invariant, (c) a "strong
measurement invariance" model, in which loadings and intercepts are invariant, and (d) another type of
the latter, in which loadings, intercepts, and means are invariant. As the results demonstrate, all
of the models showed an acceptable goodness of fit, as in Table 4. The fourth model, which was under the
strongest constraints, was the best model. Hence, it can be safely stated that at least the factorial
structure and its loadings were invariant among the groups. This suggests that the scale which
the present study developed can measure the GC of learners at various levels.
Table 3.
Comparison of Goodness of Fit Indices among the Groups
Group            n        χ²      df    p      CFI   TLI   RMSEA            SRMR
All            2,098   1,121.27   74   <.01    .94   .92   .08 [.07, .08]   .05
Junior high      216     206.98   74   <.01    .93   .92   .09 [.08, .11]   .05
High           1,078     567.61   74   <.01    .94   .92   .08 [.07, .09]   .05
University       804     537.29   74   <.01    .92   .91   .09 [.08, .11]   .06
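The RMSEA point estimates in Table 3 can be approximately recovered from the reported χ², df, and n. The sketch below uses one common single-group formula, RMSEA = sqrt(max(χ² − df, 0) / (df (n − 1))). This is an assumption for illustration: the software actually used in this study may apply a slightly different variant (for example, n instead of n − 1), and the function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate of RMSEA from a model chi-square, its df, and sample size n."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Chi-square, df, and n for each group, as reported in Table 3.
groups = {
    "All": (1121.27, 74, 2098),
    "Junior high": (206.98, 74, 216),
    "High": (567.61, 74, 1078),
    "University": (537.29, 74, 804),
}
for name, (chi_square, df, n) in groups.items():
    print(name, round(rmsea(chi_square, df, n), 2))
```

Under this formula the four groups give .08, .09, .08, and .09, in line with the RMSEA column of Table 3.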
Table 4.
Summary of Measurement Invariance among the Multiple Samples
Model                                          χ²      df    p      CFI   RMSEA      BIC
Configural model                            1,311.87   222   <.01   .93    .05    90,127.45
Weak measurement invariance model           1,333.18   244   <.01   .93    .05    89,980.58
  (equal loadings)
Strong measurement invariance model         1,345.09   266   <.01   .93    .04    89,824.28
  (equal loadings + intercepts)
Strong measurement invariance model         1,380.45   272   <.01   .93    .04    89,813.85
  (equal loadings + intercepts + means)
[Figure 3: path diagram of the three-factor model with the 14 items as observed variables.]
Figure 3. Path diagram representing the model in question, with standardized estimates. The
standardized estimates for each group were shown in the form of "junior high school / high school
/ university / (all)". N = 2,098.
The results so far provided sufficient empirical evidence for establishing the new
psychological scale, Foreign Language Grammatical Carefulness Scale (FLGCS, hereafter),
which yields the three factors: phonological carefulness, lexical-syntactic carefulness, and
pragmatic carefulness. For reference, the descriptive statistics of the summated scale scores are
summarized in Table 5. Phonological carefulness exhibited relatively higher scores than the others.
All the scores exhibited a normal distribution.
Table 5.
Descriptive Statistics of the Summated Scale Scores
                                                   k     M      SD    Skewness  Kurtosis
Phonological carefulness (item No. 11 to 15)       5    3.88   1.27     0.14     -0.35
Lexical-Syntactic carefulness (item No. 6 to 10)   4    3.44   1.11     0.22      0.06
Pragmatic carefulness (item No. 1 to 5)            5    3.50   1.11     0.14     -0.35
All                                               14    3.62   0.98     0.13     -0.13
Note. The summated scale scores here were the mean scores for the responses of the items for
each. The factor scores were not used here. n = 2,098.
4. Initial Validation
For the initial validation procedure, the present study further examined the content and
criterion validity of FLGCS by conducting multiple follow-up analyses. We tested the three
hypotheses below (Hypothesis I to III).
Hypothesis I: The contents of all the items in FLGCS match the theoretical concepts of each
factor. For instance, it was hypothesized that the items for phonological carefulness actually refer
to the phonological aspects of grammatical performance in linguistic terms.
Hypothesis II: Each type of grammatical carefulness is correlated to learner beliefs with a
medium level of strength. More specifically, GCs are correlated to analytic beliefs (Tanaka &
Ellis, 2003) more strongly than to experiential beliefs.
Hypothesis III: GCs are correlated to the accuracy of a C-test, which is supposed to measure
general language performance, and the time which test-takers take to complete the test. As
discussed in the Background section of this study, GC was considered as a type of moderator of
compromise points in the SAT framework; thus, it is assumed that someone with a higher GC
should exhibit higher accuracy and lower speed in the task.
4.1 Content Validity
Ten linguistics experts voluntarily participated in this part of the study. Using an online
version of the questionnaire, we asked the participants to read the questionnaire items (k = 14)
carefully, then to select which type of grammatical performance each item refers to, in linguistic
terms, by choosing one of four alternatives: (a) phonological, (b) lexical-syntactic, (c)
pragmatic, and (d) none of them. Participants were not allowed to skip an item.
The result was that all the participants answered that the items No. 1 to 5 referred to the
pragmatic aspects of grammatical performance, 6 to 10 the lexical and syntactic aspects, and 11 to
15 the phonological aspects. This provides us with empirical support for the content validity of
FLGCS to a certain extent.
4.2 Criterion Validity
4.2.1 Relationship with Learner Beliefs
In order to confirm a part of the criterion validity (especially convergent and discriminant
validity) of FLGCS, this study investigated the correlation patterns between FLGCS and two
types of learner beliefs, analytic and experiential beliefs (Hypothesis II). Analytic and experiential
beliefs (AB and EB, respectively) were established by Tanaka and Ellis (2003). The former type refers
to learners' beliefs which support analytical types of learning methods and their benefits, and
consisted of 7 questionnaire items (e.g., I can learn well by writing down everything in my
notebook), while the other type supports experiential ones, with 7 questionnaire items (e.g., I can
learn well by speaking with others in English). All of the items in the Japanese-translated version
are also available online (see Appendix). Theoretically, it can be expected that GC and beliefs
show some correlations, and GCs are related to analytic beliefs more strongly than experiential
ones in terms of their conceptual relevance.
The data for this section were the same as those of the preliminary study, which included
both the GC questionnaire items and the learner belief items. Thus, all the participants (168
answers were used) were first year university students.
Firstly, the descriptive statistics and the reliability coefficients for each of the summated
scale scores were calculated (see Table 6). The sample showed relatively higher experiential
beliefs, and a lower level of GCs than the results of the secondary study. It is possible to infer that
the subsample had a tendency to favor experiential beliefs and to be less grammatically
careful. We judged that the reliability of each score was sufficient (.73 to .91).
Then, a correlation analysis among the five summated scale scores was conducted. Figure 4
graphically summarizes the correlation pattern and the distributions of the scores. We also used
classical multi-dimensional scaling (CMDS, also known as principal coordinate analysis; see
Coxon, 1982). CMDS is a statistical method to visualize the similarity of variables. Based on the
correlation coefficients matrix, CMDS can place each variable on a two-dimensional plane. Thus,
a pair of variables located closer together in the plot have a higher
correlation, and more distant variables show lower correlations. Figure 5 shows the results of
CMDS.
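The idea behind CMDS can be sketched in a few lines of dependency-free code. The sketch below is an illustration under several assumptions and does not reproduce the analysis behind Figure 5: the conversion from a correlation to a squared distance, d² = 2(1 − r), is assumed because the conversion used in this study is not stated; the function name `classical_mds` is hypothetical; and a simple power iteration stands in for the eigenvalue routine a statistics package would normally provide. In the example matrix, only the six correlations between GCs and beliefs come from this section. The correlations among the three GCs are stand-ins borrowed from the factor correlations in Table 2, and the AB-EB value is made up.

```python
import math


def classical_mds(corr, n_dims=2, n_iter=500):
    """Place variables in n_dims dimensions from a correlation matrix."""
    n = len(corr)
    # Assumed conversion from correlation to squared distance: d^2 = 2(1 - r).
    d2 = [[2.0 * (1.0 - corr[i][j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    grand_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * n_dims for _ in range(n)]
    for dim in range(n_dims):
        # Power iteration for the dominant eigenvector of the (deflated) B.
        v = [float(i + 1) for i in range(n)]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        norm_v = math.sqrt(sum(x * x for x in v))
        v = [x / norm_v for x in v]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][dim] = scale * v[i]
        # Deflate B so that the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords


labels = ["PH", "LS", "P", "AB", "EB"]
# GC-belief correlations are taken from this section; the GC-GC values are
# stand-ins from the factor correlations in Table 2, and AB-EB (.30) is made up.
corr = [
    [1.00, 0.55, 0.63, 0.63, 0.33],
    [0.55, 1.00, 0.73, 0.51, 0.22],
    [0.63, 0.73, 1.00, 0.61, 0.39],
    [0.63, 0.51, 0.61, 1.00, 0.30],
    [0.33, 0.22, 0.39, 0.30, 1.00],
]
for label, (x, y) in zip(labels, classical_mds(corr)):
    print(f"{label}: ({x:.2f}, {y:.2f})")
```

The printed coordinates can be plotted directly. Because the solution is determined only up to rotation and reflection, the distances between points, and not the axes themselves, carry the interpretation.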
Table 6.
Descriptive Statistics and Reliability of the Summated Scale Scores of FLGCS and Learner Beliefs
                                        k     M      SD    Skewness  Kurtosis    α
Phonological carefulness (PH)           5    3.49   1.21     0.19      0.04     .87
Lexical-Syntactic carefulness (LS)      4    2.98   1.33     0.54      0.16     .91
Pragmatic carefulness (P)               5    3.30   1.29    -0.57      0.67     .89
Analytic beliefs (AB)                   7    4.15   0.95     0.13      0.46     .73
Experiential beliefs (EB)               7    4.59   0.97    -0.33      0.52     .74
Note. n = 168.
The results of the correlation analysis clearly supported Hypothesis II; all of the GCs showed
low to middle levels of correlation coefficients, but more specifically they were more strongly
related to analytic beliefs (PH: r = .63, 95% CI [.53, .71]; LS: r = .51 [.39, .61]; P:
r = .61 [.51, .70]) than to experiential ones (PH: r = .33, 95% CI [.19, .46]; LS: r
= .22 [.07, .36]; P: r = .39 [.25, .51]). Also, as Figure 5 presents, all of the GCs were located closer
to analytic beliefs than to experiential ones. This accords well with the conceptual relevance among
them.
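The confidence intervals above can be approximated with the Fisher z-transformation. The procedure used to obtain the intervals in this study is not stated, so the sketch below is an assumption for illustration; the function name `fisher_ci` and the critical value of 1.96 are likewise choices made for this example.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate 95% CI for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# PH and analytic beliefs: r = .63 with n = 168.
low, high = fisher_ci(0.63, 168)
print(round(low, 2), round(high, 2))  # 0.53 0.71
```

For r = .63 and n = 168, this gives the same interval as the one reported for phonological carefulness and analytic beliefs.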
[Figure 4: scatter plot matrix of the five summated scale scores.]
Figure 4. Scatter plot matrix representing the
correlation coefficients on the upper side, the
histograms in the middle columns, and the
scatter plots with linear regressions on the
lower side.
[Figure 5: two-dimensional CMDS plot of the five variables.]
Figure 5. Plot representing the distances
between the variables, based on their
correlation coefficients matrix.
4.2.2 Relationship with the Performance of a C-test
This section will report the results of the experiment which investigated the relationship
between GC and language performance. We assumed that GC, as a moderator in SAT, will show a
correlation with both the accuracy (score) and the speed (time to complete) of a language test. The
present study focused on the performance of a C-test.
The number of participants was 77. All of the participants were first year university students.
The participants overlapped with those of the secondary study. After the secondary study, they participated
in a C-test (detailed below) as a part of the learning activity of their English classes, in June, 2014.
All the participants were women. We also used the data about their GC, as determined in the
secondary study.
The C-test was created by the authors (also available on the authors' web page). The text
type was narrative (a letter to a writer's friend). The length of the text was 249 words, including
some blank words. The number of items (blanks) was 17, which is equal to almost 7% of the
whole text. The readability scores of the text, ignoring the blanks, were 91 for Flesch Reading Ease and
2.6 for Flesch-Kincaid Grade Level; these levels are usually regarded as easy in foreign
language reading studies. The examiner asked the participants to fill in the blanks in the untimed
condition (there was no time limit), but also asked the participants to report the time when they
had completed answering. A digital count-up timer was displayed on the monitor of the
classrooms, and the participants could use it to note the time when they finished answering.
Table 7 presents the descriptive statistics and the reliability for the summated scale scores of
GCs, the scores of the C-test, and the time to complete the test. This subsample may have shown
lower GCs in comparison to the whole data of the secondary study. The reliability coefficients
were acceptable. Figures 6 and 7 summarize the correlation patterns, as in the previous section.
Table 7.
Description of the Summated Scale Scores of GCs, the Scores, and the Time to Complete the C-Test
                                        k       M        SD     Skewness  Kurtosis    α
Phonological carefulness (PH)           5      3.18     0.97      0.03      0.03     .81
Lexical-Syntactic carefulness (LS)      4      2.62     0.91      0.20     -0.14     .79
Pragmatic carefulness (P)               5      2.79     0.87     -0.08     -0.74     .79
Score                                  17      5.45     2.35      0.58      0.42     .63
Time to complete (sec)                n.a.   491.34   138.36      0.54      1.45     n.a.
Note. n = 77.
[Figure 6: scatter plot matrix of the three GC scores, the C-test score, and the time to complete.]
Figure 6. Scatter plot matrix representing the
distributions and the correlations between the
variables for Hypothesis III.
[Figure 7: two-dimensional CMDS plot of the variables.]
Figure 7. Plot graphically representing the
results of CMDS.
The results supported Hypothesis III. GCs are correlated with both the scores, PH: r = .35
[.21, .48], LS: r = .41 [.28, .53], P: r = .38 [.24, .50], and the times, PH: r = .27 [.12, .41], LS: r
= .36 [.22, .49], P: r = .31 [.17, .44], with low to moderate coefficients. The results of
CMDS also suggest that GCs correlate with the score and the time at almost the same
magnitudes. This means that GC is linked to both the accuracy and speed of language performance,
exactly as the framework of SAT predicts.
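The bracketed values above are interval estimates for the correlation coefficients. The text does not state how they were obtained, so the following is only a sketch of one common approach, the Fisher z-transformation; the function name, the 95% level, and the sample size in the usage line are assumptions for illustration.

```python
import math
from statistics import NormalDist


def pearson_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Confidence interval for Pearson's r via the Fisher z-transformation."""
    z = math.atanh(r)                  # map r onto the z scale
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical example: r = .50 observed in a sample of 103 people.
low, high = pearson_ci(0.50, 103)
print(f"r = .50, 95% CI [{low:.2f}, {high:.2f}]")
```

Because the interval is built on the z scale and mapped back with tanh, it is asymmetric around r and always stays inside (-1, 1).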
4.3 Summary of the Initial Validation
Our initial validation provided information regarding both the content and criterion validity
of the scale. The summary of the results of our initial validation, using a hypothesis testing
procedure, is shown in Table 8.
Table 8.
The Summary of the Results of the Initial Validation
Hypotheses   Content                                          Results     Evidence
I            The contents of all the items in FLGCS match     Supported   Properly judged by 10
             the theoretical concepts of each factor.                     linguistic experts
II           GCs are correlated to analytical beliefs more    Supported   Showed the correlation
             strongly than to experiential beliefs.                       pattern exactly as expected
III          GCs are correlated to both the score of a        Supported   Showed the correlation
             C-test and the time to finish the test.                      pattern exactly as expected
5. General Discussion
The results presented above lead us to conclude that the new psychological scale, FLGCS,
with its three factors (k = 14), is a statistically reliable measure. Its structural, content, and criterion
validity were also supported by multiple analyses. Furthermore, multiple-sample
structural equation modeling demonstrated that FLGCS showed measurement invariance among
the groups. However, FLGCS has a couple of potential limitations, as noted below.
Most importantly, regardless of its high reliability, FLGCS covers only a small area of GC
as a construct. As the inter-correlation and inter-factor correlation coefficients were quite high,
the questionnaire items may measure very similar behaviors and characteristics of individuals. This
phenomenon is called the bandwidth-fidelity dilemma. However, since GC is a new concept, our
preliminary aim was to establish a reliable scale at the expense of its coverage, in order to provide
a basis for further research. As described in the Background section and the literature review of the present
study, the rationale for GC rests on the concepts of SAT, and the scale was mainly designed to
be applied in psycholinguistic studies, classroom-based studies, and teaching practice. Needless to
say, less reliable measures lead to attenuation problems statistically. Hence, we
judged that a more reliable scale was preferable in this case.
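The attenuation problem mentioned above can be made concrete with Spearman's classical correction, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only: the function name is hypothetical, and plugging in the PH-score correlation (.35) with the reliabilities from Table 7 (.81 and .63) is a worked illustration rather than an analysis reported in the study.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction for attenuation due to measurement error."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Observed PH-score correlation and the two reliability coefficients from Table 7.
corrected = disattenuate(0.35, 0.81, 0.63)
print(f"observed r = .35, disattenuated r = {corrected:.2f}")
```

The lower the reliabilities, the more an observed correlation understates the underlying relationship, which is the statistical reason for preferring a more reliable scale.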
Obviously, validation is not a dichotomous judgment, and further validation is always
strongly desired. Part of the evidence that the initial validation provided may cover only a
very small range of validity. Future studies should confirm the links between GC and other types
of individual differences, the development of GC, and relationships with other types of
grammatical performance (e.g., grammaticality judgment, sentence verification, and impromptu
speech). Additionally, although the present study was a large-scale survey, it cannot rule out
sampling error. Data from larger and more varied samples will also be needed.
It should be noted that the present study did not assess the feasibility and the consequential
aspects of validity. It will be important to analyze the washback effects on the learning/teaching
behaviors of learners/teachers in practice.
6. Conclusion
This study developed and validated a psychological scale to measure GC, which is related to
inter-learner variance on SAT. SAT, a sophisticated framework concerning human behaviors,
may explain a large part of language performance, and GC, as an individual's trait, will be key to
capturing the dynamics of numerous variables concerned with language performance. However,
the importance of this is not limited to theories of second/foreign language acquisition and use.
In teaching practice, FLGCS can also provide teachers with much information about their students.
FLGCS will enable us to understand students' traits. It will also help teachers specify what kind of
grammatical carefulness (phonological, lexical-syntactic, or pragmatic) of a particular student is
(in)sufficient. The information will contribute to the everyday teaching practice of English by
playing various roles in the work of teachers. Likewise, FLGCS will be useful even for learners to
understand their own traits. This may promote learners' self-regulated learning.
Notes
1. The term GC has numerous similar terms such as meta-linguistic awareness, language
awareness, language sensitivity, and grammatical sensitivity. However, these generally refer
to one's knowledge, or certain types of language-related skills, which are mainly measured by
language tests. We intended to refer to GC only as a psychological and behavioral trait, which
we consider to be fundamentally separate from language knowledge or skills. However, we
also assume that they may be correlated with each other to some extent.
References
Coxon, A. P. M. (1982). The user's guide to multidimensional scaling: With special reference to
the MDS. London: Heinemann Educational Books.
Dennis, I., & Evans, J. St. B. T. (1996). The speed-error trade-off problem in psychometric testing.
British Journal of Psychology, 87, 105-129.
Goldhammer, F., & Kroehne, U. (2014). Controlling individuals' time spent on task in speeded
performance measures: Experimental time limits, posterior time limits, and response time
modeling. Applied Psychological Measurement, 38, 255-267.
Kartchava, E., & Ammar, A. (2014). Learners' beliefs as mediators of what is noticed and learned
in the language classroom. TESOL Quarterly, 48, 86-109.
Krashen, S. D. (1978). Individual variation in the use of the monitor. In W. Ritchie (Ed.), Second
language acquisition research (pp. 175-183). New York, NY: Academic Press.
Kormos, J. (1999). The effect of speaker variables on the self-correction behaviour of L2
learners. System, 27, 207-221.
Segalowitz, N., & Segalowitz, S. J. (1993). Skilled performance, practice, and the differentiation
of speed-up from automatization effects: Evidence from second language word recognition.
Applied Psycholinguistics, 14, 369-385.
Seliger, H. (1980). Utterance planning and correction behavior: Its function in the grammar
construction process for second language learners. In H. W. Dechert & M. Raupach (Eds.),
Towards a cross-linguistic assessment of speech production (pp. 87-99). Frankfurt, Germany:
Lang.
Tanaka, K., & Ellis, R. (2003). Study abroad, language proficiency, and learner beliefs about
language learning. JALT Journal, 25, 63-85.
Tarone, E. (1985). Variability in interlanguage use: A study of style-shifting in morphology and
syntax. Language Learning, 35, 373-404.
van der Linden, W. J. (2007). A hierarchical framework for modeling speed and accuracy on test
items. Psychometrika, 73, 287-308.
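Appendix
Foreign Language Grammatical Carefulness Scale (FLGCS)
Item 1 (P): 外国語を使うとき,会話の流れの不自然さについてよく考える [When I use a foreign language, I often think about unnaturalness in the flow of the conversation.]
Item 2 (P): 外国語を使うとき,表現が文脈にあわないと考えこんでしまう [When I use a foreign language, I end up brooding when an expression does not fit the context.]
Item 3 (P): 外国語を使うとき,一貫してない表現や曖昧な表現にはよく気がつく [When I use a foreign language, I readily notice inconsistent or ambiguous expressions.]
Item 4 (P): 外国語を使うとき,一貫していない表現があると考えこんでしまう [When I use a foreign language, I end up brooding when there is an inconsistent expression.]
Item 5 (P): 外国語を使うとき,失礼な表現や丁寧過ぎる表現がよく気になる [When I use a foreign language, I am often bothered by impolite or overly polite expressions.]
Item 6 (LS): 外国語を使うとき,語の形の変化の誤りにはよく気がつく方だ [When I use a foreign language, I tend to notice errors in word inflection.]
Item 8 (LS): 外国語を使うとき,単語のつづりが間違っているとよく気になる [When I use a foreign language, I am often bothered when a word is misspelled.]
Item 9 (LS): 外国語を使うとき,文章の中で間違った単語があるとよく気がつく [When I use a foreign language, I readily notice when there is a wrong word in a text.]
Item 10 (LS): 外国語を使うとき,単語の間違いにはよく気づく方だ [When I use a foreign language, I tend to notice errors in words.]
Item 11 (PH): 外国語を使うとき,発音が正確か考えることが多い [When I use a foreign language, I often think about whether my pronunciation is accurate.]
Item 12 (PH): 外国語を使うとき,いつも発音が正しいかどうか気になる [When I use a foreign language, I am always concerned about whether my pronunciation is correct.]
Item 13 (PH): 外国語を使うとき,発音が正確でないと考えこんでしまう [When I use a foreign language, I end up brooding when my pronunciation is not accurate.]
Item 14 (PH): 外国語を使うとき,発音が誤っていると気になってしまうことが多い [When I use a foreign language, I am often bothered when the pronunciation is wrong.]
Item 15 (PH): 外国語を使うとき,発音が本当に正確か確認することがある [When I use a foreign language, I sometimes check whether my pronunciation is really accurate.]
Note. Item 7 was deleted (see the section concerning scale development). Just for reference,
Item 7: "外国語を使うとき,文法規則に合わない表現によく気がつく" [When I use a foreign language, I readily notice expressions that do not conform to grammatical rules.]. P = pragmatic carefulness,
LS = lexical-syntactic carefulness, PH = phonological carefulness. Supplementary data, including
(a) the initial item pool, (b) the C-test, and (c) the final version of the questionnaire used in
the present study, are available at the first author's website:
https://sites.google.com/site/kusanagikuni/home/projects/gc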