Dissertation 007


Transcript of Dissertation 007


Dissertation

Introduction

My research question is this: does Gunther Breaux's conversational fluency-focused method for teaching Korean university students provide a suitable approach for teaching communicative English skills?


A statement about communicative approaches:

According to an article in the Korea Times (Kwon, 2010), in 2009 Korea's rank on the TOEFL iBT speaking test was 121st, whereas its composite rank put it at 77th. Koreans scored above average on the reading, writing, and listening portions of the test but fell below average on the speaking section. As Li notes throughout her study (1998) of Korean teachers using Communicative Language Teaching (CLT), an over-reliance on grammar-based examinations and a tradition of audiolingual and grammar-translation teaching methods have stymied the uptake of CLT. All of Li's study participants fault the English section of the country's National University Entrance Examination for making grammar, reading comprehension, translation, and listening comprehension the primary measurements.

In Korea, CLT adoption was decreed by the government starting in 1995 (Li, 1998); however, as Li's study makes apparent, Korean teachers, and presumably non-Korean teachers, struggled to implement it for a variety of reasons. Furthermore, Li's findings confirm much of what Harmer suggests (2007, p.76) are the difficulties in applying any methodology outside of America or England. He attributes some of the blame to western ideals of learning. Liu defines these ideals (1998):

In the West, teaching is process or discovery-oriented. Interaction, group work, and student-centredness are the order of the day in classes whose normal size is under 20 students, and which usually have far more resources than their Asian equivalents.

This runs in stark contrast to Holliday's description (in Liu, 1998) of the teacher in Asian countries as purveying knowledge to students who are expected to absorb and retain it. An additional complication, Kachru reasons (in Liu, 1998), is that there has been an over-reliance on Second Language Acquisition data collected from studies conducted in North America, Britain, and Australia. What is needed, he argues, is more studies conducted worldwide.

Differences in methodologies and educational cultures are not the only issues plaguing CLT and other non-teacher-centered methodologies in Korean classrooms. Creating tests to measure communicative ability continues to prove difficult.


Savignon illustrates the importance of testing (in Li, 1998), observing that curricular changes have suffered due to unsatisfactory evaluations. Even the factors essential in evaluating a test, such as validity and reliability, are up for debate (Fulcher, 2000). On establishing the type of knowledge to test, Bachman sums up communicative competence as, "...in addition to the knowledge of grammatical rules, the knowledge of how language is used to achieve particular communicative goals, and the recognition of language use as a dynamic process" (1990, p.83). While Bachman's explanation is broad, Fulcher, with the help of Morrow, boils down the requirements of communicative testing (2000):

1. Performance: test-takers should actually have to produce language.

2. Interaction-based: there will be actual "face-to-face oral interaction which involves not only the modification of expression and content...but also an amalgam of receptive and productive skills" (Morrow, 1979, p.149). (Do I need to paraphrase this since it's a secondary quote?)

3. Unpredictability: language use in real-time interaction is unpredictable.

He goes on to raise the issues of standardizing assessment and scoring, as well as the need for

more research into measuring performance from the reliable use of integrated tasks. Despite

these drawbacks, Fulcher defends the authenticity of communicative testing (2000):

1. Purpose: the test-taker must be able to recognise communicative purpose and be able to respond appropriately.

2. Authenticity: input and prompts in the language test should not be simplified for the learner.

3. Context: language will vary from context to context; some talk will be appropriate in one context and not another. The test-taker must therefore be tested on his/her ability to cope with the context of situation (physical environment, status of participants, the degree of formality, and expressing different attitudes), as well as the linguistic context.

If Communicative Methodologies (CM) are to be received by Korean educational culture,

then progress needs to be made in testing.

To address these shortcomings, Gunther Breaux has developed what he calls a Conversation-Based English (CBE) methodology to improve speaking fluency. His methodology draws on Communicative Language Teaching (CLT), the Natural Approach, and Cooperative Language Learning (CLL) as described by Richards and Rodgers (2001, p.153-200). Fluency, Breaux (2013) suggests, is the prerequisite to communicative speaking. Teaching accuracy first, he argues, creates barriers to learning by interfering with human nature and limiting opportunities for students to speak. To motivate students to speak, Breaux claims that, "What gets tested, gets done" (2013). Fulcher seconds this notion (2010, p.1-2) in writing of Latham's and Ruch's observations about tests as motivating devices.


ing is Breaux’s solution to improving fluency and accuracy, concurrently. Li’s study parti-

cipants certainly support this notion.

Hence, this study aims to test the theories put forth by Breaux by evaluating his tests against a variety of factors, and to use the data collected from these tests as a means of measuring performance gains, test validity and reliability, and methodology effectiveness. Attempts were made to replicate the conditions that Breaux describes (2013). Classes consisted predominantly of first-year Korean university students, and when possible, Breaux's materials were utilized, along with his prescribed teaching and testing methods. This research should be of value to teachers working in Korea, as conversational classes have been mandated by Korea's National Education Curriculum (Kwon, 2000).

Literature Review:

To begin addressing CBE, it is best to start by covering the state of CLT, the Natural Approach, and CLL, and identifying the aspects that CBE utilizes. CLT, as Richards and Rodgers surmise, is founded on several principles:

- Learners learn a language through using it to communicate.
- Authentic and meaningful communication should be the goal of the classroom activities.
- Fluency is an important dimension of communication.
- Communication involves the integration of different language skills.
- Learning is a process of creative construction and involves trial and error. (2001, p.172)

These principles are cited, nearly verbatim, as underlying CBE, although Breaux did not put them so succinctly (Should Breaux's presentation be addressed as a past event, or is there a citation that I should use for referencing his PowerPoint?). He has also modelled the application of his methodology on Johnson and Johnson's (1999) five characteristics of 'standard' communicative methodology (CM). Practitioners of CM can choose to favour 'the appropriate' over grammar instruction, or, to paraphrase Hymes from Johnson and Johnson (1999, p.69), rules of use trump grammar rules. They explain that initial CM sought to provide L2 learners with concepts that Wilkins (in Richards and Rodgers, 2001, p.154) defines as notional ('...time, sequence, quantity, location, frequency') and functional ('requests, denials, offers, complaints').


Johnson and Johnson note that syllabuses shaped primarily by these concepts tended towards what Cook calls 'guided role play' (2008, p.249). Unlike Audio-Lingualism's gap-filled predetermined dialogues, Cook explains, CM merely establishes a setting or scene in which learners must use their knowledge of the L2 to produce relevant output.

Message-focus, the second characteristic of CM, as Johnson and Johnson (1999, p.69-70) explain, is considered to be the basis of standard CM [1]. Language, they write, is meant to convey information, not simply reflect grammatical rules. Exercises, the two explain by referencing Widdowson's work on the topic, should therefore promote use over usage. Richards and Rodgers (2001, p.173) simplify this as an emphasis on information sharing and transfer, for example by creating messages (output) and comprehending messages (input). CBE's primary mode of instruction, the daily conversational groups of two or three students aided by sets of questions, could be seen as Breaux's interpretation of Brumfit's deep-end strategy. Differing slightly from Brumfit's strategy, Breaux provides the items first, but rather than drilling, instructs directly as needed.

[1] There are multiple interpretations of CM; however, the standard definition appears to be the most commonly accepted, and it is the focus of this study.

The third characteristic, the psycholinguistic process of communication, Johnson and Johnson (1999, p.71) explain, is "...the user's desire to convey a message," a notion shared with message-focus. According to Johnson and Johnson, the psycholinguistic process explains how learners comprehend messages:

Psycholinguistics provides us with the insight that listeners process selectively, not attending equally to every word of a message. Unlike traditional listening comprehension exercises in which the learner is made to focus on each word, information transfer requires the learner to attend only to those parts of the message relevant to the task set. (1999, p.71)

Johnson and Johnson also emphasise how the top-down nature of input comprehension leads the listener/reader to interpret input using background knowledge. Hedge, they explain, '...divides this background knowledge into general knowledge, subject-specific knowledge, and cultural knowledge' (1999, p.78). These aspects of interpretative comprehension, Johnson and Johnson continue, are the result of explorations in cognitive psychology, specifically schemata, frames and scripts.


Examining these finer points is beyond the reach of this study, but a thorough discussion of these concepts can be found in Brown and Yule (1983).

Richards and Rodgers (2001, p.173) define risk-taking, the fourth characteristic, as learning from mistakes in order to extend one's knowledge by learning to employ multiple communication strategies. As Johnson and Johnson (1999, p.71) recount, however, educators have desired to prevent errors, which they account for as a holdover from behaviourist learning theory, and have thus hindered learners from developing from their mistakes. The fault of such instruction, whereby learners are taught to deduce meaning by focusing on each word rather than using context and partial understanding to comprehend, they argue, is that it fails to provide learners with an essential communication skill. Alternatively, the deep-end strategy coined by Johnson (in Johnson and Johnson, 1999, p.70) aims to put learners straight into production activities. However, this may adversely affect learners and lead them to communication strategies like avoidance strategies, which, Cook (2008, p.107) explains, learners adopt to avoid topics or words due to their inherent difficulties. Varying activities so that students are encouraged to confront these difficulties can address the issue without the kind of direct intervention that can harm learner confidence.

Free practice, the final characteristic, is important, skill psychologists explain, because learners are expected to perform several sub-skills simultaneously (Johnson and Johnson, 1999, p.71). Speaking must satisfy several parameters (grammar, phonology, semantics, etc.) simultaneously for fluent exchanges, and this skill cannot be drilled into learners through part-practice activities alone. Therefore, Johnson and Johnson (1999, p.72) explain, CM places considerable weight on free practice.

Natural Approach

Newmark and Reibel (in Richards and Rodgers, 2001, p.190) contend that adults are capable of grasping unordered grammar concepts, and that doing so has the potential to lead to native-level mastery. This aspect of the Natural Approach, Richards and Rodgers explain, is just a step in the progress of the Communicative Approach. What makes the Natural Approach significant is its focus on comprehension and meaningful communication over grammatical correctness. CBE methodology likewise favours fluency over accuracy.


However, the Natural Approach lacks real-world testing due to its theoretical basis. It is largely a framework of hypotheses created by Krashen (Richards and Rodgers, 2001, p.181-183), summarised here:

• The Acquisition/Learning Hypothesis: acquisition occurs when learners comprehend and communicate meaningfully in their L2, just as they did when naturally acquiring their first language (L1). Conversely, learning is the result of formal teaching of the rules and forms of the second language (L2), but does not lead to acquisition.

• The Monitor Hypothesis: learners use the forms garnered from learning to consciously correct themselves. Self-correction is dependent on time (allowing enough time to apply rules), focus on form, or FonF (correct forms should be exaggerated to promote use), and knowledge of rules (specifically an awareness of rules that can be easily explained and formed).

• The Input Hypothesis: this hypothesis, which is concerned with acquisition, suggests that learners acquire language in levels, and that input should be only one level above their current proficiency. Acquisition arises from context clues, common (world) knowledge, and other non-linguistic factors. Fluency develops over time as the learner constructs linguistic competencies from comprehension of input. Subconsciously, the speaker, likely the teacher, will attune to the learner's comprehensible input level (understanding of input gleaned from situational and extralinguistic information), so as to find the path of least resistance to comprehensible communication.

• The Affective Filter Hypothesis: the filter consists of motivation, self-confidence, and anxiety. Learners who are able to lower their filters will be more highly motivated, more self-confident, and less anxious. Younger learners are most likely to have a low affective filter.

However, the theory underlying the Natural Approach has come under siege. Leveraging a body of criticisms, Zafar (2010, p.140) suggests that too many factors contribute to an L2 learner's likelihood of attaining native-like proficiency. He cites McLaughlin's and Gregg's assessments of the Language Acquisition Device, or LAD (the subconscious way learners acquire new language), which challenged the LAD's effectiveness in learners post-puberty. Zafar counters their assertions by citing the writer Joseph Conrad as an anecdotal example of an adult learner reaching native proficiency.


This, he suggests, may provide justification for modifying the original understanding of the LAD. Zafar also touches upon confusion in the terminology that plagues Krashen's theory, but this is of little concern to this study.

Concerning the Monitor Hypothesis, Zafar (2010, p.141-142) takes issue with the immeasurability and the hands-off nature of the three factors. He also complains of the theory's inability to address "difficult rules," which Krashen and Terrell willingly concede (in Zafar, 2010, p.141). Zafar references Gregg's anecdotes (in Zafar, 2010, p.142) concerning comprehension occurring in learning devoid of acquisition. McLaughlin, Zafar continues, suggests that acquisition alone would lead to haphazard, meaningless production. He goes on to contend that children's supposed lack of the Monitor does not explain study results in which adult learners were able to attain an L2 through learning. Zafar notes that the Monitor lacks such empirical evidence.

Holes persist in the Natural Order Hypothesis as well. Zafar claims that the route to attaining a target language is unpredictable, a point he supports with McLaughlin's work (in Zafar, 2010, p.142). Krashen's implication of a natural order is further compromised by the fact that his claim is based on an already disproven English morpheme study, as found by Gass and Selinker, and by McLaughlin (Zafar, 2010, p.142). Another aspect lacking in this theory, Zafar points out, is the influence of learners' L1; he cites research by Wode and Zobl indicating that some learners will benefit from their L1 more than others, while the reverse is also true, with some learners' L1 detracting from their acquisition. Zafar sees this oversight as Krashen oversimplifying Second Language Acquisition research to narrow the scope of his own theory, ignoring individual L2 learning needs.

Again, Krashen’s emphasis of acquisition and a dearth of evidence opens his Input Hypo-

thesis to criticism. Zafar points to Gregg’s argument that acquisition develops out of extra-

linguistic information as leading to guesswork in comprehension. This may or may not lead

to acquisition, as some grammatical rules are likely to be, in order of magnitude, greater than

one level. Krashen (in Zafar, 2010, p.143) addresses this criticism by suggesting that teach-

ers should be able to observe learners competencies levels and provide necessary grammar

instruction with a “sufficient amount” of “comprehensible input.” Once again, Krashen’s fail-

ure to explain his own terminology provides no opportunities for measurement and leaves

only room for criticism.


In addressing the shortcomings of the Affective Filter Hypothesis, Zafar finds considerably less to dispute. He takes issue with Krashen's broad assertion that all children have a low affective filter, arguing that some children are likely to be affected by the same personal factors that trouble their adult counterparts (2010, p.144). He argues that this hypothesis does not explain how some adult learners are able to overcome these factors to attain native-like proficiency. Using Gregg's anecdote about a Chinese woman with near-native proficiency, he explains (2010, p.144) that Krashen's hypothesis is unable to explain why only some aspects of the L2 are absorbed, or how the filter impacts fossilisation and interlanguage.

An ongoing focus of Krashen's critics is that of semantics. His hypotheses were clearly made by a person speaking from experience. By putting studies behind the claims in the hypotheses, Krashen could silence many of his detractors; it seems inevitable that such studies would also help pare down those claims. Krashen's greatest transgression in publishing his hypotheses is that he continues to stand by them without offering concrete data as a buttress against his assailants. Then again, many of the criticisms levelled against his theories are themselves made with anecdotes. Krashen has merely made it easier for his critics to do so by not providing research-based evidence.

It would be extreme to dismiss the Natural Approach's founding theories as trite. Since this is merely a debate over theory, with very few applicable measures being applied on either side, the discussion remains ongoing. Additional study would allow for concrete conclusions about the validity of each hypothesis, or would determine a need to refine the hypotheses so as to narrow the scope and simplify the applicability of the framework. This would go a long way towards legitimising the Natural Approach and helping educators create methods based on it. Consider the similarities between the theories that underlie CLT and Krashen's hypotheses: unlike CLT, whose theories have accumulated over time from various sources, Krashen makes grandiose pronouncements about his hypotheses. By making himself the poster child of these hypotheses, Krashen has also made himself a lightning rod for his critics.

Cooperative Language Learning

Due to the decentralized role that the teacher plays as a facilitator in a CBE classroom, the group becomes the vehicle for learning. This is the pillar of CLL.


Research from Slavin and Baloche (in Richards and Rodgers, 2001, p.201) has found that, overall, there are benefits to CLL, with the caveat that more research needs to be conducted, specifically in L2 classes. Other criticisms that Richards and Rodgers raise (2001, p.201) include CLL's suitability for teaching learners of mixed proficiencies and the approach's limitations for beginners, who may not enjoy as many benefits as their counterparts. Another criticism that can be levelled at CLL, which comes from Johnson et al. (in Richards and Rodgers, 2001, p.199), is the demands placed on the teacher, who is expected to structure every aspect of the lesson, right down to the physical layout and groupings of the class. Details will be provided below, but in brief, technology eliminates many of these obstacles.

Richards and Rodgers repeatedly note that CLT, the Natural Approach, and CLL are malleable approaches; CBE, therefore, is simply an amalgamation of the three.

Testing Theory

In his book Practical Language Testing, Fulcher discusses two mandates for testing. The two mandates break down into the local mandate (also known as classroom assessment), "The key feature of testing within a local mandate is that the testing should be 'ecologically sensitive', serving the local needs of teachers and learners" (2010, p.2), and the external mandate (also referred to as standardised testing), "An external mandate, on the other hand, is a reason for testing that comes from outside the local context" (2010, p.2). The former is a tool that helps teachers determine learners' needs, and the latter is often used as a tool to inform educational policy. Clashes over the value of each mandate are persistent, but are unnecessary for the focus of this paper. This study also sets aside the debate, raised by Fulcher's discussion of Foucault (2010, p.8-11), over whether testing is the best method of fairly evaluating learners. Considering the types of testing laid out by Breaux (2013), this study is concerned with the aspects of motivation, criterion-referenced testing, validity, reliability, and dependability.

As mentioned above, testing provides great motivation for students. Additionally, testing may further learning, as Fulcher writes, by diagnosing "individual learning needs" (2010, p.68). Black et al., he writes, were able to prove this theory through a large-scale project to measure the effectiveness of formative assessment, which yielded "an effect size of 0.3" (2010, p.68). Tests are crafted using the measurement qualities of reliability and validity.


Bachman (1990, p.240) quotes Campbell and Fiske to explain that reliability measures use two similar methods to quantify a trait (i.e. the scores of similarly designed tests), while validity measures strive to challenge the agreement of different measures of the same trait (i.e. the results of a multiple-choice grammar test against the results of grammar scoring from an oral interview). It is important to note the distinction between the two because, as Bachman (1990, p.240-241) explains, correlations can be "...interpreted as estimators of reliability or as evidence supporting validity." These estimations are the product of "both professional judgment and empirical research" (Bachman, 1990, p.43). Essentially, validity and reliability are the best quantifiable measures with which to evaluate tests. Beyond validity and reliability, tests can be analyzed on numerous criteria, which exceed the focus of this study [2].

[2] For insight into all of the aspects of testing theory, see Bachman (1990).

Validity

Fulcher uses Ruch’s explanation from 1924 (2010, p.19) to simply state that a test’s validity

is measured by its effectiveness in evaluating skills, knowledge, or other such demonstrable

features. Fulcher generalizes five aspects of validity (2010, p.20), which are paraphrased

here:

• Consequential validity: it is the test developer's responsibility to create testing materials that generate results from which inferences "about the knowledge, skills and abilities of a test taker are justified" (he calls this relevance and usefulness), and to consider how the use of test results impacts the test taker.

• Structural aspect: tests must be structured and scored to measure the abilities or skills specific to the parameters of the test.

• Representative of content: test items should only cover the domain of study.

• Generalisability: scores should predict ability outside of a testing context.

• External aspect: how test scores relate to similar or different measures of skills and abilities.

These terms represent Fulcher's interpretation of validity. Hughes uses the more widely adopted term construct validity (2003, p.26-34), with broader definitions of its aspects:


• Content validity: the test's content represents a sample of the skills, abilities, or knowledge with which the test is concerned. The skills, abilities, or knowledge are specified at the outset of test construction to achieve content validity, and they inform the types of testing items that the test maker selects. This has the added benefit of determining the focus of the class. Increasing content validity increases the odds of accurate measurement.

• Criterion-related validity: determines how well a test's results measure up to other forms of assessment. Criterion-related validity can be established through concurrent validity and predictive validity.

• Concurrent validity: allows the test maker to measure a simplified test (simplified for practicality reasons) against a longer or full test (the criterion test) to find how well the simplified test results agree with the full test results. A correlation coefficient (or validity coefficient) is calculated to find the level of agreement between the tests (this is addressed below; a short computational sketch also follows this list). The acceptable level of agreement depends on the test's purpose, so standardised tests will demand a higher level of agreement than ecologically sensitive (or local) tests. Concurrent validity may also be reached by comparing the level of agreement with other forms of assessment, such as a teacher's assessments.

• Predictive validity: attempts to establish a test's capability to predict a test taker's future performance.

• Validity in scoring: scoring should measure the specific language elements intended. Test items with too broad a focus are inaccurate because they ignore the intended focus of the item. If an answer is unintelligible as a result of the student's production, then this is a flaw of the testing item.

• Face validity: test items that directly measure skills, abilities, or knowledge have face validity. Test takers and/or administrators may have trouble accepting the results of tests that indirectly measure skills, abilities, or knowledge. This can lead to performance issues, as test takers may not perform at full capacity. Test makers should be cautious when considering indirect testing techniques, and should deploy convincing rationales as to the validity of such methods.
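To make the idea of a validity coefficient concrete, the following is a minimal sketch in Python, not drawn from Hughes, of how the level of agreement between a simplified test and a full criterion test might be calculated as a Pearson correlation; the score lists are hypothetical.

    from math import sqrt

    def pearson_r(xs, ys):
        """Pearson correlation coefficient between two equal-length score lists."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
        sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
        return cov / (sd_x * sd_y)

    # Hypothetical scores for ten learners on a shortened test and the full criterion test.
    short_test = [12, 15, 9, 18, 14, 11, 16, 13, 10, 17]
    full_test = [48, 55, 40, 66, 52, 45, 60, 50, 42, 63]

    # The validity (correlation) coefficient: how closely the two sets of results agree.
    print(f"validity coefficient: {pearson_r(short_test, full_test):.2f}")

The same calculation, run against a teacher's independent assessments rather than a criterion test, would serve the second route to concurrent validity that Hughes mentions.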

Hughes lists other forms of evidence for validity. Construct, as in construct validity, he explains, "refers to any underlying ability (or trait) that is hypothesised in a theory of language ability" (2003, p.31).


Reliability

Just as the validity coefficient of a test must be calculated to establish its level of agreement, so too must a test's reliability coefficient be calculated. Hughes explains (2003, p.36-39) that reliability coefficients help in designing tests that are reliable regardless of testing conditions, test candidates, and scorers. Hughes cautions, though, "To be valid a test must provide consistently accurate measurements. It must therefore be reliable. A reliable test, however, may not be valid at all" (2003, p.50). Even accepting the limitations of reliability, Hughes explains that it is a necessary consideration and, in the same fashion as validity, may influence test construction more or less depending on the type of test. So standardized tests, which may be of greater consequence to a candidate, likely require a higher reliability coefficient than do local tests or other classroom assessments. The reliability coefficient is essentially a comparison of at least two identical tests' scores. According to Hughes (2003, p.40), splitting a test score in half to create two scores, usually by separating odd-numbered and even-numbered items, is the most efficient method (also known as the split-half method), because it eliminates the variables that arise from giving the same test twice (the re-test method) or from altering forms of the same test (the alternate-forms method). However, this fails to account for changes in testing conditions that might affect outcomes. To address these variables, Hughes recommends (2003, p.40-41) that a standard error of measurement be calculated to find the test taker's true score (which equates to his/her ability on the test). Hughes explains how to find the standard error of measurement: "The calculation of the standard error of measurement is based on the reliability coefficient and a measure of the spread of all the scores on the test...", so, "...the greater the reliability coefficient, the smaller will be the standard error of measurement" (2003, p.41). As he explains, this provides only a probable range for how the test taker should perform on future identical tests, a range that is the original score plus or minus one, two, or three times the standard error of measurement. This information, he suggests, may help inform decisions about how test results should be interpreted depending on the test's importance. So scorers of standardized tests used for academic purposes, or of tests for placement into courses of study, may evaluate test takers' scores based upon different ranges of the standard error of measurement.


In the case of criterion-referenced testing (CRT), a form of assessment typically applied to local tests, the standard error of measurement may be too blunt, according to Hughes (2003, p.42). CRT essentially allows the scorer to efficiently categorize test results by cut-off points, for example "the point that divides 'passes' from 'fails'" (Hughes, 2003, p.42), to discern whether the test takers have achieved a set or sets of criteria.
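As an illustration of the split-half procedure and the standard error of measurement described above, here is a minimal Python sketch; the item matrix is invented, and the Spearman-Brown step-up used to estimate full-test reliability from the half-test correlation is a standard correction assumed here rather than taken from Hughes.

    from math import sqrt
    from statistics import pstdev

    def pearson_r(xs, ys):
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        return cov / (sqrt(sum((x - mean_x) ** 2 for x in xs)) *
                      sqrt(sum((y - mean_y) ** 2 for y in ys)))

    # Hypothetical item-level results (1 = correct, 0 = incorrect) for six test takers.
    items = [
        [1, 1, 0, 1, 1, 0, 1, 1],
        [1, 0, 0, 1, 0, 0, 1, 0],
        [1, 1, 1, 1, 1, 1, 1, 1],
        [0, 1, 0, 0, 1, 0, 0, 1],
        [1, 1, 1, 0, 1, 1, 1, 0],
        [0, 0, 1, 0, 0, 1, 0, 0],
    ]

    odd_half = [sum(row[0::2]) for row in items]    # odd-numbered items
    even_half = [sum(row[1::2]) for row in items]   # even-numbered items
    totals = [sum(row) for row in items]

    half_r = pearson_r(odd_half, even_half)
    reliability = (2 * half_r) / (1 + half_r)       # Spearman-Brown correction
    sem = pstdev(totals) * sqrt(1 - reliability)    # standard error of measurement

    # A probable range for the first candidate's true score: observed score +/- 2 SEM.
    print(f"reliability: {reliability:.2f}, SEM: {sem:.2f}")
    print(f"candidate 1 true score likely between {totals[0] - 2 * sem:.1f} and {totals[0] + 2 * sem:.1f}")

Scorer reliability could be estimated in the same way by correlating two scorers' marks for the same set of performances, and a CRT cut-off point would simply be compared against a band like the one printed above.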

Along with reliability coefficients and the standard error of measurement, scorer reliability also impacts the overall reliability of a test, particularly on subjective test items like composition. Hughes explains, "If the scoring of a test is not reliable, then the test results cannot be reliable either" (2003, p.43). Hughes recommends using the same approach used for calculating test reliability to calculate scorer reliability; ideally, test reliability should be lower than scorer reliability. To prevent reliability discrepancies in scorers and tests, Hughes advocates that test makers follow a number of steps in test construction, test administration, and scoring, which can be found in his book Testing for Language Teachers (2003, p.44-50).

Fulcher notes that, "Classroom tasks frequently look very different from the kinds of items and tasks that appear in standardised tests. The main reason for this, as we have seen, is the requirement that standardised tests have many items in order to achieve reliability" (2010, p.70). He gathers that each item contributes to the picture of the test taker's ability, but since classroom assessment can be spread over longer durations and can take advantage of open-ended tasks, reliability is of lesser concern. "Dependability is the criterion-referenced correlate of reliability in standardized testing" (Fulcher, 2010, p.81).

Fulcher quotes (2010, p.76) Lantolf and Poehner's views on reliability, validity, and dynamic assessment: the former concepts belong to assessment instruments that assume performance and development arise from the individual, whereas the latter procedure is concerned with the 'social individual.'

Criterion / Non-criterion Referenced

The Gap

As noted earlier, Breaux argues that Korea's educational curriculum lacks a focus on speaking. Certainly the TOEFL results support his claim.


The issues that contribute to this include testing, a culture of traditional lecture-based teaching (specifically audiolingual and grammar-translation methodologies), and, obviously, a lack of speaking opportunities. Breaux claims to have developed CBE to meet the needs of these learners. Since CBE has not been published, there is a great lack of both awareness and research. Given this gap, the underlying question of this research is: does Gunther Breaux's conversational fluency-focused method for teaching Korean university students provide a suitable approach for teaching communicative English skills? Answering this question requires answering the following questions:

1. Is it possible to produce a valid and practical placement test that measures receptive and productive language skills?

2. Is Breaux's speaking test valid, reliable, and practical, and can it track results as he claims?

3. Is Breaux's claim that more speaking results in improved speaking true?

The Significance

Hence the purpose of this research is to determine if it is possible to produce a valid and practical placement test that measures receptive and productive language skills. The study aims to validate Gunther Breaux's Placement Test (PT) against the pre-validated Lextutor Tests (LT). LT is designed to establish language competency by testing vocabulary size and depth of knowledge. PT also attempts to measure these attributes using colloquial conversational questions, as well as measuring course learning objectives, like pronunciation and prepositional knowledge.

This study focuses on validating PT, which is one of several cruxes of Breaux's teaching methodology; the methodology itself will be explored in detail in a future study. The direct implication of this research is the ability to quickly assess any set of learners, providing the instructor with greater flexibility in managing classes. Presumably, a placement test could also be used as a diagnostic test. Although this study overlooks this possibility, Brown (2003, p.46-47) implies that the notion may be justified. Finally, exposing the knowledge gaps of a given group can assist the instructor in modifying his/her syllabus to optimise for the 'sweet spot of difficulty' (Hanford & Smith, 2013).


The Method

The Subjects

The study was conducted over a sixteen-week period in eight co-ed classes at Soonchunhyang University in Korea. Among the eighty-one male and thirty-six female students (117 in total), ages ranged from nineteen to twenty-seven, with the average being twenty-two. 112 identified themselves as being from Korea, three from China, and two failed to complete the field. Thirty-one were freshmen, seventy were sophomores, twelve were juniors, and four were seniors. Eleven were enrolled in the College of Humanities, fourteen in the College of Social Sciences, thirteen in the College of Business, twelve in the College of Natural Sciences, thirty-nine in the College of Engineering, twenty-two in the College of Medical Sciences, and seven in the College of Medicine. Students reported that, on average, they had had seven years of English education. Although some reported having had no English instruction, ninety-four reported having received instruction at the high school level, and none had left the field corresponding to that question blank, which indicates some confusion. The students were enrolled in English classes to meet a graduation requirement. Only one class, Native English 3, had students who had opted for a higher-level course. Although there was no placement testing, the Medical Native English 2 class only accepted students from the College of Medicine.

The Class Structure

Attempts were made to adhere as closely as possible to the methods and syllabus outlined by Breaux (2013), summarised in his presentation as follows:

1. First day: I know the English ability and speaking ability of my students. From that point on, all my energies are to improve ability, not determine ability.
2. First week: I change the students' mindset to communicative speaking. Communication gets rewarded, not grammar.
3. Weeks 2-6: Students speak a lot, specifically speed dating. One topic (Me), many partners. Communicate in class, learn to communicate at home.
4. Week 7: Midterm communicative speaking test.
5. Weeks 9-13: More speaking. The class prepares students for the test, and the test better equips students for the class.
6. Week 14: Final communicative speaking test. Week 16: Improvement is measured, proven and shown on the big screen. Accuracy improves from speaking with better speakers; personal feedback comes from the transcripts.


Beginning on the first day of classes, students were given Breaux's placement test and introduced to the 'speed dating' activity (see below). The second class mirrored the first, only with the Lextutor test (discussed below) being given. In the fourth class [3], students conducted the speed dating activity in pairs and in groups of three. In the last hour, students who had not completed Breaux's placement test or the Lextutor test were asked to stay and complete them, while the others were allowed to leave. In each of the classes leading up to the midterm speaking test, instruction was brief and consisted of the following topics:

• Speaking strategies: asking follow-up questions, using examples, using body language, and using partners or the instructor as resources.

• Task completion: how to use the questions, how to engage with multiple partners, and

how to go beyond the supplied questions.

• Motivation and justification of methodology: explaining the format of the speaking test, explaining how speaking is a skill, and explaining how to improve speaking skills.

• Miscellaneous: homework completion, syllabus and grading, and other aspects of

classroom administration.

Although Breaux did not specify content to be covered, he does stress (2013) that the amount of instruction time should decrease to allow students more time for production. Also, grammar instruction, he argues, is unnecessary since Korean learners will have received plenty of grammar instruction before arriving at university, so it seems reasonable to assume that what Long terms focus on formS (in Cook, 2008, p.39) should be avoided.

[3] The third class was canceled for Korea's annual harvest festival.


Even though Breaux did not specify the teacher's role, all of this leaves the teacher ample time to check students' homework and to move about the classroom as a resource for students, or as a facilitator, as Harel (in Richards and Rodgers, 2001, p.199) defines it.

After the midterm test, classes consisted of about forty minutes of warm-up speed dating activities, about fifteen minutes of activity instruction and setup, and forty-five minutes or more of activities. Breaux did not go into detail as to the kinds of activities that are relevant, but he does suggest that variety in activities can be used to drive repetition of content, which in practice proved paradoxical. After half a term of repeating the same activities, it became apparent that greater variety in both content and activities was needed to motivate the students and keep them interested, especially in a two-hour classroom setting. In addition to these curriculum alterations, written midterm and final tests based on the provided texts were given as per the university's requirements (only in the Medical Native English 2 class were Breaux's textbooks used).

Class Materials

As part of his presentation of CBE, Breaux proposed using a set of classroom materials to assist in classroom management and activities. To prepare students for the speaking tests, class time is devoted to semi-structured conversation. Beginning from the first class, students engaged in speed dating, in which pairs conversed using the provided questions as conversational fodder, switching partners every five minutes. This continued for the first two classes, with times increasing to ten minutes in the second class (Breaux structures his classes around two one-and-a-half-hour sessions). The questions are personal and topical without being invasive, to avoid awkwardness, and are unordered, allowing students to skip questions. In the fourth class, students were grouped into threes and made to speak for twenty minutes. These activities filled the remaining classes leading up to the midterm speaking test (class seven).

It is vague what Breaux's "increasing variety" implies, that is, whether to vary group arrangements, content, and times, or to provide something more akin to the Natural Approach. Breaux has published several textbooks designed to facilitate activities, which he promoted in his presentation, but he did not completely explain this aspect. Breaux's Jazz English textbooks are meant to provide activities that aim to get students to use descriptors, as well as photocopiable board games that recycle the questions from the speed dating activities. As previously mentioned, however, students' interest in rehashing the same questions appeared to be waning, as Figure 1 suggests.


For the remainder of the term, spoken-word activities, like parlour games and drinking games (sans the drinking), were employed to motivate and stimulate the students, to reemphasise the speaking strategies (which included asking follow-up questions and giving reasons), and to introduce new language structures.

Testing Materials

Research Materials

The Frame of Reference

The Research Limits

The Research Question

The Theoretical Basis

The twenty-minute speaking tests require students to conduct an impromptu conversation based on a single question in groups of three. Several factors determine a student's grade (ability): Breaux assigns values for "basic speaking ability," "total vocabulary," "total words spoken," and the "number of topics covered" to determine overall speaking ability.


class the following week. These dictations are meant to provide self analysis, so each student

can examine their performance. Part of the examination process requires students to count

their total number of words spoken, find their five most common mistakes and fix them. Stu-

dents are also asked to list vocabulary they have used. Breaux suggests that students may

make mistakes in speaking that they would not make on a writing test, and that by forcing

students to observe their mistakes, they are being made conscious of them. Breaux offers fur-

ther input, by arranging meetings with the speaking test groups after the dictations have been

completed. These meetings are an opportunity for the teacher to provide feedback to the stu-

dents and offer them a chance to ask for clarifications, possibly, although Breaux makes no

mention of this aspect, to manufacture Long’s FonFs (in Cook 2008, p.40). The test is re-

peated for the final test, only with different groups. Groups are determined by the results of

the midterm speaking test, matching students by score.
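Breaux's presentation names the four values above but not the weights or scales he attaches to them, so the following Python sketch is purely illustrative: the field names mirror his terms, while the weights and caps are hypothetical placeholders rather than his actual rubric.

    from dataclasses import dataclass

    @dataclass
    class SpeakingTestResult:
        basic_speaking_ability: float   # e.g. a 0-10 holistic rating
        total_vocabulary: int           # distinct words in the transcript
        total_words_spoken: int         # word count from the student's dictation
        topics_covered: int             # number of topics touched on

    def overall_speaking_score(r: SpeakingTestResult) -> float:
        """Combine the four measures into one score using made-up weights and caps."""
        return (r.basic_speaking_ability * 4.0
                + min(r.total_vocabulary, 150) / 150 * 30
                + min(r.total_words_spoken, 600) / 600 * 20
                + min(r.topics_covered, 5) * 2.0)

    midterm = SpeakingTestResult(basic_speaking_ability=6.5, total_vocabulary=98,
                                 total_words_spoken=412, topics_covered=4)
    print(f"overall speaking score: {overall_speaking_score(midterm):.1f}")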

In determining the midterm speaking test groups, Breaux relies on a placement test that he gives at the beginning of the course. It should be noted that this placement test is more of an amalgamation of the proficiency and placement tests that Bachman (1990, p.44-46) explains are meant to test overall ability and establish students' levels, respectively. Breaux's placement test attempts to promptly measure grammar and vocabulary knowledge and productive and receptive skills. His thirty-minute multiple-choice test comprises questions based on listening skills, preposition usage, and conversational listening accuracy.
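Breaux does not spell out how "matching students by score" is carried out, so the sketch below shows one hypothetical reading, used for illustration only: rank the students by their placement (or midterm) scores and take consecutive groups of three, so that each student converses with similarly scored partners. The student identifiers and scores are invented.

    def group_by_score(scores: dict[str, float], group_size: int = 3) -> list[list[str]]:
        """Return groups of similarly scored students; the last group may be smaller."""
        ranked = sorted(scores, key=scores.get, reverse=True)
        return [ranked[i:i + group_size] for i in range(0, len(ranked), group_size)]

    # Hypothetical placement scores out of 30 for seven students.
    placement_scores = {"S01": 24, "S02": 18, "S03": 27, "S04": 15,
                        "S05": 21, "S06": 19, "S07": 26}

    for group in group_by_score(placement_scores):
        print(group)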


Bibliography

Bachman, L. (1990) Fundamental Considerations in Language Testing. Oxford: Oxford University Press.

Breaux, G. (2013) Holy Grail: A Classroom Test That Both Measures and Improves Speaking Ability. [Presentation to Soonchunhyang University Foreign Language Staff]. 17 April.

Cook, V. (2008) Second Language Learning and Language Teaching. 4th edn. London: Hodder Education.

Fulcher, G. (2000) 'The "communicative" legacy in language testing', System, 28, (4), pp. 483-497.

Fulcher, G. (2010) Practical Language Testing. London: Hodder Education.

Hughes, A. (2003) Testing for Language Teachers. 2nd edn. Cambridge: Cambridge University Press.

Johnson, H. and Johnson, K. (1999) Encyclopedic Dictionary of Applied Linguistics: A Handbook for Language Teaching. Oxford: Blackwell Publishers Ltd.

Kwon, M. (2010) 'Koreans Poor in TOEFL Speaking', The Korea Times, 2 April [Online]. Available at: http://www.koreatimes.co.kr/www/news/nation/2010/04/117_63548.html (Accessed 27 December 2013).

Kwon, O. (2000) 'Korea's English Education Policy Changes in the 1990s: Innovations to Gear the Nation for the 21st Century', English Teaching, 55, (1), pp. 47-91.

Li, D. (1998) '"It's Always More Difficult Than You Plan and Imagine": Teachers' Perceived Difficulties in Introducing the Communicative Approach in South Korea', TESOL Quarterly, 32, (4), pp. 677-703.

Liu, D. (1998) 'Ethnocentrism in TESOL: Teacher education and the neglected needs of international TESOL students', ELT Journal, 52, (1), pp. 3-10.

Richards, J. C. and Rodgers, T. S. (2001) Approaches and Methods in Language Teaching. 2nd edn. Cambridge: Cambridge University Press.

Zafar, M. (2010) 'Monitoring the "Monitor": A Critique of Krashen's Five Hypotheses', The Dhaka University Journal of Linguistics, 2, (4), pp. 139-146.