ASSESSMENT OF DIRECTED WRITING BY A GROUP OF TESL STUDENTS IN UPSI
NORMAH BINTI OTHMAN
FACULTY OF LANGUAGES
UPSI RESEARCH CODE: 01-03-41-02
UPSI RESEARCH ACCOUNT: 050554
2003
TABLE OF CONTENT

No. Description Page
Table of content 2
List of figures 3
List of tables 4
Abstract 5
Abstrak 6
1.1 Introduction 7
1.2 Statement of problem 10
1.3 Objectives 12
1.4 Research questions 14
1.5 Literature review 14
1.6 Research design 23
1.7 Instrumentation 25
1.8 Data collection 25
1.9 Data analysis 26
1.10 Result 26
1.10.1 The scores obtained from the assessment of directed writing 26
1.10.2 Salient features of directed writing assessment 44
1.11 Discussion and recommendation 52
1.12 References 53
1.13 Appendix 57
LIST OF FIGURES
No. Description Page
1.1 Sommer's and Cohen's criteria for assessment of writing 18
1.2 Categories for evaluating writing by Brown (2001) 20
LIST OF TABLES
No. Description Page
1.1A The scores given by twenty-eight TESL students and one expert rater to directed writing DW51 to DW75 27
1.1B The scores given by twenty-eight TESL students and one expert rater to directed writing DW51 to DW75 (contd) 28
1.1C The scores given by twenty-eight TESL students and one expert rater to directed writing DW51 to DW75 (contd) 29
1.2A Categories of scores by Teacher A and Expert Rater 1 30
1.2B Categories of scores by twelve raters using analytic scoring method 32
1.2C Categories of scores by seven raters using primary trait scoring method 33
1.2D Categories of scores by eight raters using holistic scoring method 35
1.3 Descriptive statistics of scores given by twenty-eight TESL students and one expert rater to twenty-five directed writing samples (DW51 to DW75) 39
1.4 Non-parametric correlation coefficient (Spearman rho) of scores obtained from twenty-eight TESL students and one expert rater 41
1.5 Parametric correlation coefficient (Pearson r) of scores obtained from twenty-eight TESL students and one expert rater 43
Abstract
Twenty-eight TESL students in UPSI and one experienced English SPM Examination examiner (Expert Rater 1) assessed twenty-five samples of Form Four ESL secondary school students' writing. Twelve of the TESL students used the analytic scoring method; seven used the primary trait scoring method; and eight used the holistic scoring method. The expert rater and one of the TESL students (Teacher A) used the English SPM Examination scoring method. Teacher A and Expert Rater 1 assessed the writing samples individually, while the other twenty-seven TESL students were gathered during three separate seminars and workshops. The scores that the twenty-eight TESL students (including Teacher A) and Expert Rater 1 gave to the writing samples were analysed with descriptive statistics and correlated using parametric (Pearson r) and non-parametric (Spearman rho) calculations. The analysis showed a significant correlation among the scores obtained from all the subjects, even though they used three different scoring methods to assess the writing samples. The TESL students who took part in the assessment during the seminars and workshops agreed that the three scoring methods were suitable for classroom assessment, as compared to the English SPM Examination scoring method, which was more suitable for standardized assessment. A few strengths and weaknesses of each scoring method were identified, and solutions for use in classroom assessment were recorded in the salient features of assessment.
Abstrak
Dua puluh lapan pelajar TESL dari UPSI dan seorang pemeriksa kertas peperiksaan Bahasa Inggeris SPM yang berpengalaman (Expert Rater 1) telah memeriksa dua puluh lima sampel penulisan pelajar Tingkatan Empat dari sekolah menengah. Dua belas orang daripada pelajar tersebut telah menggunakan skema permarkahan analytic; tujuh orang menggunakan skema permarkahan primary trait; dan lapan orang menggunakan skema permarkahan holistic. Expert Rater 1 dan seorang daripada pelajar tersebut (Teacher A) telah menggunakan skema permarkahan Bahasa Inggeris SPM. Teacher A dan Expert Rater 1 memeriksa sampel penulisan secara individu, tetapi dua puluh tujuh pelajar TESL yang lain telah dikumpulkan dalam tiga seminar dan bengkel yang berlainan. Skor yang diberi oleh dua puluh lapan pelajar TESL (termasuk Teacher A) dan Expert Rater 1 telah dikorelasikan menggunakan statistik deskriptif serta pengiraan parametrik (Pearson r) dan bukan parametrik (Spearman rho). Analisis skor ini menunjukkan terdapat korelasi yang signifikan, walaupun subjek telah menggunakan skema permarkahan yang berbeza untuk memeriksa sampel penulisan pelajar sekolah. Pelajar TESL yang mengambil bahagian dalam seminar dan bengkel bersetuju bahawa ketiga-tiga skema permarkahan sesuai digunakan untuk penilaian dalam kelas jika dibandingkan dengan skema permarkahan Bahasa Inggeris SPM yang lebih sesuai untuk penilaian piawai. Beberapa kekuatan dan kelemahan setiap skema permarkahan telah dikenal pasti dan resolusi untuk penggunaan dalam kelas telah direkodkan oleh pelajar TESL semasa seminar dan bengkel.
1.1 INTRODUCTION
TESL teachers' assessment of ESL students' writing plays an important role in the
process of teaching. It is also important that their assessment provides confidence and
motivation for students to excel in their school-based, district-based, state-based and
national-based ESL examinations in Malaysian secondary schools. The TESL teachers in
schools normally handle the school-based ESL examinations, and a selected group of
TESL teachers who work together in a committee handles the district-based and state-based
ESL examinations. The Malaysian Examination Board (Lembaga Peperiksaan
Malaysia) handles the national-based ESL examination at the Sijil Pelajaran Malaysia
(SPM) level.
The function of the school-based, district-based and state-based examinations in
Malaysian secondary schools is to give some insight into the progress of students'
learning and achievement while still in school, whereas the national-based examinations
give final grades that determine the students' future undertakings in their studies.
Nevertheless, all levels of examinations are important to the students' learning process.
As Rabinowitz (2001) says, statewide or nationwide tests cannot yield the detailed
information necessary to target instruction for individual students. This leaves a clear
and essential role for local assessments or school-based assessments to develop
diagnostic information about what students do well, where they are having difficulty and
how the instructional program might be adjusted to address their specific needs.
For school-based, district-based and state-based examinations, teachers normally adopt
certain sets of scoring methods to assess their students' writing, depending on the
objectives of the assessment. There are many types of scoring methods available for
teachers to refer to when assessing their students' writing tasks. Each scoring method
differs from the others in the criteria it uses to look at the students' writing product.
For example, one scoring method looks at a student's writing product generally and
does not analyze the student's grammar performance in detail, whereas another scoring
method examines grammar performance closely. No matter what criteria each scoring
method has, the ultimate aim is the same, that is, to grade students' writing.
It is important to make sure that the interpretation of students' writing performance,
regardless of which scoring method is used, helps the students in their process of
learning. It was discovered that a few scoring schemes in several countries and school
systems employ similar aspects. This means that these school systems are interested in
evaluating the same elements in students' writing. That is why the kinds of writing
offered to students in almost all schools are more or less in the same form and serve the
same function (Takala, 1988).
Some of the kinds of writing that teachers normally give to their students in schools all
over the world are essays, summaries, note taking, letter writing, paraphrasing, report
writing and directed writing. Teachers normally use these kinds of writing to evaluate
students' language performance.
"Written language has always played a dominant role in formal education. Typically the
acquisition of literacy is considered to be one of the most important tasks of the school,
not only as a vehicle of learning, but as a means of achieving other goals as well"
(Takala, 1988). The written examination in the SPM examination is a very important
type of assessment that determines the students' performance in language use. The result
of the assessment is important because a good grade in the English examination is a
prerequisite for students to further their studies in certain fields such as medicine,
pure science and mathematics, especially at a foreign institution.
The International Association for the Evaluation of Educational Achievement (IEA) has
carried out several studies on writing tasks and scoring scales. The IEA, which was
founded in 1959, has conducted much research comparing the educational performance
of school students in various countries and systems of education around the world
(Gorman, 1988: vii). The IEA's study of written composition began in 1980 and the
findings were published in several volumes. The writing tasks studied were pragmatic
writing, letter writing, summary writing, descriptive writing, narrative writing, open
writing, argumentative/persuasive writing and reflective writing. There are also studies
that investigate the effectiveness of some scoring methods used to assess students'
writing tasks.
1.2 Statement of Problem

TESL teachers who conduct the school-based, district-based and state-based assessments
of writing provide immediate feedback for the students, and thus enable the students to
progress in their learning process and in their preparation for the national-based
examination. The national-based assessment of writing provides grades that determine the
students' future undertakings in their further studies. Expert raters conduct the
national-based assessment, and the Malaysian Examination Board (Lembaga Peperiksaan
Malaysia) trains the expert raters to assess the national examination. It is important,
then, for TESL teachers to balance their school-based, district-based and state-based
assessments with the national-based assessment of writing for the sake of students'
learning progress and school service improvement.
The issue of balancing school-based assessment and national-based assessment does
not apply only to Malaysian schools. In America, for instance, the issue of balancing
state and local assessment has been raised by school administrators. State assessment in
America refers to their national-level assessment, and local assessment is their
school-based or district-based assessment. Rabinowitz (2001), in his article about
balancing state and local assessments in American schools, finds that local assessment
programs are still relevant because effective local assessment is essential to improved
student learning, and that locally developed and administered assessment programs have
a unique capacity to provide diagnostic information that, when understood and used
effectively, has an immediate impact on classroom practice.
Another researcher who shares the same view with Rabinowitz about balancing state and
local assessment in America is Stiggins (2002), who believes that the current assessment
systems in American education are harming huge numbers of students and that harm
arises directly from the failure to balance the use of standardized tests and classroom
assessments in the service of school improvement. He also believes that student
achievement suffers because the once-a-year tests are incapable of providing teachers
with the moment-to-moment and day-to-day information about student achievement that
they need to make crucial instructional decisions. Stiggins (2002) suggests that teachers
rely on classroom assessment to make crucial instructional decisions. However the
problem is that teachers are unable to gather or effectively use dependable information on
student achievement each day because of the drain of resources for excessive
standardized testing. There are no resources left to train teachers to create and conduct
appropriate classroom assessments. For the same reason, Stiggins (2002) states that
district and building administrators have not been trained to build assessment systems
that balance standardized tests and classroom assessments. As a result of these chronic
and long-standing problems, classroom, building, district, state and national assessment
systems remain in constant crisis, and students suffer the consequences.
The issue of balancing local and national assessment involves students' performance
in writing, because locally and nationally standardized examinations normally require
students to write continuously within a given period. For example, the PMR, the SPM, the
STPM and the MUET nationally standardized examinations in Malaysia involve writing
components. Writing is thus commonly used to assess students' language skills and their
learning in many academic content areas, so the need to provide students with fair and
supportable assessment approaches is very important because many decisions rest on
writing assessment. It is imperative that decision makers, national examiners, national
raters and schoolteachers who provide language performance reports based on the
assessment of students' writing give a fair report that really depicts the students'
actual performance, because the reports given determine the students' future undertakings
and even future careers. So it is necessary to study the need for validated school-based
assessment methods that TESL teachers can make use of to balance school-based and
national-based assessment.
1.3 Objectives
Teachers assign writing tasks for different instructional purposes: to have learners imitate
some model of writing; to train learners in the use and manipulation of linguistic and
rhetorical forms; to reinforce material that students have already learned; to improve
learners' writing fluency; to encourage authentic communication whereby the writer
really wants to impart the information and the reader is genuinely interested in receiving
it; and to learn how to integrate all the purposes mentioned above, with the emphasis on
improving the whole performance, not just one of its aspects (Raimes, 1987 as quoted by
Cohen, 1994). Taking into consideration all these purposes as to why teachers assign
writing tasks to their students, the present study developed two main objectives that cover
TESL teachers' role in assessing their students' writing tasks; and the effectiveness of the
holistic scoring method, the analytic scoring method and the primary trait scoring method
to assess directed writing.
The two main objectives are:
1. To investigate TESL teachers' assessment of directed writing product written by
Form Four ESL secondary students. This writing product was given in Paper Two
of the English SPM Examination in Malaysia.
2. To establish the construct validity of the holistic scoring method, analytic scoring
method and primary trait scoring method to assess directed writing in ESL
classrooms in Malaysia.
The specific objectives of this study were:
1. To record and analyze the salient features of assessment verbalized by the TESL
teachers as they assessed directed writing.
2. To design and establish the holistic, analytic and primary trait scoring methods to
assess directed writing.
3. To test the validity and reliability of the holistic, analytic and primary trait scoring
methods designed to assess the directed writing.
4. To identify the strengths and weaknesses of the holistic scoring method, analytic
scoring method and primary trait scoring method used in this study to assess
directed writing.
5. To establish the concurrent validity of the scores obtained from the TESL teachers
after assessing directed writing using the holistic scoring method, analytic scoring
method and primary trait scoring method, as compared to the scores given by the
expert raters who were trained by the Malaysian Examination Board (Lembaga
Peperiksaan Malaysia).
1.4 Research Questions
1. How did TESL teachers assess directed writing, and what were the salient
features of assessment that they verbalized as they reacted to the writing products
using the holistic scoring method, the analytic scoring method and the primary
trait scoring method?
2. To what extent were the holistic scoring method, the analytic scoring method and
the primary trait scoring method valid and reliable for assessment of directed
writing?
3. What was the relationship between the scores given by TESL teachers using the
holistic scoring method, the analytic scoring method and the primary trait scoring
method with the scores given by the expert rater, using the SPM examination
scoring method?
1.5 Literature Review
In Malaysia, three major national examinations are given to secondary school students.
These examinations are standardized nationally in terms of the questions given and the
assessment of the examinations. The examinations are known as the Penilaian Menengah
Rendah (PMR) for Form Three students (aged fifteen to sixteen years old); the Sijil
Pelajaran Malaysia (SPM) for Form Five students (aged seventeen to eighteen years
old); and the Sijil Tinggi Pelajaran Malaysia (STPM) for Form Six students (aged
nineteen to twenty years old). These examinations, which take the form of written tests,
oral tests and practical tests, serve to evaluate the school students' performance in
several subjects. The result of the examinations determines the students' chances for
further studies. Students are graded according to their performance in the examination.
For example, students who get good grades in the science subjects will be placed in the
science stream classes, and students who get good grades in the arts subjects will be
placed in the arts stream.
The English Language is offered as a subject in all the national examinations at
secondary school level. At the PMR and SPM levels, English is compulsory for all
students. At the STPM level, the subject is offered as an elective in which literature
elements are also included. However, since 1999, a new subject known as the MUET
(Malaysian University English Test) has been offered as a compulsory subject to all
Form Six students. In this subject, four skills are tested in separate examinations:
listening skills in Paper One, speaking skills in Paper Two, reading comprehension
skills in Paper Three, and writing skills in Paper Four. Starting from the year 2001,
students who wish to continue their studies at the tertiary level have to take the MUET
as a prerequisite.
English as a Second Language (ESL) is a compulsory subject in the SPM examination. It
is crucial for ESL students to do well in the subject because of the importance of the
language. There are three major components tested in the SPM English Examination:
Oral English, Paper One and Paper Two. In Paper Two of the SPM examination, students
are tested on three types of writing: directed writing, summary writing and essay
writing. The three types of writing tasks given in Paper Two of the SPM examination are
very important for secondary school ESL students in Malaysia. It is important for the
students to do well in the writing tasks, and it is equally important for TESL teachers
to assess their students' writing well enough to ensure that the grades given really
depict the students' actual performance in writing.
Teachers' assessment of students' writing can greatly influence students' attitudes
toward future learning, because students can be easily confused by unclear, vague or
ambiguous responses and can become frustrated with their writing progress and their
hopes for the results of their examination. Alternatively, students can be positively
motivated if the assessment given to their written work reflects their actual performance
in the national-level examination. Unfortunately, there is no clear set of universal
guidelines that will guarantee such a supportive and positive experience for all
students. In a given context for writing instruction, students will differ, and tasks,
topics, and responses will differ (Grabe and Kaplan, 1996: 377).
Schoolteachers might be using different ways and methods to assess their students'
writing tasks, depending on the school authorities' instructions to them. Students will
not be able to predict their actual performance in the national examination if the
performance assessment method adopted by their ESL teachers is not the same as the
national raters' performance assessment method. Normally, TESL teachers in secondary
schools invite the English SPM national raters, who are considered the expert raters, to
come to their schools to conduct seminars and workshops for their students before they
sit for the national examination. This is to ensure that their students get some exposure
to the national raters' expectations when their writing products are assessed in the
national-level examination. The national raters are trained by the Malaysian Examination
Board (Lembaga Peperiksaan Malaysia) in how to assess the English SPM examination
according to its standards. The Malaysian Examination Board has its own scoring method
that national raters, or the expert raters, have to refer to when assessing the English
SPM examination papers.
Each scoring method used to assess students' writing tasks has its own unique way of
looking into the details of the writing pieces. However, the ultimate aim is the same,
that is, to give grades to the writing pieces. The problem is whether the scores given
by one examiner or rater differ from the scores given by another examiner or rater who
uses a different scoring method, if both of them have been assigned to rate the same
writing task. If we look at the elements suggested by two different researchers, we will
see the differences in focus. For example, Sommer (1989) and Cohen (1994) have
suggested some guidelines about the elements to look at when assessing students' writing
(see Figure 1.1). There are some similarities and differences in their suggestions.
Figure 1.1 Sommer's and Cohen's criteria for assessment of writing

Sommer (1989):
1. Can the students write in grammatical sentences?
2. How well do students manage the basic mechanical elements of writing (such as paragraph indentation, punctuation, and spelling)?
3. How well do they organize information? Can students find a starting place to write? Can they formulate opinions or arguments in writing? Do they stay on the assigned topic? Can they describe a process? How well do they understand the formats in which they must write?
4. How well do they understand their audience(s)?
5. Are any students experiencing dialect or second-language interference?

Cohen (1994):
1. Appropriateness of language conventions - grammar, spelling, punctuation; accuracy of meaning - selection and use of vocabulary
2. Organization - sense of pattern for the development of ideas; rhetorical structure - clarity and unity of the thesis
3. Style - sense of control and grace
4. Reader's acceptance - efforts made in the text to solicit the reader's agreement, if so desired; reader's understanding - inclusion of sufficient information to allow meaning to be conveyed
5. Register - appropriateness of level of formality; content - depth and breadth of coverage; economy - efficiency of language use
The first four rows of Sommer's and Cohen's criteria show some similarities in the
elements that both look into when assessing students' writing pieces. For example, in
the first row of both Sommer's and Cohen's criteria, we can see that grammar is taken
into consideration. However, Cohen's description is more detailed because he has
included vocabulary alongside grammar.

Row five of Sommer's criteria does not appear in Cohen's. Sommer looks into whether
there is any interference from dialect or second language, and Cohen does not take that
element into consideration. Instead, Cohen takes into consideration the register of the
writing piece, which Sommer does not.
In a more recent study, Brown (2001: 357) came up with six categories that raters or
teachers should consider when assessing students' writing, adapted from J.D. Brown
(1991). Figure 1.2 shows the categories that he listed for assessing students' writing,
in order of importance.
Figure 1.2 Categories for evaluating writing by Brown (2001)

Content
• Thesis statement
• Related ideas
• Development of ideas through personal experience, illustration, facts, opinions
• Use of description, cause/effect, comparison/contrast
• Consistent focus

Organization
• Effectiveness of introduction
• Logical sequence of ideas
• Conclusion
• Appropriate length

Discourse
• Topic sentences
• Paragraph unity
• Transition
• Discourse markers
• Cohesion
• Rhetorical conventions
• Reference
• Fluency
• Economy
• Variation

Syntax

Vocabulary

Mechanics
• Spelling
• Punctuation
• Citation of references (if applicable)
• Neatness and appearance
As we can see, there are some differences in the categories given by Sommer (1989),
Cohen (1994) and Brown (2001). If one group of TESL teachers makes use of one of these
suggestions, and another group makes use of another, there may or may not be some
differences in their scoring, because some of the elements stated in one scoring method
do not exist in another.
There are also similarities and differences among the scoring methods available for
examiners and raters to refer to when assessing students' writing. The similarities will
not cause problems for examiners or raters who give grades using different scoring
methods; the problem lies in the differences, because they might cause differences in
the scores given to the students' writing. If we look into two scoring methods and
compare the elements that each looks into, we will see some differences in their focus.
For example, the holistic scoring method yields one single integrated score of writing
behavior. It is interested in responding to the writing as a whole, and respondents are
unlikely to be penalized for poor performance on one lesser aspect, for example
grammatical ability (Cohen, 1994: 314). On the other hand, the primary trait scoring
method narrows its focus to a specific aspect of the writing piece. So if two TESL
teachers assess the same writing piece, one with the holistic scoring method and the
other with the primary trait scoring method, their focus while assessing the writing
piece is definitely not the same. The problem is whether the scores that they give to
the same writing piece differ or not.
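Whether two raters' marks for the same scripts agree can be checked with the parametric (Pearson r) and non-parametric (Spearman rho) correlations used in this study. The following is a minimal sketch; the rater names and marks in it are hypothetical illustrations, not data from this study:

```python
# Pearson r and Spearman rho between two raters' scores, from first principles.

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def average_ranks(x):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # mean of positions i..j
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rho is simply Pearson r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical marks (out of 30) that two raters might give to ten scripts.
rater_a = [22, 18, 25, 10, 15, 20, 27, 12, 19, 23]
rater_b = [20, 17, 26, 12, 14, 21, 25, 11, 18, 22]

print("Pearson r:   ", round(pearson_r(rater_a, rater_b), 3))
print("Spearman rho:", round(spearman_rho(rater_a, rater_b), 3))
```

A correlation near 1 would suggest that the two raters rank the scripts similarly even if their raw marks differ; a statistics package would additionally report significance levels, as in Tables 1.4 and 1.5.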
Cohen (1994: 312) states that writers and teachers or raters differ in many aspects
related to the assessment of writing. He quotes Ruth and Murphy as saying that:

1. Writers will differ in their notions about the significance of particular features of
the topic.

2. Students and their teachers (raters) differ in their recognition and interpretation of
salient points in a writing topic (with teachers having a wealth of professional
experience in the evaluation of writing, while students have only their own experience
as test takers).

3. Student writers may construct different writing tasks for themselves at different
stages in their development.

Ruth and Murphy's findings, as quoted by Cohen (1994) above, tell us that it is widely
accepted that writers and their raters differ in some way or other. Even if two raters
are given the same scoring method to assess the same writing piece, there are bound to
be differences in their judgment.
There are a few studies that look into the assessment of writing performance and score
relationships. For example, Johnson, Penny and Gordon (2001) studied score resolution
and the inter-rater reliability of holistic scores in rating essays; Hayes (2000) studied
the consistency of student performance on holistically scored writing assignments; and
Swartz, Hooper, Montgomery, Wakely et al. (1999) used generalizability theory to
estimate the reliability of writing scores derived from holistic and analytic scoring
methods. Despite the many studies related to writing assessment and scoring
relationships, Crehan and Hudson (2001) state that unresolved concerns remain for the
more basic issues of objective and reliable scoring of performance assessments,
especially for writing products. They conducted a study comparing two scoring strategies
for performance assessments.
1.6 Research Design
This research involves a case study with twenty-eight TESL students from Universiti
Pendidikan Sultan Idris and one experienced TESL teacher. One of the TESL students
was involved in teaching a class of Form Four ESL students, from which the directed
writing samples were taken as instruments for this research. Three separate seminars and
workshops were conducted to gather the other twenty-seven TESL students, during which
they were required to assess the directed writing samples. While they assessed the
directed writing samples, their salient features of assessment were recorded. The
experienced TESL teacher was an expert rater who had twelve years' experience in
assessing the English SPM Examination papers.
During the seminars and the workshops the TESL students were trained to assess the
directed writing samples using the three scoring methods devised for this research.
McNamara (2000:44) believed that initial and on-going rater training was an important
way to improve the quality of rater-mediated assessment schemes. The training normally
took the form of a moderation meeting, and this moderation meeting had the function of