
1.0 INTRODUCTION

1.1 Purpose

To investigate the effectiveness of a set of 30 multiple-choice questions on the English for Science and Technology (EST) subject in an upper secondary class at Sekolah Menengah Kebangsaan Gajah Berang.

1.2 Objectives

1.2.1 To plan and develop a set of 30 multiple-choice questions on English for Science and Technology based on the syllabus, content (topics and sub-topics), instructional objectives, and Table of Specifications.

1.2.2 To assemble the 30 multiple-choice questions.

1.2.3 To administer the 30 multiple-choice questions.

1.2.4 To score and grade the performance of the students on the 30 multiple-choice questions.

1.2.5 To present the performance of the students using descriptive statistics such as measures of central tendency (mean, median, mode), measures of dispersion/variability (range, variance, standard deviation), and Z-scores and T-scores.

1.2.6 To analyze the 30 multiple-choice questions using item analysis (item difficulty and item discrimination) and distracter analysis.

1.2.7 To discuss the results of the descriptive statistics, item analysis and

distracter analysis.

1.2.8 To provide a conclusion for the project.

2.0 METHODOLOGY

2.1 Subjects

For this study, we assessed 20 Form 5 students from Sek. Men. Keb. Gajah Berang. The questions are on English for Science and Technology (EST). The sample students, provided by the teacher, represent Form 5 Science 1, 5 Science 2, 5 Science 3 and 5 Science 4. Although the classes may appear to be segregated by academic ability, according to the teacher, Madam Fang, the students form a homogeneous group: as science stream students, they possess similar abilities. The teacher picked the students from each science stream class at random. Five representatives were chosen from each of the four classes so that we could assess overall student performance regardless of academic ability and spread our sample across the classes for greater accuracy. Of the 20 students, four are female and sixteen are male. According to the school’s principal and the Head Teacher of English Language, the students come from middle-class families, with a few from upper-middle-class families, so financial problems are not a major concern for most of them. As for the time allocated to the EST subject, Madam De Kwee Poh (the EST teacher for 5 Science 1) said that 3 hours per week are allocated to each class.

2.2 Materials

In developing the questions, we surveyed the bookshops and picked several exercise books that follow the new Form 4 and Form 5 KBSM syllabuses. After discussing which sample questions were best and related most closely to the new KBSM syllabuses as well as Bloom’s Taxonomy of Educational Objectives, we decided to use a Form 4 EST exercise book published by Oxford Fajar. The test consists of 30 multiple-choice questions that vary in syllabus topic and in level of Bloom’s Taxonomy of Educational Objectives, and it serves as a mid-year assessment for the students. According to our table of specification, six questions fall under Knowledge, twenty-two under Comprehension, and one each under Analysis and Application. According to the syllabus, eleven topics are included in the test: Treasures of Nature, Energy Comes & Goes, It’s All Chemistry, Force & Motion, Tiny Beings Great Terrors, It’s All In The Genes, Meddling With Nature, Food and Thoughts, The World at Your Fingertips (ICT), The Frontier of Space, and Reading New Horizons. EST does not have a separate syllabus for Form 4 and Form 5. According to the curriculum specification, both forms share the same syllabus and curriculum specifications, and the students use the same textbook in Form 5 that they used in Form 4. Therefore, although we used a Form 4 exercise book, our sample students are from Form 5. The details of the 30 multiple-choice questions are explained in the planning and assembling stages.

2.3 PROCEDURES

2.3.1 Planning Stage

Before constructing the test questions, we met to discuss which subject to measure for this project. After a few discussions among the group members, we agreed on English for Science and Technology (EST). We then went to Sek. Men. Keb. Gajah Berang to see the EST teacher and find out the syllabus of the subject and how many topics had already been taught, because a test should measure what has been taught by the teacher. This information guided the construction of the test. Once we had analyzed the information gathered, we developed our table of specification, which served as our guideline for making sure that the test content was closely related to the classroom curriculum and educational objectives. The table of specification is crucial because it helps us determine the major content areas to be covered in the test. These content areas are derived by carefully reviewing the educational objectives and selecting the major content to be included, so that the test measures different levels of Bloom’s Taxonomy of Educational Objectives. Referring to the table of specification therefore ensures as wide a sampling of the potential content as possible. To get a general idea of how an EST test should look, we went through several EST past-year papers and discussed closely with the teacher which questions were suitable for our test paper, to support its reliability and validity. After these steps, we constructed the questions and gave the test to the teacher to check its accuracy and whether it measures what it is supposed to measure for the subject.

2.3.2 Assembling Stage

The test paper consists of 30 multiple-choice questions to be completed in 1 hour. Question 1 comes from topic 9 in the syllabus, Man and Human Body, and measures the comprehension level of Bloom’s Taxonomy. Question 2 is from topic 5, Natural Resources and Industrial Process, and also measures the comprehension level. Question 3 also comes under topic 5, but it measures the lowest level of Bloom’s Taxonomy, Knowledge. Question 4 is set on topic 15, The Universe, Astronomy and Aerospace, and measures the comprehension level. Question 5 tests the students’ comprehension level on topic 3, Natural Resources.

Question 6 comes under topic 6, Matter & Mass, and also measures the comprehension level of Bloom’s Taxonomy. Question 7 measures the comprehension level under topic 8, The Human Body. Topic 13, Technology and Communication, is set for question 8, which measures the knowledge level. Question 9 measures the knowledge level under topic 9, Man & Human Body. Questions 10 and 11 also come under topic 9 and measure the comprehension level. The same goes for questions 12 and 13, which measure the comprehension level under topic 10, Man & Living Organism.

Topic 11, Nutrition & Food, is covered in question 14, which measures the comprehension level of Bloom’s Taxonomy. The knowledge level, the lowest level of Bloom’s Taxonomy, is measured in question 15, under topic 16, The Universe, Astronomy and Aerospace. Question 16 comes under topic 7, Force & Motion, and measures the comprehension level. Question 17 comes under topic 6, Matter & Mass, and measures the application level. Questions 18, 19 and 20 all come under the same topic 6, Matter & Mass: the comprehension level is tested in questions 18 and 19, while the knowledge level is measured in question 20.

Topic 10, Man & Living Organism, is set in question 21, which measures the knowledge level of Bloom’s Taxonomy. Questions 22 and 23 cover topic 8, The Human Body, and both measure the comprehension level. Topic 10, Man & Living Organism, is placed again in questions 24 and 25: question 24 tests the analysis level, whereas question 25 measures the comprehension level. Topic 11, Nutrition & Food, is covered in questions 26 to 30, all of which measure the comprehension level.
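The question-by-question description above can be cross-checked against the table of specification in Section 2.2 (six Knowledge, twenty-two Comprehension, one Application and one Analysis item). The following is a minimal sketch of that tally. It assumes that every question not named as Knowledge, Application or Analysis in this section is a Comprehension item, which includes question 11; the names `LEVELS`, `level_of` and `tally` are illustrative only.

```python
from collections import Counter

# Questions named in Section 2.3.2 as measuring a level other than Comprehension.
LEVELS = {
    "Knowledge": [3, 8, 9, 15, 20, 21],
    "Application": [17],
    "Analysis": [24],
}

# Assumption: all remaining questions (1 to 30) measure Comprehension.
level_of = {question: "Comprehension" for question in range(1, 31)}
for level, questions in LEVELS.items():
    for question in questions:
        level_of[question] = level

# Count how many questions fall under each cognitive level.
tally = Counter(level_of.values())
for level in ("Knowledge", "Comprehension", "Application", "Analysis"):
    print(f"{level:<14}{tally[level]:>3}")
```

Under these assumptions the four counts add up to the 30 items of the test.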

2.3.3 Administering

When the test was ready, all 30 multiple-choice questions were given to the students. First, we had to make sure that the Form 5 Science students of Sekolah Menengah Kebangsaan Gajah Berang were ready for the test. The following steps helped to prepare the students psychologically for the test.

First, we maintained a positive attitude. We went to the school a week before we administered the test. Letting the students know that there would be a test the following week encourages a positive test-taking attitude. It helps to keep the main purposes of classroom testing in mind: to evaluate achievement and instructional procedures, and to provide feedback to us and to the students. In this way, testing traps can be avoided and a positive test-taking atmosphere maintained. Second, we maximized the achievement aspect of the test. We encouraged the students to do their best in the test and not to be immobilized by fear. A test is something to be taken seriously, and this should be clear to the class.

On the practical side, we went to the school to announce the test a week in advance. Such preparation avoids surprises and gives the students sufficient advance notice. This is not to say that the teacher should avoid frequent quizzes. When students are tested frequently, study takes place at more regular intervals rather than only the night before a test. Letting the students know about the test late would affect their performance, so the results would not reflect their true achievement.

In the classroom, before distributing the test papers, we informed the students about the time limit, the restroom policy and some of our special considerations. It is important to explain the rules first because students often fiddle with the test papers once they receive them and may miss important instructions. We distributed the papers from left to right because distributing them in this way prevents any student from getting the last paper in the class.

After distributing the papers, we reminded the students to check their copies. The items to be checked are the page numbers, the questions and the answer key, and whether they have received the correct paper. We then let the test begin and monitored the time limit.

We monitored the students while they answered the test to make sure that they were not copying each other’s answers. During monitoring, we also reminded the students not to cheat and that there are punishments for cheating. Preventing cheating gives us precise results of the students’ performance, which we can then evaluate correctly.

2.3.4 Scoring and Grading

Scoring is one of the important parts of evaluating students’ performance. We administered 30 multiple-choice questions to 20 students of SMK Gajah Berang. The questions test the subject English for Science & Technology (EST), and the sample students, prepared by the school administrator, are of mixed abilities. They come from four different science classes: 5 Science 1, 5 Science 2, 5 Science 3 and 5 Science 4. After collecting the test papers from the students, we scored them. Several steps are required in scoring and grading each student.

First, preparing the answer key is the most important step in determining students’ scores. Without an answer key, scoring is difficult and may be unreliable. An answer key saves time during scoring and also helps to identify whether any questions need to be eliminated. While constructing the answer key, the test constructors can also judge whether the time allowed is enough for the students. Our 30 multiple-choice questions for English for Science and Technology are appropriate for the time limit of 1 hour.

During scoring, we sat together and checked each other’s answer keys in order to identify possible alternative answers and potential problems. Since we did not know the students and their teacher provided little information about their background and performance, we were not influenced by the halo effect, and their marks were not affected by it. We did not return the tests to the students because we needed to compile them in our final report.

After scoring the tests accurately, the next step is grading the results. Analyzing the test helps to determine whether the test is valid. No test that teachers construct for their students will be perfect; it will include inappropriate or otherwise deficient items. Thus, in the grading stage, a technique called item analysis is very important. Item analysis is used to identify items that are deficient in some way, for example through miskeying, guessing or ambiguity.

Based on the results of the test, most of the questions function well. They are clear enough for students to understand. The distracters generally function well, and the questions are not difficult for the upper 10 students to answer. As informed by the teacher, the students had already covered the entire syllabus in the textbook in the previous year, so guessing is less likely to occur. Unfortunately, several questions are miskeyed, characterized by guessing, or ambiguous.

Miskeying is indicated when most students who did well on the test choose a wrong answer (distracter) rather than the keyed answer. In question 2, most of the upper-group students chose distracters (A, B) rather than the keyed answer (C):

Question 2.

              A    B    C*   D
Upper half    4    5    1    1

Questions 16 and 19 are also miskeyed. Most of the students in the upper group chose a distracter (B in question 16, A in question 19) rather than the keyed answer (A and C, respectively):

Question 16.

              A*   B    C    D
Upper half    1    9    0    0

Question 19.

              A    B    C*   D
Upper half    6    0    3    1

In these cases, the key does not discriminate positively, while the distracters attract the students in the upper half and so discriminate positively. Revision is necessary and, if possible, the items should be eliminated.
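The miskeying check described above can be expressed as a small routine. This is a minimal sketch, not the procedure actually used in the project: it flags an item when any distracter is chosen by more upper-half students than the keyed answer. The option counts are copied from the tables for questions 2, 16 and 19; the names `ITEMS` and `possibly_miskeyed` are hypothetical.

```python
# Upper-half option counts for questions 2, 16 and 19; "key" is the keyed answer.
ITEMS = {
    2: {"key": "C", "upper": {"A": 4, "B": 5, "C": 1, "D": 1}},
    16: {"key": "A", "upper": {"A": 1, "B": 9, "C": 0, "D": 0}},
    19: {"key": "C", "upper": {"A": 6, "B": 0, "C": 3, "D": 1}},
}


def possibly_miskeyed(item):
    """Return the distracters chosen by more upper-half students than the key."""
    key_count = item["upper"][item["key"]]
    return [
        option
        for option, count in item["upper"].items()
        if option != item["key"] and count > key_count
    ]


flagged = {number: possibly_miskeyed(item) for number, item in ITEMS.items()}
for number, distracters in flagged.items():
    print(f"Question {number}: check key against distracter(s) {distracters}")
```

A flagged item is only a candidate for miskeying; the answer key still has to be checked by hand before the item is revised or eliminated.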

We characterized one question as guessing. Guessing is most likely to occur when the item is (a) not covered in class, (b) so difficult that even the upper-half students have no idea what the correct answer is, or (c) so trivial that students are unable to choose from among the options provided. In question 22, the question is not clear enough and the distracters do not function well:

Question 22.

              A*   B    C    D
Upper half    5    2    1    2

The question should be revised or eliminated. The keyed option (A) is not clear enough, and the upper-half responses are spread across all the options.

Question 27 is ambiguous. Distracter D does not function well because it attracts the same number of upper-half students as the correct answer. In an ambiguous item, the students who do well on the test but miss the item are drawn almost entirely to one of the distracters. Thus the item should be revised.

Question 27.

              A*   B    C    D
Upper half    4    1    0    4

3.0 RESULTS

3.1 Frequency Table & Histogram

Class Intervals    Tally              Frequency

24 – 26            ///                3

21 – 23            ///// ///// ///    13

18 – 20            ///                3

15 – 17            /                  1

Table 3.1: Frequency Distribution of Students’ Scores
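The frequency distribution can be reproduced from the raw scores. The sketch below assumes the 20 raw scores listed in Table 3.4 and class intervals of width 3 starting at 15; the names `SCORES`, `class_interval` and `frequency` are illustrative only.

```python
from collections import Counter

# Raw scores of the 20 students, as listed in Table 3.4.
SCORES = [26, 25, 24, 23, 23, 23, 23, 23, 22, 22,
          22, 22, 21, 21, 21, 21, 20, 19, 18, 17]

LOWEST = 15  # lower limit of the lowest class interval
WIDTH = 3    # number of score values in each class interval


def class_interval(score):
    """Return the (lower, upper) limits of the interval containing the score."""
    lower = LOWEST + ((score - LOWEST) // WIDTH) * WIDTH
    return (lower, lower + WIDTH - 1)


frequency = Counter(class_interval(score) for score in SCORES)
for lower, upper in sorted(frequency, reverse=True):
    count = frequency[(lower, upper)]
    print(f"{lower} - {upper}   {'/' * count:<14}{count}")
```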

[Figure: Histogram showing the distribution of scores of Form 5 students of SMK Gajah Berang in the EST test of 30 MCQs. Vertical axis: No. of students / Frequency (0 to 20). Horizontal axis: Scores, in class intervals from 12 – 14 to 27 – 29.]

3.2 Measures of Central Tendency

3.2.1 Mean

21.8

3.2.2 Median

22

3.2.3 Mode

23
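The three measures of central tendency can be checked with Python's standard `statistics` module. This is a minimal sketch that assumes the raw scores listed in Table 3.4; `multimode` is used so that a tie between modes, if any, would be visible.

```python
import statistics

# Raw scores of the 20 students, as listed in Table 3.4.
SCORES = [26, 25, 24, 23, 23, 23, 23, 23, 22, 22,
          22, 22, 21, 21, 21, 21, 20, 19, 18, 17]

mean = statistics.mean(SCORES)        # sum of scores divided by number of scores
median = statistics.median(SCORES)    # middle value of the sorted scores
modes = statistics.multimode(SCORES)  # list of the most frequent score(s)

print(f"Mean:   {mean:.1f}")
print(f"Median: {median}")
print(f"Mode:   {modes}")
```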

3.3 Measures of Dispersion/Variability

3.3.1 Range

9

3.3.2 Variance

4.8

3.3.3 Standard Deviation

2.19

3.4 Z-score and T-score

Z-score = (X - X̄) / SD

T-score = 10Z + 50

No.   X     X - X̄    Z-score   T-score

1     26     4.2      1.92     69.2
2     25     3.2      1.46     64.6
3     24     2.2      1.00     60.0
4     23     1.2      0.55     55.5
5     23     1.2      0.55     55.5
6     23     1.2      0.55     55.5
7     23     1.2      0.55     55.5
8     23     1.2      0.55     55.5
9     22     0.2      0.09     50.9
10    22     0.2      0.09     50.9
11    22     0.2      0.09     50.9
12    22     0.2      0.09     50.9
13    21    -0.8     -0.37     46.3
14    21    -0.8     -0.37     46.3
15    21    -0.8     -0.37     46.3
16    21    -0.8     -0.37     46.3
17    20    -1.8     -0.82     41.8
18    19    -2.8     -1.28     37.2
19    18    -3.8     -1.74     32.6
20    17    -4.8     -2.19     28.1

Table 3.4: Table showing Z-score and T-score of the subject EST for 20 students in SMK GAJAH BERANG
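The Z-scores and T-scores in Table 3.4 follow directly from the two formulas. The sketch below makes two assumptions inferred from the tabulated values rather than stated in the report: the standard deviation is used in its rounded form (2.19), and each Z-score is rounded to two decimal places before the T-score is computed. The function names are illustrative only.

```python
MEAN = 21.8  # mean reported in Section 3.2.1
SD = 2.19    # standard deviation reported in Section 3.3.3 (rounded value)

# Distinct raw scores appearing in Table 3.4.
RAW_SCORES = [26, 25, 24, 23, 22, 21, 20, 19, 18, 17]


def z_score(raw, mean=MEAN, sd=SD):
    """Number of standard deviations the raw score lies above or below the mean."""
    return round((raw - mean) / sd, 2)


def t_score(z):
    """Standard score with a mean of 50 and a standard deviation of 10."""
    return round(10 * z + 50, 1)


for raw in RAW_SCORES:
    z = z_score(raw)
    print(f"X = {raw}   Z = {z:5.2f}   T = {t_score(z):4.1f}")
```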

3.5 Item Analysis

3.5.1 Item Difficulty, P

P = (No. of students choosing the correct answer) / (Total no. of students)

Table 3.5.1: Item analysis and distracters analysis of 30 multiple-choice questions of EST subject (item difficulty)

No.   p       No.   p       No.   p

1     0.55    11    0.55    21    0.80
2     0.10    12    1.00    22    0.40
3     0.75    13    1.00    23    0.50
4     0.90    14    1.00    24    0.90
5     0.95    15    0.95    25    0.90
6     0.80    16    0.25    26    1.00
7     0.35    17    1.00    27    0.20
8     0.70    18    0.85    28    0.90
9     1.00    19    0.30    29    0.85
10    0.55    20    0.80    30    0.90

The difficulty of a test item that is scored right or wrong is indicated by the fraction of students who get the item right. Twenty questions fall into the easy category, with p of 0.70 and above: questions 3, 4, 5, 6, 8, 9, 12, 13, 14, 15, 17, 18, 20, 21, 24, 25, 26, 28, 29 and 30. Seven questions fall into the moderate difficulty category, which ranges from 0.30 to 0.69. Only three questions fall into the difficult category, with an item difficulty of 0.29 or less: questions 2, 16 and 27.
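The grouping of items by difficulty can be derived mechanically from the p values in Table 3.5.1. This is a minimal sketch using the category limits given in the text; treating a p of exactly 0.70 as easy is an assumption, made because question 8 (p = 0.70) is listed among the easy items. The names `P_VALUES`, `difficulty_category` and `by_category` are hypothetical.

```python
# Item difficulty (p) for questions 1 to 30, copied from Table 3.5.1.
P_VALUES = [0.55, 0.10, 0.75, 0.90, 0.95, 0.80, 0.35, 0.70, 1.00, 0.55,
            0.55, 1.00, 1.00, 1.00, 0.95, 0.25, 1.00, 0.85, 0.30, 0.80,
            0.80, 0.40, 0.50, 0.90, 0.90, 1.00, 0.20, 0.90, 0.85, 0.90]


def difficulty_category(p):
    """Classify an item by the fraction of students answering it correctly."""
    if p >= 0.70:   # assumption: the boundary value 0.70 counts as easy
        return "easy"
    if p >= 0.30:
        return "moderate"
    return "difficult"


by_category = {"easy": [], "moderate": [], "difficult": []}
for number, p in enumerate(P_VALUES, start=1):
    by_category[difficulty_category(p)].append(number)

for category, questions in by_category.items():
    print(f"{category:<10}{len(questions):>3}  {questions}")
```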

3.5.2 Item Discrimination, D

D = [(No. of students in the upper group who chose the correct answer) - (No. of students in the lower group who chose the correct answer)] / (No. of students in each group)

Table 3.5.2: Table showing item discrimination of the 30 EST questions distributed to 20 students of SMK GAJAH BERANG

No.   d        No.   d        No.   d

1     0.30     11    0.10     21    0.40
2     0.00     12    0.00     22    0.20
3     0.30     13    0.00     23    0.20
4     0.00     14    0.00     24    0.20
5     0.10     15   -0.90     25    0.20
6     0.00     16   -0.30     26    0.00
7     0.10     17    0.00     27    0.30
8     0.40     18    0.30     28    0.20
9     0.00     19    0.00     29   -0.10
10    0.70     20    0.00     30    0.00

Fifteen questions have no discrimination at all (values of 0.00 or negative): questions 2, 4, 6, 9, 12, 13, 14, 15, 16, 17, 19, 20, 26, 29 and 30. Nine questions fall into the moderate discrimination category, which ranges from 0.20 to 0.39: questions 1, 3, 18, 22, 23, 24, 25, 27 and 28. Low discrimination, from 0.10 to 0.19, was found in questions 5, 7 and 11. Lastly, only three questions have high discrimination, 0.40 and above: questions 8, 10 and 21.
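The discrimination index and its categories can be sketched as follows. The category limits are those given in the text, with 0.40 itself counted as high because question 8 (d = 0.40) is listed as a high-discrimination item. In the worked example for question 16, the upper-group count (1) comes from the miskeying table in Section 2.3.4, while the lower-group count (4) is an inference from p = 0.25, which implies 5 correct answers out of 20. The function and variable names are hypothetical.

```python
# Item discrimination (d) for questions 1 to 30, copied from Table 3.5.2.
D_VALUES = [0.30, 0.00, 0.30, 0.00, 0.10, 0.00, 0.10, 0.40, 0.00, 0.70,
            0.10, 0.00, 0.00, 0.00, -0.90, -0.30, 0.00, 0.30, 0.00, 0.00,
            0.40, 0.20, 0.20, 0.20, 0.20, 0.00, 0.30, 0.20, -0.10, 0.00]


def discrimination(upper_correct, lower_correct, group_size):
    """Difference in correct answers between groups, per student in a group."""
    return (upper_correct - lower_correct) / group_size


def discrimination_category(d):
    """Classify an item using the ranges described in Section 3.5.2."""
    if d >= 0.40:
        return "high"
    if d >= 0.20:
        return "moderate"
    if d >= 0.10:
        return "low"
    return "none"  # zero or negative discrimination


by_category = {"high": [], "moderate": [], "low": [], "none": []}
for number, d in enumerate(D_VALUES, start=1):
    by_category[discrimination_category(d)].append(number)

# Question 16: 1 correct in the upper group, 4 (inferred) in the lower group.
print(discrimination(upper_correct=1, lower_correct=4, group_size=10))
for category, questions in by_category.items():
    print(f"{category:<9}{len(questions):>3}  {questions}")
```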

4.0 DISCUSSION

4.1 Histogram

[Figure: negatively skewed distribution of scores, horizontal axis from 15 to 30, with the mean (X̄), median (mdn) and mode marked.]

Mean (X̄): 21.8   Median: 22   Mode: 23

Based on the graph, the distribution is negatively skewed: the mean has the lowest value, 21.8, the median lies in the middle at 22, and the mode is the highest at 23. The negatively skewed distribution indicates that the class did very well in the test, with the majority obtaining high scores and only a few obtaining lower scores. Most of the students scored between 21 and 23, meaning that they can be classed as good students and as a homogeneous group, with almost the same ability as one another. There could be many reasons for this. The test may have been too easy because the question types were familiar: all of the students had covered all the topics and done many past-year papers and exercises in the classroom, so similar questions may have appeared in work they had already done. Moreover, the students are academically strong, since they all come from the science stream, where placement is based on performance and achievement in school.

4.2 Measures of Central Tendency

The mean is the average score of the students in the test, here 21.8. We can see from the graph that most of the students scored close to this average, which indicates that almost all of them did well. The mean has several characteristics that make it the most frequently used measure of central tendency. One of these characteristics is stability: since every score in the distribution enters into the computation of the mean, it is more stable over time than other measures of central tendency, which consider only one or two scores. Another characteristic is that the sum of each score’s deviation from the mean is equal to zero. A third characteristic is that the mean is affected by extreme scores. This means that a few very low scores of 20 or below in the negatively skewed distribution pull the mean down toward them. Thus, the mean gives the impression that the typical student scored about 21 and passed the test with an A grade, while the students below the mean still passed the test, but with a C grade.

The median is the score that splits a distribution in half: 50 percent of the scores lie above the median and 50 percent lie below it. It can also be described as the middle score, since it falls in the middle of the score distribution. The distribution shows that 50 percent of the students scored 22 and above on the test, which means that half of the students passed the test with an A grade, while the other half fall between the A and B grades.

The mode is the most frequently occurring score in the distribution. Based on the graph, the distribution is unimodal, which means that only one score, 23, occurs most frequently among the students’ scores. The mode also indicates that many students scored highly in the test with an A grade. We can therefore conclude that most of the students in this class are good students and that most of them passed the test with an A grade.

4.3 Measures of Dispersion/Variability

The purpose of measures of variability is to show how the scores are spread around the mean. This is important because the measures of variability help to determine in which group the majority of the students fall, the good or the weak.

Range is the simplest measure of variability, calculated by subtracting the lowest score from the highest score. The range provides a quick estimate of variability but is undependable because it is based on only the two extreme scores. The addition or removal of a single score can change the range significantly. In our data, arranging the students’ scores from highest to lowest gives scores from 26 out of 30 down to 17 out of 30, so the range is 26 - 17 = 9.

Standard deviation (SD) is the most useful measure of variability. Its calculation does not make its meaning readily apparent, but essentially it is an average of the degree to which a set of scores deviates from the mean. The procedure involves squaring each deviation from the mean, averaging the squares and taking a square root, so a scientific calculator is helpful.

To calculate the standard deviation, first calculate the mean. Next, subtract the mean from each raw score, from the highest score to the lowest. Then square each of these differences and add the squares together. Divide the sum by the number of scores minus one to obtain the variance, and finally take the square root of the variance. Here is the formula of standard deviation:

SD = √( Σ(X - X̄)² / (N - 1) )

The variance of the test scores is 4.80 and the standard deviation is 2.19. A correct calculation of the standard deviation helps in evaluating the students’ performance on the test. The standard deviation is also needed to calculate the Z-score and the T-score.
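The steps for the range, variance and standard deviation can be followed one at a time in code. This is a minimal sketch assuming the raw scores listed in Table 3.4 and a divisor of N - 1, which is the choice that reproduces the reported variance of 4.80; dividing by N instead would give a smaller value.

```python
import math

# Raw scores of the 20 students, as listed in Table 3.4.
SCORES = [26, 25, 24, 23, 23, 23, 23, 23, 22, 22,
          22, 22, 21, 21, 21, 21, 20, 19, 18, 17]

n = len(SCORES)
score_range = max(SCORES) - min(SCORES)  # highest score minus lowest score

mean = sum(SCORES) / n                                     # step 1: the mean
deviations = [score - mean for score in SCORES]            # step 2: X minus mean
sum_of_squares = sum(deviation ** 2 for deviation in deviations)  # step 3
variance = sum_of_squares / (n - 1)                        # step 4: divide by N - 1
sd = math.sqrt(variance)                                   # step 5: square root

print(f"Range:              {score_range}")
print(f"Sum of squares:     {sum_of_squares:.1f}")
print(f"Variance:           {variance:.2f}")
print(f"Standard deviation: {sd:.2f}")
```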

4.4 Z-score and T-score

The Z-score is the simplest of the standard scores. It expresses test performance simply and directly as the number of standard deviation units by which a raw score lies above or below the mean.

A Z-score is always positive when the raw score is higher than the mean. In our test, 12 students have positive Z-scores: 1.92, 1.46 and 1.00, five students with 0.55, and four with 0.09. In contrast, a Z-score is always negative when the raw score is lower than the mean. The lowest Z-score is -2.19, followed by -1.74, -1.28 and -0.82, and four students share the same result, -0.37. Forgetting the negative sign (-) can cause serious errors in test interpretation. For this reason, Z-scores are seldom used directly in test norms but are usually transformed into a standard score system that uses only positive numbers, such as the T-score.

The term T-score has come to refer to any set of normally distributed standard scores that has a mean of 50 and a standard deviation of 10. A T-score is obtained by multiplying the Z-score by 10 and adding 50.

One reason that the T-score is preferable to the Z-score for reporting test results is that only positive numbers are produced. The T-scores of the students of Sekolah Menengah Kebangsaan Gajah Berang were calculated and listed from the highest to the lowest.

4.5 ITEM ANALYSIS AND DISTRACTER ANALYSIS

Question 1

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.55 shows that the item is appropriate for the testing application, which is analysis. The question asks students to conclude what the short passage is about.

2. Does the item discriminate adequately?

With an item discrimination of 0.3, the item does a satisfactory job of discriminating between examinees who performed well on the test and those who performed poorly.

3. Are the distracters performing adequately?

Option A does not function, as it does not attract any of the students. Option B is a weak distracter because it attracts one student from the upper group and none from the lower group. Option C is a good distracter because it attracts more of the weak students than the good students.

4. Overall evaluation

This item needs to be checked and revised because of the weaknesses in the distracters mentioned above.

Question 2

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.1 shows that the item is difficult. However, it is appropriate for the synthesis application. The item asks students to come up with a motto from what they have read.

2. Does the item discriminate adequately?

There is no discrimination for this item because the same number of students in the upper and lower groups chose the correct answer.

3. Are the distracters performing adequately?

Option A is a good distracter because it attracts more of the lower-group students. Option B is a weak distracter because it attracts more of the upper-group students. Option D is a non-functioning distracter because it does not attract any students from either group.

4. Overall evaluation

This item is eliminated because it cannot discriminate between those who performed well and those who performed poorly in the test.

Question 3

1. Is the item difficulty level appropriate for the testing application?

A p of 0.75 shows that the item is easy, as it only tests the students’ knowledge when they read the passage.

2. Does the item discriminate adequately?

With an item discrimination of 0.3, the item does a satisfactory job of discriminating between examinees who performed well on the test and those who performed poorly.

3. Are the distracters performing adequately?

Option A is functioning well. Option C is a non-functioning distracter, while option D is a good distracter.

4. Overall evaluation

This item needs to be checked and revised because of the weaknesses in the distracters mentioned above and because its discriminating power is only satisfactory.

Question 4

1. Is the item difficulty level appropriate for the testing application?

The item difficulty is 0.9, which shows that the item is too easy. The optimal mean p value for a multiple-choice item with four choices is 0.74.

2. Does the item discriminate adequately?

This item has no discrimination.

3. Are the distracters performing adequately?

Options A and C are weak distracters. Distracter D is non-functioning.

4. Overall evaluation

This item needs to be eliminated because it has no discrimination. It is not effective in testing students’ performance.

Question 5

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.95 suggests that this item is too easy.

2. Does the item discriminate adequately?

The item discrimination of 0.1 indicates that the item has low discrimination.

3. Are the distracters performing adequately?

None of the distracters performs adequately: option A is weak, and options B and C are non-functioning.

4. Overall evaluation

This item is eliminated or rewritten with improved distracters.

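Question 6

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.8 shows that this item is easy.

2. Does the item discriminate adequately?

The item does not discriminate adequately because its discrimination value is 0.

3. Are the distracters performing adequately?

None of the distracters performs adequately: options A and B are non-functioning, while option C is a weak distracter.

4. Overall evaluation

This item is eliminated.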

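Question 7

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.35 shows that this item has moderate difficulty.

2. Does the item discriminate adequately?

The D value of 0.1 suggests that this item has low discrimination.

3. Are the distracters performing adequately?

Distracters A and B are performing adequately, while distracter C is not.

4. Overall evaluation

This item is checked and revised.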

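Question 8

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.7 shows that this item is easy.

2. Does the item discriminate adequately?

The D value of 0.4 suggests that this item has high discrimination.

3. Are the distracters performing adequately?

Some of the distracters are not performing adequately: options A and B are non-functioning, and only option C works as a good distracter.

4. Overall evaluation

This item is retained. However, the distracters need to be improved.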

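Question 9

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 1 shows that this item is too easy.

2. Does the item discriminate adequately?

The item does not discriminate adequately because its discrimination value is 0.

3. Are the distracters performing adequately?

None of the distracters performs adequately because all of the students chose the correct answer.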



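Question 10

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.55 shows that this item has moderate difficulty.

2. Does the item discriminate adequately?

The item discriminates adequately because its discrimination value is 0.7.

3. Are the distracters performing adequately?

The distracters are performing adequately, except for distracter C.

4. Overall evaluation

This item is retained, but distracter C needs to be improved.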

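Question 11

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.55 suggests that this item has moderate difficulty.

2. Does the item discriminate adequately?

This item has a discrimination value of 0.1, which is low.

3. Are the distracters performing adequately?

Distracters A and C are not performing adequately.

4. Overall evaluation

This item is eliminated or rewritten.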

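Question 12

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 1 suggests that this item is too easy.

2. Does the item discriminate adequately?

The item discrimination of 0 indicates that the item has no discrimination.

3. Are the distracters performing adequately?

None of the distracters performs adequately because all of the students answered correctly.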



Question 13









Question 14









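Question 15

1. Is the item difficulty level appropriate for the testing application?

The item difficulty of 0.95 suggests that this item is too easy.

2. Does the item discriminate adequately?

The negative item discrimination of -0.9 indicates that the item does not discriminate; it favours the lower group.

3. Are the distracters performing adequately?

None of the distracters performs adequately because all of the students in the lower group answered correctly.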



Question 16


The item difficulty of 0.25 suggests that this item is difficult.


The negative item discrimination of -0.3 indicates that the item has no discrimination.


The distracters are not performing adequately, except for distracter D.



Question 17









Question 18


The item difficulty is 0.85, which shows that the item is easy.


This item has moderate discrimination (0.3).


Not all of the distracters are performing adequately: option B is non-functioning and D is a weak distracter, with only option C being a good distracter.


This item needs to be checked and revised.

Question 19


The item difficulty of 0.3 shows that the item has moderate difficulty.


This item has no discrimination (0).


The distracters are not performing adequately: A is a weak distracter and B is non-functioning, while D works as a good distracter.



Question 20




This item has a discrimination value of zero.


Not all of the distracters are performing adequately: A and D are weak distracters, while C is non-functioning.



Question 21






Not all of the distracters are performing adequately: option B is non-functioning and D is a weak distracter, with only option C being a good distracter.



Question 22


Item difficulty of 0.4 shows that the item has moderate difficulty




Distracters B and D are not performing adequately because each attracts the same number of students from both the upper and lower groups.



Question 23


The item difficulty is 0.5, which shows that the item has moderate difficulty.




Distracters C and D are not performing adequately. Distracter A is performing adequately because it attracts more students from the lower group.

This item needs to be checked and revised.

Question 24


The item difficulty is 0.9, which shows that the item is easy.




Only option A performs adequately. C and D are non-functioning distracters.



Question 25


The item difficulty of 0.9 shows that the item is too easy.




None of the distracters performs adequately: option B is non-functioning, while C and D are weak distracters.



Question 26









Question 27









Question 28


The item difficulty of 0.9 suggests that this item is easy.


The item discrimination of 0.2 indicates that the item has moderate discrimination.


None of the distracters performs adequately because all of them are weak.


This item needs to be checked and revised. The distracters need to be changed or rewritten.

Question 29


The item difficulty of 0.85 suggests that this item is easy.


The item discrimination of -0.1 indicates that the item has no discrimination.


None of the distracters performs adequately: option A is a weak distracter, while B and D are non-functioning.



Question 30


The item difficulty of 0.9 suggests that this item is too easy.




None of the distracters performs adequately: A and B are weak distracters, while D is non-functioning.
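
The distracter judgements in this section rest on a rule stated for Questions 22 and 23: a distracter performs adequately when it attracts more students from the lower group than from the upper group, and it does not when it attracts the same number from both. A minimal sketch of that check is given below. The function name, the example counts, and the reading of "non-functioning" as a distracter chosen by nobody are assumptions for illustration; the "weak" label is not modelled because its cut-off is not stated here.

```python
def judge_distracter(upper_count, lower_count):
    """Classify one distracter from the number of upper-group and
    lower-group students who chose it (hypothetical helper)."""
    if upper_count == 0 and lower_count == 0:
        # Assumption: a distracter that nobody chose is non-functioning.
        return "non-functioning"
    if lower_count > upper_count:
        # Rule used for Question 23: more lower-group students attracted.
        return "performing adequately"
    # Rule used for Question 22: equal (or more upper-group) choices.
    return "not performing adequately"


# Hypothetical counts for the three distracters of one item.
example_counts = {"A": (1, 4), "B": (2, 2), "C": (0, 0)}
for option, (upper, lower) in example_counts.items():
    print(option, judge_distracter(upper, lower))
```
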



4.6 LIMITATIONS

4.6.1 Inexperience

Our inexperience in writing good questions was one of the limitations we faced. The teacher pointed this out during our follow-up visits to the school. Puan Ee advised that we should not extract the questions from a single exercise book; if we want to extract questions, we should draw them from several materials.

4.6.2 Lack of Reliable Materials

Much to our mortification, many of the books on the market are not reliable. For example, of the questions that we extracted, only three items can be retained. The others need to be improved or eliminated.

4.6.3 Students’ lack of preparation

This was evident when the students were quite shocked to be told that they had to sit for a test. Lack of preparation can affect the students' performance. Some of the students also told us that they took the test lightly and simply guessed or peeked at other students' answers.

Apart from that, when we photocopied the question paper we did not notice that the answers to questions 11, 12 and 13 were already marked on it. This was due to our carelessness at the planning stage: while we were going through the questions to work out the answers, one of the members ticked them on the paper.

5.0 CONCLUSION

In conclusion, after conducting this research, we found that we have gained a lot of meaningful information for our future use. The most crucial and hardest part is planning the questions, because there are many things to consider at this stage: the format of the questions to be constructed, whether the questions strictly follow Bloom's Taxonomy of Educational Objectives, and the level of student proficiency to be tested. All of these are taken into consideration when constructing questions. After that comes the scoring and grading stage. At this stage, one must check the calculations carefully, because any missing number will affect the rest of the calculations. Questions chosen from an exercise book or revision book should be examined thoroughly before they are put in the final draft of the test. Miskeyed, skewed, or ambiguous questions are often found in exercise and revision books, so their number in the test should be minimized. It is recommended that teachers use their own item banks, if any. As we noted, our student sample came from a homogeneous group; the students possess similar abilities in terms of academic achievement. Finally, we have learnt many things in this course, especially things that we are going to apply in the future. We found that this course helps us to prepare for becoming teachers or educators later on.

Appendices

1. Syllabus
2. Table of Specifications
3. Sample 30 multiple-choice questions
4. Answer to the 30 multiple-choice questions
5. Sample multiple-choice score sheet
6. Marked score sheets (MCQ) of the students
7. Item bank questions
8. Calculation of Z-score and T-score
9. Calculations of item difficulty and item discrimination

4. Answer to the 30 multiple-choice questions

Section A

1. A    2. C    3. B    4. C    5. C    6. D    7. D    8. C    9. C    10. A
11. D   12. B   13. C   14. B   15. B   16. A   17. A   18. A   19. C   20. B
21. B   22. A   23. A   24. B   25. A   26. C   27. A   28. A   29. C   30. C

8. Calculation of Z-score and T-score

no   X    Z-score = (X - 21.8) / 2.19     T-score = 10(Z) + 50
1    26   (26 - 21.8) / 2.19 = 1.92       10(1.92) + 50 = 69.2
2    25   (25 - 21.8) / 2.19 = 1.46       10(1.46) + 50 = 64.6
3    24   (24 - 21.8) / 2.19 = 1          10(1) + 50 = 60
4    23   (23 - 21.8) / 2.19 = 0.55       10(0.55) + 50 = 55.5
5    23   (23 - 21.8) / 2.19 = 0.55       10(0.55) + 50 = 55.5
6    23   (23 - 21.8) / 2.19 = 0.55       10(0.55) + 50 = 55.5
7    23   (23 - 21.8) / 2.19 = 0.55       10(0.55) + 50 = 55.5
8    23   (23 - 21.8) / 2.19 = 0.55       10(0.55) + 50 = 55.5
9    22   (22 - 21.8) / 2.19 = 0.09       10(0.09) + 50 = 50.9
10   22   (22 - 21.8) / 2.19 = 0.09       10(0.09) + 50 = 50.9
11   22   (22 - 21.8) / 2.19 = 0.09       10(0.09) + 50 = 50.9
12   22   (22 - 21.8) / 2.19 = 0.09       10(0.09) + 50 = 50.9
13   21   (21 - 21.8) / 2.19 = -0.37      10(-0.37) + 50 = 46.3
14   21   (21 - 21.8) / 2.19 = -0.37      10(-0.37) + 50 = 46.3
15   21   (21 - 21.8) / 2.19 = -0.37      10(-0.37) + 50 = 46.3
16   21   (21 - 21.8) / 2.19 = -0.37      10(-0.37) + 50 = 46.3
17   20   (20 - 21.8) / 2.19 = -0.82      10(-0.82) + 50 = 41.8
18   19   (19 - 21.8) / 2.19 = -1.28      10(-1.28) + 50 = 37.2
19   18   (18 - 21.8) / 2.19 = -1.74      10(-1.74) + 50 = 32.6
20   17   (17 - 21.8) / 2.19 = -2.19      10(-2.19) + 50 = 28.1
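
The table above can be reproduced with a short script. This is a minimal sketch: it assumes that 2.19 is the sample standard deviation (n - 1 in the denominator), which is the reading consistent with the 20 raw scores, and it rounds the standard deviation and each Z-score to two decimal places before computing the T-score, as the worked rows do. The function name `z_and_t_scores` is illustrative only.

```python
from statistics import mean, stdev

# Raw scores (X) of the 20 students, as listed in the table.
scores = [26, 25, 24, 23, 23, 23, 23, 23, 22, 22,
          22, 22, 21, 21, 21, 21, 20, 19, 18, 17]


def z_and_t_scores(raw_scores):
    """Return (mean, sd, rows), where each row is (X, Z, T)."""
    m = mean(raw_scores)
    # Assumption: sample standard deviation, rounded to 2 d.p. like the table.
    sd = round(stdev(raw_scores), 2)
    rows = []
    for x in raw_scores:
        z = round((x - m) / sd, 2)   # Z = (X - mean) / SD
        t = round(10 * z + 50, 1)    # T = 10(Z) + 50
        rows.append((x, z, t))
    return m, sd, rows


m, sd, rows = z_and_t_scores(scores)
print(f"mean = {m}, sd = {sd}")
for x, z, t in rows:
    print(x, z, t)
```

Rounding Z before computing T mirrors the hand calculation; computing T from the unrounded Z can shift a value in the last decimal place.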

9. Calculations of item difficulty and item discrimination

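no   Item Difficulty, P         Item Discrimination, D
1    (7 + 4) / 20 = 0.55        (7 - 4) / 10 = 0.3
2    (1 + 1) / 20 = 0.1         (7 - 4) / 10 = 0.3
3    (9 + 6) / 20 = 0.75        (9 - 6) / 10 = 0.3
4    (9 + 9) / 20 = 0.9         (9 - 9) / 10 = 0
5    (10 + 9) / 20 = 0.95       (10 - 9) / 10 = 0.1
6    (8 + 8) / 20 = 0.8         (8 - 8) / 10 = 0
7    (4 + 3) / 20 = 0.35        (4 - 3) / 10 = 0.1
8    (9 + 5) / 20 = 0.7         (9 - 5) / 10 = 0.4
9    (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
10   (9 + 2) / 20 = 0.55        (9 - 2) / 10 = 0.7
11   (6 + 5) / 20 = 0.55        (6 - 5) / 10 = 0.1
12   (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
13   (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
14   (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
15   (9 + 10) / 20 = 0.95       (9 - 10) / 10 = -0.1
16   (1 + 4) / 20 = 0.25        (1 - 4) / 10 = -0.3
17   (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
18   (10 + 7) / 20 = 0.85       (10 - 7) / 10 = 0.3
19   (3 + 3) / 20 = 0.3         (3 - 3) / 10 = 0
20   (8 + 8) / 20 = 0.8         (8 - 8) / 10 = 0
21   (10 + 6) / 20 = 0.8        (10 - 6) / 10 = 0.4
22   (5 + 3) / 20 = 0.4         (5 - 3) / 10 = 0.2
23   (6 + 4) / 20 = 0.5         (6 - 4) / 10 = 0.2
24   (10 + 8) / 20 = 0.9        (10 - 8) / 10 = 0.2
25   (10 + 8) / 20 = 0.9        (10 - 8) / 10 = 0.2
26   (10 + 10) / 20 = 1         (10 - 10) / 10 = 0
27   (4 + 2) / 20 = 0.3         (4 - 2) / 10 = 0.2
28   (10 + 8) / 20 = 0.9        (10 - 8) / 10 = 0.2
29   (8 + 9) / 20 = 0.85        (8 - 9) / 10 = -0.1
30   (9 + 9) / 20 = 0.9         (9 - 9) / 10 = 0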
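
Each row above applies the same two formulas: P is the number of correct answers in the upper and lower groups combined, divided by 20, and D is the upper-group count minus the lower-group count, divided by 10. A minimal sketch of that calculation follows. The function name `item_stats` and its parameter names are illustrative, and the group size of 10 is taken from the denominators in the table.

```python
def item_stats(upper_correct, lower_correct, group_size=10):
    """Return (P, D) for one item from the number of correct answers
    in the upper and lower groups (each of `group_size` students)."""
    total_students = 2 * group_size
    p = (upper_correct + lower_correct) / total_students  # item difficulty
    d = (upper_correct - lower_correct) / group_size      # item discrimination
    return round(p, 2), round(d, 2)


# Counts for three items as they appear in the table above.
for item_no, (upper, lower) in {1: (7, 4), 10: (9, 2), 16: (1, 4)}.items():
    p, d = item_stats(upper, lower)
    print(f"item {item_no}: P = {p}, D = {d}")
```

A negative D, as for item 16, means that more students in the lower group than in the upper group answered the item correctly.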

UNIVERSITI TEKNOLOGI MARA

KAMPUS BANDARAYA MELAKA

Prepared for:

DR. DAVID LOH ER FUU

PRINCIPLES OF TESTING AND EVALUATION (TSL 480)

Prepared by:

NOOR IZZATI MUHAMAD NASIR

2007297688

NOOR ALINA NAMAMI

2007297732

MUHAMMAD NABIL MUSTAFA

2007297686

ADI FARHAN GHAZALI

2007297782

Student of B. Ed. TESL

Faculty of Education

UiTM KAMPUS BANDARAYA MELAKA

20th April 2009
