
CHAPTER – 4

STANDARDIZATION OF THE TEACHING

APTITUDE TEST

4.1 Introduction

4.2 Teaching Aptitude Test

4.2.1 Pre-Pilot Test

4.2.1.1 Administration of Pre-Pilot Test

4.2.1.2 Sample

4.2.1.3 Time Limit

4.2.1.4 Method of Scoring

4.2.1.5 Item Analysis

4.2.1.6 Discriminative power

4.2.1.7 Difficulty Index

4.2.2 Pilot Test

4.2.2.1 Selection of Test Items

4.2.2.2 Sample

4.2.2.3 Administration of Pilot Test

4.2.2.4 Element of Time

4.2.2.5 Scoring Scheme

4.2.2.6 Analysis and Interpretation

4.2.3 Final Test

4.2.3.1 Construction of Final Test


4.2.3.2 Final Test and Answer sheet

4.2.3.3 Sample for the Final Test

4.2.3.4 Administration of the Final Test

4.2.3.5 Time Limit

4.2.3.6 Scoring the Test

4.3 Establishment of Norms

4.3.1 Difference between the Mean Teaching Aptitude Scores of Male and Female B.Ed. Students

4.3.2 Deciding the Nature of the Distribution of Test Scores

4.3.3 Standard Score and T scores

4.3.4 Percentile Rank

4.4 Reliability of the Test

4.5 Validity of the Test

4.6 Conclusion


CHAPTER – 4

STANDARDIZATION OF THE TEACHING

APTITUDE TEST

4.1 Introduction :

This chapter describes in detail the standardization of the tool, the Teaching Aptitude Test. Its focal point is the reliability and validity of the standardized tool, since the fundamental purpose of standardizing a psychological test is to establish its reliability and validity at as high a level as possible. The techniques used to establish reliability and validity are discussed in this chapter, and the establishment of norms forms its second most important part. To standardize the test, the researcher administered the tool in three steps: Pre-Pilot Test, Pilot Test and Final Test. The administration and results of these three tests are discussed at the beginning of the chapter.

4.2 Teaching Aptitude Test

As discussed in Chapter 3, ‘Planning and Procedure of the Study’, the first two steps in constructing the teaching aptitude test were Job Analysis and Tentative Selection or Construction of Tests. The remaining three steps suggested by Garrett¹ are discussed in this chapter. They are as follows:

o EXPERIMENTAL TRYOUT OF THE TEST

o SETTING-UP DIRECTIONS FOR ADMINISTRATION AND SCORING; THE ESTABLISHMENT OF NORMS

o FOLLOW-UP STUDIES TO DETERMINE THE PREDICTIVE VALUE OF THE TEST BATTERY IN SELECTION AND IN VOCATIONAL GUIDANCE

o EXPERIMENTAL TRYOUT OF THE TEST

4.2.1 Pre-Pilot Test

The Pre-Pilot Test is considered a very important step in the standardization of a test. The Pre-Pilot Form of a test has certain basic objectives, such as:

o To identify weak or defective items in the pool.

o To indicate where improvement is needed.

o To detect ambiguity in an item and either improve the item, so that it becomes self-explanatory, or remove it from the test.

o To find out the efficiency of the distracters.

o To determine the discriminative power of each item, so that all selected items contribute to the central purpose of the finished test and together constitute an efficient measuring instrument.

o To determine the difficulty index of each item in order to arrange the items in sequence.

o To find out an appropriate time limit for the final form of the test.

o To study the efficiency of the instructions to examiners and examinees.

The try-out test was planned with these objectives in view.

4.2.1.1 Administration of Pre-Pilot Test

The Pre-Pilot Test was prepared on the basis of the teacher’s traits selected with the help of a five-point scale. Since this was the first step in the research, the researcher administered the test personally in four B.Ed. colleges in Gujarat State. This made it possible to detect ambiguity and other defects in the instructions for the test. The researcher also kept an eye on the time required by average respondents, to ensure that the test did not become too lengthy. Personal administration also helped in finding out whether the test items were ambiguous, too difficult or otherwise defective, since the proctor had established enough rapport to obtain free reactions from the respondents.

4.2.1.2 Sample :

The sample for the Pre-Pilot Test was drawn from four colleges of education in Gujarat State. The college-wise and sex-wise classification of the sample, which consisted of 224 respondents in total, is given in Table 4.1:

TABLE 4.1

College & Sex-wise Distribution of the Sample for Pre-Pilot Test

Sr. No.  Name of College                                                 Male  Female  Total

1        H. M. Patel Institute of Education, Research and Training,
         Vallabh Vidyanagar, Dist. Anand (Grant-in-aid)                    21      30     51

2        Sarvajanik College of Education, Godhra,
         Dist. Panchmahals (Government)                                    18      42     60

3        Way Made College of Education, Vallabh Vidyanagar,
         Dist. Anand (Self-financed)                                       16      37     53

4        Singwad B. Ed. College, Singwad, Dist. Dahod (Self-financed)      14      46     60

         Total:                                                            69     155    224

From Table 4.1 it can be seen that between 51 and 60 students were selected from each of the four colleges. The Pre-Pilot Test was administered in the first term, i.e. in the month of August of the academic year 2010-2011.

4.2.1.3 Time Limit

This test was a power test. Anastasi² defines a power test in the following words:

“A pure power test has a time limit long enough to permit everyone to attempt all items.”

According to Guilford³:

“A power test is often defined as one in which every examinee has a chance to attempt every item.”

Keeping these definitions in mind, no time limit was fixed for the Pre-Pilot Test. Respondents were informed that they would have as much time as they needed to complete the test.

Average respondents took at most two hours to attempt all 183 items of the test, i.e. approximately 40 seconds per item. This helped the researcher to estimate the total time limit for the final test.

4.2.1.4 Method of Scoring

As already discussed in the preceding pages, this test had 5 sub-tests, and the 5th sub-test was further divided into 3 sections. Whether a response to an item of a sub-test was correct was decided in consultation with, and under the guidance of, experts in the area of the concerned sub-test.

The scoring scheme followed by the researcher was as follows:

SUB-TEST 1: INNOVATION-RESEARCH IN EDUCATION

AND INTEREST & ATTITUDE TOWARDS TEACHING

This sub-test contains YES (Y), NEUTRAL (U), NO (N) type questions. A response in the keyed direction carried ‘2’ marks, the neutral choice (U) carried ‘1’ mark and ‘0’ mark was given for a response against the keyed direction. The scheme was as follows:

For each positive statement:

‘2’ marks for YES (Y)

‘1’ mark for Neutral (U)

‘0’ mark for NO (N)

For each negative statement:

‘2’ marks for NO (N)

‘1’ mark for Neutral (U)

‘0’ mark for YES (Y)

Each statement carried a maximum of ‘2’ marks and a minimum of ‘0’ mark. The score on this sub-test was obtained by dividing the total marks obtained (T) by 2, i.e. T/2.

There were 26 items in this sub-test, so the maximum possible total was 52 and the minimum was 0 (zero). Dividing the total by the constant 2 gave a maximum attainable score of 26 and a minimum of 0 (zero), and these scaled scores were used for the calculations.
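The scheme for Sub-test I can be sketched in Python as follows. This is only an illustrative sketch: the function name, the polarity flags and the sample answer sheet are hypothetical and are not taken from the test materials.

```python
# Marks for a three-choice item; negative statements are keyed in reverse.
POSITIVE_KEY = {"Y": 2, "U": 1, "N": 0}
NEGATIVE_KEY = {"Y": 0, "U": 1, "N": 2}


def score_three_choice(responses, is_positive):
    """Return the scaled sub-test score T/2 for a list of Y/U/N responses.

    responses   -- list of "Y", "U" or "N", one entry per item
    is_positive -- list of booleans, True where the statement is positive
    """
    if len(responses) != len(is_positive):
        raise ValueError("one polarity flag is needed per response")
    total = 0
    for answer, positive in zip(responses, is_positive):
        key = POSITIVE_KEY if positive else NEGATIVE_KEY
        total += key[answer]
    return total / 2  # dividing by the constant 2 makes each item worth at most 1


# Hypothetical 26-item sheet (assumed split: 20 positive, 6 negative statements),
# with every item answered in the keyed direction.
polarity = [True] * 20 + [False] * 6
best_sheet = ["Y"] * 20 + ["N"] * 6
print(score_three_choice(best_sheet, polarity))  # 26.0
```

A sheet answered entirely in the keyed direction thus reaches the maximum of 26, in agreement with the maximum attainable score stated above.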


SUB-TEST II: TEACHER’S EXCELLENCY IN SUBJECT-

CONTENT, TEACHING METHODS AND EVALUATION IN

EDUCATION

This sub-test contains multiple-choice questions. Each item had four alternative choices: A, B, C and D. Each correct choice carried ‘1’ mark and ‘0’ mark was given for any other choice. The maximum and minimum possible scores on this sub-test were ‘39’ and ‘0’ respectively.

SUB-TEST III: TEACHER’S COMMITMENT

This sub-test contains the same type of questions as sub-test I. The maximum possible score on this sub-test was ‘27’ and the minimum was ‘0’.

SUB-TEST IV: TEACHER’S HUMAN RELATIONSHIPS AND

SOCIAL ACCOUNTABILITY

This sub-test uses a 5-point scale.

For each positive statement,

‘4’ marks for Fully Agree (SA)

‘3’ marks for Agree (A)

‘2’ marks for Neutral (U)

‘1’ mark for Disagree (D)

‘0’ mark for Fully Disagree (SD)

For each negative statement,

‘4’ marks for Fully Disagree (SD)

‘3’ marks for Disagree (D)


‘2’ marks for Neutral (U)

‘1’ mark for Agree (A)

‘0’ mark for Fully Agree (SA)

Each statement carried a maximum of ‘4’ marks and a minimum of ‘0’ mark. The score on this sub-test was obtained by dividing the total marks obtained (T) by 4, i.e. T/4.

There were 26 items in this sub-test, so the maximum possible total was 104 and the minimum was ‘0’ (zero). Dividing the total by the constant 4 gave a maximum possible score of ‘26’ and a minimum of ‘0’, and these scaled scores were used for the calculations.
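The five-point scheme can be sketched in the same way. Here a negative statement is scored as 4 minus the mark of the positive key, which reproduces the two lists given above; the function name and the sample responses are hypothetical.

```python
# Marks for a positive statement on the five-point scale.
SCALE_MARKS = {"SA": 4, "A": 3, "U": 2, "D": 1, "SD": 0}


def score_five_point(responses, is_positive):
    """Return the scaled sub-test score T/4 for five-point responses.

    responses   -- list of "SA", "A", "U", "D" or "SD", one entry per item
    is_positive -- list of booleans, True where the statement is positive
    """
    if len(responses) != len(is_positive):
        raise ValueError("one polarity flag is needed per response")
    total = 0
    for answer, positive in zip(responses, is_positive):
        marks = SCALE_MARKS[answer]
        # Reverse the key for negative statements: SD earns 4, SA earns 0.
        total += marks if positive else 4 - marks
    return total / 4  # dividing by the constant 4 makes each item worth at most 1


# Hypothetical four-item extract: marks are 3 + 4 + 2 + 3 = 12, so T/4 = 3.0.
print(score_five_point(["A", "SD", "U", "D"], [True, False, True, False]))  # 3.0
```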

SUB-TEST V: MENTAL ABILITIES

This sub-test was divided into 3 sections (Sections I, II and III). All the sections contain the same type of questions as sub-test II. The maximum possible score on this sub-test was 65.

Thus, the maximum possible score on sub-tests I to V together was 183.

4.2.1.5 Item Analysis

Item analysis of a test gives two kinds of information: the difficulty index of each item and an index of its validity. Here, item validity means how well the item discriminates between respondents who score high and those who score low on the test as a whole. This information is valuable for many reasons. It provides an opportunity to identify the right items for the test. That is why it is always desirable to include surplus items in the Pre-Pilot Test, so that the items that appear best in terms of item statistics can be selected for the try-out and final forms of the test.

4.2.1.6 Discriminative power

Many techniques have been developed to show the degree to which an item discriminates between high-ability and low-ability pupils. In the present test, T. L. Kelley’s 27% method was adopted, in which two extreme groups are formed on the basis of the total score on the test.

Test booklets that were found incomplete were rejected, and the booklets of 224 respondents were retained. These 224 booklets were arranged in descending order of score. Then 27% of the booklets from the top and 27% from the bottom, i.e. 60 booklets from each end, were separated for the purpose of item analysis. The middle 46% were not considered for item analysis.
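The formation of the two extreme groups can be sketched as follows. The sketch assumes that the group size is 27% of the number of booklets rounded to the nearest whole number, which gives 60 for 224 booklets; the function name and the placeholder scores are hypothetical.

```python
def extreme_groups(scores, fraction=0.27):
    """Split total scores into upper and lower extreme groups.

    scores   -- list of total test scores, one per respondent
    fraction -- share of respondents placed in each extreme group
    Returns (upper, lower); the middle scores are left out.
    """
    ranked = sorted(scores, reverse=True)  # descending order of score
    group_size = round(len(ranked) * fraction)
    if group_size == 0:
        raise ValueError("too few scores to form extreme groups")
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    return upper, lower


# Placeholder scores for 224 respondents (not the actual data).
scores = list(range(224))
upper, lower = extreme_groups(scores)
print(len(upper), len(lower))  # 60 60
```

With 224 booklets this leaves 104 booklets in the middle, which is about 46% of the total.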

The next step was to find the number of respondents answering each item correctly in the upper 27% and in the lower 27% of the group. These two groups were named the Upper Group and the Lower Group. The item-wise figures are given in Table 4.2 and a summary of the discriminative values is presented in Table 4.3.

The discriminative power of the items was read directly from the Flanagan Table, a table of the values of the product-moment coefficient of correlation in a normal bivariate population corresponding to given proportions of success. The use of this method is supported by Green Gerberich⁴:

“This method of determining the Discriminative power of test items is widely used in the critical analysis of the test items for standardised test.”


4.2.1.7 Difficulty Index

In test construction, it is common practice to attempt to construct items covering a wide range of difficulty. According to Green Gerberich⁵,

“The test as whole should have about 50% difficulty for average pupils.”

Therefore, an item should be neither so easy that every participant in the group passes it nor so difficult that every participant fails it, because in neither of these extreme cases does the item contribute to the discrimination that the test is meant to make among different individuals.

The difficulty values or indices of the items of the present test were determined from the data of the upper and lower 27% groups. For each item, the percentages of respondents answering it correctly in the upper group and in the lower group were added and the sum was divided by 2 (two).

The same is expressed symbolically below, along with illustrations:

Difficulty Index (DI) = (% correct in Upper Group + % correct in Lower Group) / 2

Illustration: Test I : Item 1. DI = (82 + 42) / 2 = 62

Test II : Item 1. DI = (72 + 35) / 2 = 53.5, entered as 53 in Table 4.2

In this way, the difficulty index of each item was computed. Table 4.2 shows the number of respondents from the Upper and Lower groups answering each item correctly, together with the difficulty index and discriminative value of each item of the Pre-Pilot Test. Whether an item was retained or rejected is shown in the last column of the table.
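The computation of the difficulty index can be sketched as follows. The sketch works from the number of correct responses in each group of 60 rather than from the rounded percentages; this is an assumption about how the tabulated values were obtained, made because it reproduces both illustrations (62 and 53) after rounding. The function name is hypothetical.

```python
def difficulty_index(upper_correct, lower_correct, group_size):
    """Mean of the percentages of correct responses in the two extreme groups.

    upper_correct -- number of correct responses in the Upper Group
    lower_correct -- number of correct responses in the Lower Group
    group_size    -- number of respondents in each group
    """
    upper_pct = 100 * upper_correct / group_size
    lower_pct = 100 * lower_correct / group_size
    return (upper_pct + lower_pct) / 2


# Counts for item 1 of Sub-test 1 and item 1 of Sub-test 2 (Table 4.2),
# with 60 respondents in each group.
print(round(difficulty_index(49, 25, 60)))  # 62
print(round(difficulty_index(43, 21, 60)))  # 53
```

Working from the rounded percentages 72 and 35 would give 53.5 for the second item, whereas the unrounded counts give about 53.3, which matches the tabulated value of 53.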

TABLE 4.2

No. of Respondents from Upper and Lower Groups Answering Each Item Correctly, with the Difficulty Index and Discriminative Value of the Pre-Pilot Test

Columns: (1) Item No.; (2) Upper Group (Rh), no. of correct responses; (3) Upper Group, % of correct responses; (4) Lower Group (R1), no. of correct responses; (5) Lower Group, % of correct responses; (6) Difficulty Index; (7) Discriminative Value; (8) Remarks

1 2 3 4 5 6 7 8

SUB-TEST : 1 INNOVATION-RESEARCH IN EDUCATION AND INTEREST AND ATTITUDE TOWARDS TEACHING

1 49 82 25 42 62 43 Retained

2 34 57 15 25 41 34 Retained

3 38 63 22 37 50 27 Retained

4 47 78 34 57 68 25 Retained

5 25 42 21 35 38 09 Rejected

6 59 98 23 38 68 73 Retained

7 52 87 19 32 59 56 Retained

8 36 60 30 50 55 10 Rejected

9 42 70 24 40 55 31 Retained

10 50 83 37 62 73 25 Retained

11 46 77 13 22 49 54 Retained

12 48 80 18 30 55 51 Retained

13 28 47 26 43 45 04 Rejected

14 30 50 27 45 48 06 Rejected

15 51 85 29 48 67 40 Retained

16 55 92 30 50 71 53 Retained

17 48 80 37 62 71 22 Retained

18 40 67 28 47 57 21 Retained

19 44 73 16 27 50 46 Retained

20 47 78 22 37 58 41 Retained

21 57 95 25 42 68 61 Retained

22 56 93 18 30 62 65 Retained

23 50 83 28 47 65 39 Retained

24 38 63 18 30 47 33 Retained

25 48 80 24 40 60 42 Retained

26 36 60 22 37 48 25 Retained

SUB-TEST : 2 TEACHER’S MASTERY IN SUBJECT-CONTENT, TEACHING METHODS AND EVALUATION IN EDUCATION

1 43 72 21 35 53 39 Retained


2 23 38 19 32 35 07 Rejected

3 25 42 13 22 32 23 Retained

4 43 72 27 45 58 29 Retained

5 32 53 15 25 39 30 Retained

6 40 67 29 48 58 19 Rejected

7 44 73 30 50 62 23 Retained

8 35 58 14 23 41 38 Retained

9 25 42 16 27 34 18 Rejected

10 39 65 22 37 51 29 Retained

11 27 45 14 23 34 25 Retained

12 26 43 34 57 50 -14 Rejected

13 45 75 29 48 62 28 Retained

14 42 70 27 45 58 27 Retained

15 52 87 10 17 52 68 Retained

16 36 60 22 37 48 25 Retained

17 46 77 26 43 60 36 Retained

18 53 88 33 55 72 41 Retained

19 50 83 18 30 57 53 Retained

20 35 58 25 42 50 16 Rejected

21 43 72 13 22 47 50 Retained

22 38 63 20 33 48 31 Retained

23 47 78 31 52 65 29 Retained

24 28 47 37 62 54 -16 Rejected

25 44 73 26 43 58 31 Retained

26 37 62 22 37 49 27 Retained

27 49 82 19 32 57 51 Retained

28 55 92 16 27 59 68 Retained

29 36 60 24 40 50 21 Retained

30 39 65 29 48 57 17 Rejected

31 40 67 23 38 53 29 Retained

32 42 70 25 42 56 29 Retained

33 34 57 17 28 43 29 Retained

34 31 52 14 23 38 33 Retained

35 35 58 20 33 46 27 Retained

36 30 50 17 28 39 23 Retained

37 53 88 18 30 59 60 Retained

38 38 63 21 35 49 29 Retained

39 47 78 32 53 66 29 Retained

SUB-TEST : 3 TEACHER'S COMMITMENT

1 52 87 38 63 75 31 Retained

2 45 75 35 58 67 18 Rejected

3 55 92 40 67 79 38 Retained

4 25 42 28 47 44 -04 Rejected


5 42 70 25 42 56 29 Retained

6 54 90 43 72 81 28 Retained

7 44 73 22 37 55 37 Retained

8 48 80 29 48 64 35 Retained

9 49 82 36 60 71 27 Retained

10 51 85 40 67 76 24 Retained

11 39 65 21 35 50 31 Retained

12 35 58 16 27 43 33 Retained

13 28 47 18 30 38 17 Rejected

14 38 63 24 40 52 22 Retained

15 40 67 26 43 55 25 Retained

16 56 93 42 70 82 35 Retained

17 39 65 20 33 49 31 Retained

18 45 75 28 47 61 30 Retained

19 53 88 38 63 76 34 Retained

20 58 97 46 77 87 40 Retained

21 35 58 19 32 45 27 Retained

22 49 82 38 63 73 25 Retained

23 53 88 43 72 80 24 Retained

24 56 93 40 67 80 38 Retained

25 51 85 32 53 69 37 Retained

26 38 63 28 47 55 16 Rejected

27 48 80 36 60 70 24 Retained

SUB-TEST : 4 TEACHER'S HUMAN RELATIONSHIPS AND SOCIAL DEDICATION

1 46 77 33 55 66 24 Retained

2 36 60 26 43 52 18 Rejected

3 40 67 22 37 52 31 Retained

4 37 62 18 30 46 33 Retained

5 47 78 28 47 63 34 Retained

6 49 82 36 60 71 27 Retained

7 48 80 30 50 65 33 Retained

8 30 50 20 33 42 19 Rejected

9 50 83 32 53 68 34 Retained

10 49 82 26 43 63 43 Retained

11 53 88 30 50 69 45 Retained

12 42 70 25 42 56 29 Retained

13 46 77 19 32 54 45 Retained

14 34 57 20 33 45 25 Retained

15 52 87 38 63 75 32 Retained

16 50 83 37 62 73 25 Retained

17 46 77 32 53 65 26 Retained

18 38 63 36 60 62 02 Rejected

19 40 67 27 45 56 23 Retained


20 46 77 30 50 63 28 Retained

21 56 93 22 37 65 61 Retained

22 52 87 24 40 63 50 Retained

23 36 60 35 58 59 02 Rejected

24 48 80 28 47 63 37 Retained

25 32 53 20 33 43 21 Retained

26 54 90 41 68 79 32 Retained

SUB-TEST : 5 MENTAL ABILITIES

SECTION : 1 NUMBER SERIES

1 25 42 14 23 33 23 Retained

2 30 50 18 30 40 21 Retained

3 28 47 12 20 33 29 Retained

4 36 60 16 27 43 35 Retained

5 18 30 13 22 26 10 Rejected

6 26 43 20 33 38 11 Rejected

7 34 57 18 30 43 27 Retained

8 20 33 19 32 33 0 Rejected

9 24 40 21 35 38 06 Rejected

10 40 67 26 43 55 25 Retained

11 32 53 15 25 39 30 Retained

12 22 37 17 28 33 09 Rejected

13 38 63 19 32 48 31 Retained

14 23 38 10 17 28 28 Retained

15 31 52 15 25 38 30 Retained

16 24 40 14 23 32 21 Retained

17 31 52 10 17 34 40 Retained

18 30 50 18 30 40 21 Retained

19 29 48 16 27 38 24 Retained

20 18 30 13 22 26 10 Rejected

SECTION : 2 WORD ANALOGY

1 27 45 12 20 33 27 Retained

2 32 53 18 30 42 23 Retained

3 34 57 20 33 45 25 Retained

4 46 77 13 22 49 54 Retained

5 36 60 34 57 58 04 Rejected

6 30 50 16 27 38 26 Retained

7 38 63 34 57 60 06 Rejected

8 50 83 26 43 63 43 Retained

9 40 67 29 48 58 19 Rejected

10 53 88 22 37 63 55 Retained

11 0 0 0 0 0 0 Rejected

12 37 62 20 33 48 31 Retained

13 32 53 17 28 41 26 Retained

14 48 80 38 63 72 22 Retained


15 36 60 20 33 47 29 Retained

16 24 40 18 30 35 11 Rejected

17 43 72 29 48 60 26 Retained

18 49 82 20 33 58 51 Retained

19 45 75 19 32 53 42 Retained

20 35 58 21 35 47 25 Retained

21 47 78 28 47 63 34 Retained

SECTION : 3 WORD RELATION

1 40 67 22 37 52 31 Retained

2 32 53 28 47 50 06 Rejected

3 36 60 18 30 45 31 Retained

4 32 53 16 27 40 28 Retained

5 38 63 20 33 48 31 Retained

6 42 70 24 40 55 31 Retained

7 20 33 17 28 31 04 Rejected

8 37 62 23 38 50 25 Retained

9 46 77 34 57 67 22 Retained

10 50 83 36 60 72 27 Retained

11 52 87 42 70 78 22 Retained

12 36 60 32 53 57 08 Rejected

13 42 70 30 50 60 21 Retained

14 34 57 22 37 47 21 Retained

15 38 63 29 48 56 15 Rejected

16 42 70 28 47 58 25 Retained

17 39 65 21 35 50 31 Retained

18 44 73 20 33 53 40 Retained

19 32 53 30 50 52 02 Rejected

20 46 77 34 57 67 22 Retained

21 33 55 20 33 44 23 Retained

22 39 65 32 53 59 13 Rejected

23 45 75 30 50 63 26 Retained

24 34 57 28 47 52 08 Rejected

The grouping of the items of the Pre-Pilot form of the test by difficulty index, as shown in Table 4.2, is summarized in the comprehensive Table 4.2 (A) given below:

TABLE 4.2 (A)

Comprehensive Distribution of Items of the Pre-Pilot Test according to their Difficulty Indices on the Lines of Summer

DIFFICULTY      TOTAL NO. OF ITEMS      ITEMS REJECTED          ITEMS RETAINED
INDICES         NO.      % OF ITEMS     NO.      % OF ITEMS     NO.      % OF ITEMS

0.00 to 0.40     31       16.93          13       41.93          18       58.06

0.41 to 0.60     96       52.45          22       22.91          74       77.08

0.61 to 1.00     56       30.60           2        3.57          54       96.42

TOTAL           183      100.00          37       20.21         146       79.78

It can be noted from Table 4.2 (A) that, according to Summer’s distribution⁶, 20, 60 and 20 percent of the items should have difficulty indices of 0.00 to 0.40, 0.41 to 0.60 and 0.61 to 1.00 respectively. Analysis of the present test showed that about 17, 52 and 31 percent of the items actually fell in these ranges. This picture is fairly close to Summer’s distribution.
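The comparison with the recommended distribution can be sketched as follows, using the item counts of Table 4.2 (A) and the 20, 60 and 20 percent shares attributed above to Summer. The variable and function names are hypothetical.

```python
# Number of Pre-Pilot items in each difficulty band (Table 4.2 (A)).
observed_counts = {"0.00-0.40": 31, "0.41-0.60": 96, "0.61-1.00": 56}
# Recommended percentage of items in each band, as stated in the text.
recommended_pct = {"0.00-0.40": 20, "0.41-0.60": 60, "0.61-1.00": 20}


def percentage_shares(counts):
    """Convert item counts per band into percentages of the total."""
    total = sum(counts.values())
    return {band: 100 * n / total for band, n in counts.items()}


shares = percentage_shares(observed_counts)
for band, pct in shares.items():
    gap = pct - recommended_pct[band]
    print(f"{band}: observed {pct:.2f}%, recommended {recommended_pct[band]}%, gap {gap:+.2f}")
```

The largest gap is in the easiest band (0.61 to 1.00), which holds roughly ten percentage points more items than recommended.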

TABLE 4.2 (B)

Comprehensive Distribution of Items of the Pre-Pilot Test according to their Difficulty Indices on the Lines of Garrett

DIFFICULTY      TOTAL NO. OF ITEMS      ITEMS REJECTED          ITEMS RETAINED
INDICES         NO.      % OF ITEMS     NO.      % OF ITEMS     NO.      % OF ITEMS

0.00 to 0.25     01        0.54          01      100.00          00       00

0.26 to 0.75    172       93.98          36       20.93         136       79.06

0.76 to 1.00     10        5.46          00       00             10      100

TOTAL           183      100.00          37       20.21         146       79.78

As per Garrett’s distribution⁷, about 46, 92 and 46 of the 183 test items of the Pre-Pilot Test, i.e. 25, 50 and 25 percent of the total, should fall in the difficulty-index ranges 0.00 to 0.25, 0.26 to 0.75 and 0.76 to 1.00 respectively. In the present Pre-Pilot Test, 01, 172 and 10 items, i.e. 0.54, 93.98 and 5.46 percent of the 183 items, fell in these ranges. The data stand in obvious contrast to Garrett’s distribution and are comparatively nearer to Summer’s distribution. The reason for this contrast lies in the way the test items were selected or rejected: the items of the Pre-Pilot Test were retained for, or rejected from, the pilot form of the test not on the basis of their difficulty indices but on the basis of their discriminating values.

The importance of the discriminative index of items and the method of calculating it have been discussed in the earlier pages. For the present research, items with a discriminating index of 0.20 or more were selected for the pilot form of the test.
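The selection rule can be sketched as follows. The values are those of the first eight items of Sub-test 1 in Table 4.2, on the assumption that the table records the index in hundredths (so that 43 stands for 0.43); the function name is hypothetical.

```python
THRESHOLD = 0.20  # minimum discriminating index for retention


def select_items(discrimination_by_item, threshold=THRESHOLD):
    """Split items into retained and rejected lists by discriminating index."""
    retained = [item for item, d in discrimination_by_item.items() if d >= threshold]
    rejected = [item for item, d in discrimination_by_item.items() if d < threshold]
    return retained, rejected


# Item number -> discriminating index, first eight items of Sub-test 1.
sub_test_1 = {1: 0.43, 2: 0.34, 3: 0.27, 4: 0.25, 5: 0.09, 6: 0.73, 7: 0.56, 8: 0.10}
retained, rejected = select_items(sub_test_1)
print(retained)  # [1, 2, 3, 4, 6, 7]
print(rejected)  # [5, 8]
```

The result agrees with the remarks column of Table 4.2, where items 5 and 8 of Sub-test 1 are marked as rejected.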

The item analysis of the Pre-Pilot Test gives the discriminating index of each test item, as shown in Table 4.2. The results can be read more comprehensively from Table 4.3.

TABLE 4.3

Discriminative Value of the Pre-Pilot Test Items:

A Comprehensive View

DISCRIMINATING INDEX NO. OF ITEMS % OF ITEMS

0.19 and below 37 20.21

0.20 to 0.30 76 41.53

0.31 to 0.40 42 22.95

0.41 to 0.50 12 06.55

0.51 to 0.60 10 05.46

0.61 to 0.70 05 02.73

0.71 to 0.80 01 0.54

TOTAL : 183 100.00

It can be seen from Table 4.3 that, of the 183 test items of the Pre-Pilot Test, 37 items, i.e. about 20 percent, showed a discriminative value of 0.19 or below. These items were therefore rejected from the Pilot Form of the test, so that 146 of the 183 items were selected for it. The selection should also be viewed sub-test by sub-test. The whole test consisted of 5 sub-tests, the 5th of which is in 3 sections. Table 4.4 shows the sub-test wise distribution of the test items selected for the Pilot Form on the basis of their discriminative power.

TABLE 4.4

Sub-test wise Distribution of the Test Items Selected for the Pilot Form on the basis of their Discriminative Power

SUB-     NO. OF ITEMS IN     ITEMS BELOW THE DISCRIMINATIVE     ITEMS RETAINED FOR
TEST     PRE-PILOT TEST      VALUE 0.20                         PILOT FORM
                             NO. OF ITEMS    % OF ITEMS         NO. OF ITEMS    % OF ITEMS

I         26                 03              11.53               23             88.46

II        39                 06              15.38               33             84.61

III       27                 04              14.81               23             85.18

IV        26                 04              15.38               22             84.61

V         65                 20              30.76               45             69.23

TOTAL    183                 37              20.21              146             79.78

Table 4.4 indicates that, out of 26, 39, 27, 26 and 65 test items in sub-tests I, II, III, IV and V respectively, 23, 33, 23, 22 and 45 test items were selected for the subsequent Pilot Test. The rejection of only about 20 percent of the items speaks well of the care taken in constructing the Pre-Pilot Test.

4.2.2 Pilot Test :

The pilot form of the test was constructed on the basis of the results of the Pre-Pilot Test.

4.2.2.1 Selection of the test Items :

The construction of the test items had already been scrutinised at the Pre-Pilot stage. Items found to be ambiguous were modified and items showing discriminating indices below 0.20 were rejected. This gave the final form to the items of the Pilot Test.

4.2.2.2 Sample :

The Pilot Form of the test was administered to a total sample of 500 trainees of B.Ed. colleges of Gujarat. Details of the institutions selected and the number of participants retained for the Pilot Form are shown in Table 4.5:

TABLE 4.5

College & Sex-wise Distribution of the Sample for Pilot Test

Sr. No.  Name of College                                                 Male  Female  Total

1        Lalitaba Edu. Trust Sanchit B.Ed. College, Modasa.                25      25     50

2        Shree G. H. Patel College of Education, Patan.                    25      25     50

3        M. B. Patel College of Education, Sardar Patel University,
         Vallabh Vidyanagar.                                               50      50    100

4        Shree Vestabhai H. Patel College of B.Ed. (Girls),
         Dharampur, Dist. Valsad                                           25      25     50

5        Smt. S.B. Gardi B.Ed. College, Kharva Road, Opp. Ramroti
         Ashram At. Dhrol. Dist. Jamnagar.                                 25      25     50

6        Christian College of Education I. P. Mission Compound, Anand      25      25     50

7        College of Education, Shree Satyanarayan Temple Campus,
         Kathiria, Nani Daman                                              25      25     50

         Total:                                                           200     200    400

From Table 4.5 it can be seen that, of the 500 trainees to whom the test was administered, only 400 had completed their answer sheets according to the instructions; the rest were discarded because of incomplete answer sheets or improper marking of answers in the answer sheets. The analysis of the pilot test was thus based on a sample of 400 trainees.

4.2.2.3 Administration of Pilot Test

The personal presence of the researcher during the administration of the pilot test helped greatly in establishing rapport with the respondents, in estimating the approximate time required for the Final Form of the test, and in obtaining clarifications regarding the instructions and the test items. It was found necessary to specify that the items did not all follow one and the same rule (viz. addition, subtraction, etc.) but that each item followed a different rule. No supplementary instructions or specifications were required in any other sub-test. An overall check was needed at the end to see whether any respondent had left a test item unanswered, and in some instances trainees were asked to mark their answers to items they had left unanswered. Such instances were only occasional.

4.2.2.4 Element of Time

This was not a speed test. Hence, the respondents could take as much time as they needed to answer each item of the test. The overall time estimated for taking the entire pilot test was around 95 minutes. It was thought that an item would take about 45 seconds in general, but that the items of sub-test V, which involve mathematical calculation, would naturally take somewhat longer. Hence, sub-test V was expected to take 15 minutes more in addition to the total time required. This estimate was only for the researcher’s consideration, and the respondents were not informed of it. In practice, the administration of the Pilot Form of the test took 90 minutes.

4.2.2.5 Scoring Scheme

The test items for the Pilot Form of the test were drawn from the Pre-Pilot Form of the test, with the necessary modifications in the construction of the statements. There was no change in the form of the test or of the sub-tests. There were only minor changes in the order of the statements and in the number of negative statements. Hence, the scoring scheme for the Pilot Form of the test remained the same as that of the Pre-Pilot Test. The total score of the Pilot Test was accordingly 146.

4.2.2.6 Analysis and Interpretation of Results

After the administration of the Pilot Form of the test, the answer sheets were assessed according to the scoring scheme. The answer sheets were then arranged in descending order of score and numbered from 1 to 400: the sheet with the highest score was numbered 1, the others were numbered in order of their scores, and the sheet with the lowest score was numbered 400 (the last). The upper and lower groups were formed as per the scheme described for the Pre-Pilot Form of the test.

The upper group consisted of the 27% (i.e. 108) highest scorers and the lower group of the 27% (i.e. 108) lowest scorers. This means that answer sheets no. 1 to 108 formed the Upper Group and answer sheets no. 293 to 400 formed the Lower Group. The number of respondents who answered each item correctly was found for each of the two groups and is shown in Table 4.6, together with the corresponding percentages for the Upper Group (Rh) and the Lower Group (R1).

The discriminative value of each statement was calculated using the Flanagan Table as given in the book Statistical Inference by Helen M. Walker and Joseph Lev (1965)⁸. The difficulty index of each statement was obtained by the procedure discussed earlier in this chapter (4.2.1.7). Table 4.6 shows the group-wise number and percentage of correct responses, the difficulty index and the discriminative value of each item of the Pilot Form of the test. Whether an item was retained or rejected is shown in the last column of the table.

TABLE 4.6

No. of Respondents from Upper and Lower Groups Answering Each Item Correctly, with the Difficulty Index and Discriminative Value of the Pilot Test

Columns: (1) Item No.; (2) Upper Group (Rh), no. of correct responses; (3) Upper Group, % of correct responses; (4) Lower Group (R1), no. of correct responses; (5) Lower Group, % of correct responses; (6) Difficulty Index; (7) Discriminative Value; (8) Remarks

1 2 3 4 5 6 7 8

SUB-TEST : 1 INNOVATION-RESEARCH IN EDUCATION AND INTEREST AND ATTITUDE TOWARDS TEACHING

1 89 82 50 46 64 39 Retained

2 74 69 45 42 55 27 Retained

3 68 63 55 51 57 13 Rejected

4 60 56 36 36 46 21 Retained

5 79 73 46 43 58 31 Retained

6 92 85 38 35 60 52 Retained

7 82 76 52 48 62 30 Retained

8 80 74 62 57 66 20 Retained

9 96 89 39 36 63 55 Retained

10 68 63 54 50 56 13 Rejected

11 81 75 60 56 65 20 Retained

12 75 69 50 46 58 23 Retained

13 78 72 48 44 58 29 Retained

14 60 56 36 33 44 25 Retained

15 94 87 48 44 66 47 Retained

16 77 71 44 41 56 31 Retained

17 97 90 51 47 69 51 Retained

18 76 70 54 50 60 21 Retained

19 64 59 40 37 48 22 Retained

20 98 91 52 48 69 50 Retained

21 68 63 46 43 53 20 Retained

22 76 70 42 39 55 33 Retained

23 83 77 65 60 69 18 Rejected

SUB-TEST : 2 TEACHER’S MASTERY IN SUBJECT-CONTENT, TEACHING METHODS AND EVALUATION IN EDUCATION

1 83 77 46 43 60 36 Retained

2 65 60 39 36 48 25 Retained

3 73 68 53 49 58 21 Retained

4 92 85 45 42 63 45 Retained

5 74 69 50 46 57 23 Retained


6 65 60 66 61 61 13 Rejected

7 79 73 42 39 56 22 Retained

8 87 81 48 44 63 39 Retained

9 95 88 59 55 71 41 Retained

10 82 76 64 59 68 20 Retained

11 62 57 38 35 46 23 Retained

12 66 61 42 39 50 22 Retained

13 84 78 52 48 63 33 Retained

14 93 86 66 61 74 33 Retained

15 70 65 46 43 54 22 Retained

16 94 87 44 41 64 50 Retained

17 98 91 60 56 73 43 Retained

18 87 81 68 63 72 22 Retained

19 77 71 51 47 59 25 Retained

20 97 90 43 40 65 56 Retained

21 69 64 50 46 55 19 Rejected

22 47 44 48 44 44 0 Rejected

23 66 61 40 37 49 25 Retained

24 60 56 36 33 44 25 Retained

25 72 67 50 46 56 21 Retained

26 71 66 46 43 54 25 Retained

27 64 59 51 47 53 12 Rejected

28 80 74 53 49 62 28 Retained

29 75 69 62 57 63 15 Rejected

30 93 86 55 51 69 42 Retained

31 88 81 67 62 72 22 Retained

32 76 70 54 50 60 21 Retained

33 70 65 50 46 56 19 Rejected

SUB TEST : 3 TEACHER'S COMMITMENT

1 104 96 70 65 81 51 Retained

2 97 90 86 80 85 18 Rejected

3 84 78 50 46 62 34 Retained

4 98 91 68 63 77 38 Retained

5 88 81 44 41 61 42 Retained

6 86 80 72 67 73 17 Rejected

7 89 82 72 67 75 20 Retained

8 74 69 62 57 63 13 Rejected

9 79 73 52 48 61 26 Retained

10 85 79 51 47 63 34 Retained

11 78 72 54 50 61 23 Retained

12 80 74 57 53 63 24 Retained

13 75 69 48 44 57 25 Retained

14 79 73 46 43 58 31 Retained


15 85 79 52 48 63 33 Retained

16 93 86 70 65 75 29 Retained

17 97 90 55 51 70 48 Retained

18 95 88 38 35 62 57 Retained

19 92 85 49 45 65 44 Retained

20 90 83 64 59 71 28 Retained

21 89 82 71 66 74 20 Retained

22 85 79 53 49 64 33 Retained

23 81 75 60 56 65 20 Retained

SUB-TEST : 4 TEACHER'S HUMAN RELATIONSHIPS AND SOCIAL DEDICATION

1 90 83 66 61 72 27 Retained

2 80 74 45 42 58 33 Retained

3 99 92 68 63 77 42 Retained

4 93 86 48 44 65 47 Retained

5 74 69 54 50 59 19 Rejected

6 84 78 60 56 67 25 Retained

7 74 69 61 56 63 13 Rejected

8 95 88 64 59 74 38 Retained

9 84 78 70 65 71 17 Rejected

10 86 80 38 35 57 47 Retained

11 92 85 52 48 67 40 Retained

12 94 87 56 52 69 40 Retained

13 84 78 50 46 62 34 Retained

14 96 89 59 55 72 41 Retained

15 80 74 55 51 63 26 Retained

16 68 63 64 59 61 4 Rejected

17 70 65 47 44 54 21 Retained

18 82 76 56 52 64 26 Retained

19 68 63 46 43 53 20 Retained

20 62 57 40 37 47 21 Retained

21 74 69 53 49 59 21 Retained

22 76 70 54 50 60 21 Retained

SUB TEST : 5 MENTAL ABILITIES

SECTION : 1 NUMBER SERIES

1 58 54 38 35 44 21 Retained

2 60 56 40 37 46 21 Retained

3 59 55 35 32 44 23 Retained

4 74 69 41 38 53 31 Retained

5 68 63 37 34 49 29 Retained

6 80 74 46 43 58 33 Retained

7 48 44 42 39 42 6 Rejected

8 47 44 30 28 36 17 Rejected

9 73 68 38 35 51 35 Retained


10 62 57 45 42 50 14 Rejected

11 64 59 33 31 45 29 Retained

12 64 59 25 23 41 38 Retained

13 68 63 39 36 50 27 Retained

14 58 54 32 30 42 25 Retained

SECTION : 2 WORD ANALOGY

1 54 50 29 27 38 26 Retained

2 68 63 36 33 48 31 Retained

3 84 78 40 37 57 43 Retained

4 78 72 59 55 63 19 Rejected

5 60 56 34 31 44 27 Retained

6 94 87 52 48 68 43 Retained

7 98 91 48 44 68 53 Retained

8 70 65 54 50 57 15 Rejected

9 82 76 51 47 62 32 Retained

10 78 72 50 46 59 27 Retained

11 66 61 44 41 51 21 Retained

12 80 74 63 58 66 18 Rejected

13 76 70 57 53 62 19 Rejected

14 90 83 60 56 69 30 Retained

15 70 65 46 43 54 22 Retained

SECTION : 3 WORD RELATION

1 80 74 44 41 57 35 Retained

2 70 65 51 47 56 19 Rejected

3 68 63 32 30 46 33 Retained

4 88 81 46 43 62 40 Retained

5 84 78 50 46 62 34 Retained

6 67 62 45 42 52 20 Retained

7 82 76 65 60 68 18 Rejected

8 98 91 70 65 78 36 Retained

9 94 87 60 56 71 36 Retained

10 86 80 48 44 62 39 Retained

11 70 65 46 43 54 22 Retained

12 84 78 40 37 57 43 Retained

13 74 69 54 50 59 19 Rejected

14 81 75 52 48 62 28 Retained

15 75 69 55 51 60 19 Rejected

16 64 59 43 40 50 18 Rejected

The data of Table 4.6 are summarised and interpreted below. The items have been grouped on the basis of their difficulty indices. The difficulty indices of the items listed in Table 4.6 are summarised in Table 4.6 (A) and Table 4.6 (B), where the items are grouped according to the schemes of Summer and Garrett respectively.

TABLE 4.6 (A)

Comprehensive Distribution of Items of the Pilot Test on the Lines of Summer

DIFFICULTY      TOTAL NO. OF ITEMS      ITEMS REJECTED          ITEMS RETAINED
INDICES         NO.       %             NO.       %             NO.       %
0.20 to 0.40    03        02.05         01        03.57         02        01.69
0.41 to 0.60    71        48.63         14        50.00         57        48.30
0.61 to 1.00    72        49.31         13        46.42         59        50.00
TOTAL           146       100.00        28        19.17         118       80.82

As per Summer's distribution, 20, 60 and 20 per cent of the items should have difficulty indices of 0.20 to 0.40, 0.41 to 0.60 and 0.61 to 1.00 respectively. For the present pilot test of 146 items, this means 29, 88 and 29 items respectively in these ranges. Analysis of the present pilot test indicated that about 2, 48 and 49 per cent of the items actually fell in these ranges. This result differs from Summer's distribution.
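The comparison with Summer's scheme can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the observed counts are taken from the "TOTAL NO. OF ITEMS" column of Table 4.6 (A).

```python
def expected_counts(total_items, proportions):
    """Ideal number of items per difficulty band, rounded to whole items."""
    return [round(total_items * p) for p in proportions]


def observed_percentages(counts):
    """Share of items actually found in each band, in per cent."""
    total = sum(counts)
    return [100 * c / total for c in counts]


# Summer's scheme as quoted in the text: 20 %, 60 % and 20 % of the items in
# the bands 0.20-0.40, 0.41-0.60 and 0.61-1.00.
summer = [0.20, 0.60, 0.20]
bands = ["0.20-0.40", "0.41-0.60", "0.61-1.00"]
pilot_counts = [3, 71, 72]  # observed counts from Table 4.6 (A)

expected = expected_counts(146, summer)
observed = observed_percentages(pilot_counts)
for band, e, c, pct in zip(bands, expected, pilot_counts, observed):
    print(f"{band}: expected {e} items, observed {c} items ({pct:.1f} %)")
```

The same two helpers can be reused for any other target scheme by changing the list of proportions.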


TABLE 4.6 (B)

Comprehensive Distribution of Items of the Pilot Test on the Lines of Garrett

DIFFICULTY      TOTAL NO. OF ITEMS      ITEMS REJECTED          ITEMS RETAINED
INDICES         NO.       %             NO.       %             NO.       %
0.00 to 0.25    00        00            00        00            00        00
0.26 to 0.75    141       96.57         27        96.42         114       96.61
0.76 to 1.00    05        03.42         01        03.57         04        03.38
TOTAL           146       100.00        28        19.17         118       80.82

As per Garrett's distribution, out of the total 146 test items of the pilot test there should be 36, 73 and 36 items, making 25, 50 and 25 per cent of the test items, in the ranges of 0.00 to 0.25, 0.26 to 0.75 and 0.76 to 1.00 difficulty indices respectively. The analysis of the present pilot test showed 00, 141 and 05 items, making 00, 96 and 03 per cent of the total 146 items, in these ranges respectively. The data stand in obvious contrast to Summer's distribution and are comparatively nearer to Garrett's distribution.

Discriminative indices formed the basis for selecting test items for the final form of the test. This did not amount to ignoring the difficulty indices: while selecting the items for the Final form, care was taken to see that the difficulty indices of the selected items remained nearer to the distribution shown by Garrett. The total 118 test items included in the Final form of the test show the difficulty indices as follows:

TABLE 4.7

Difficulty Indices for Final Test

DIFFICULTY INDICES    NO. OF ITEMS IN FINAL TEST    % OF ITEMS IN FINAL TEST
0.20 to 0.40          02                            01.69
0.41 to 0.60          86                            72.88
0.61 to 1.00          30                            25.42
TOTAL                 118                           99.99

As said earlier, items from the pilot test were selected for the Final form of the test not on the basis of their difficulty value but on the basis of their discriminative indices. It can be seen from Table 4.6 that items having a discriminating index of 0.20 or more have been selected for the final form of the test. A comprehensive view of the discriminative values is given in Table 4.8 as follows:

TABLE 4.8

Discriminative Value of the Test Items: A Comprehensive View

DISCRIMINATING INDEX    NO. OF ITEMS    % OF ITEMS
0.19 and below          28              19.17
0.20 to 0.30            62              42.46
0.31 to 0.40            32              21.91
0.41 to 0.50            17              11.64
0.51 to 0.60            07              04.79
0.61 to 0.70            00              00
0.71 to 0.80            00              00
TOTAL                   146             100.00


It can be noted from Table 4.8 that items with a discriminative value of 0.19 or below were rejected for inclusion in the final test, just as was done for the Pre-Pilot Test. Out of the total 146 test items of the pilot form, 28 items, i.e. 19 per cent, show a discriminative value of 0.19 or below, and these items have been excluded from the Final Form of the test. Thus, of the 146 test items in the Pilot Form, 118 items have been selected for the Final Form. That about 81 per cent of the items could be retained indicates that the pilot test was carefully constructed.
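The selection rule described above is simple enough to state as code. The following is a minimal sketch, assuming that the discriminative values printed as whole numbers in Table 4.6 are hundredths (39 read as 0.39); the function name and the small sample of items are illustrative only.

```python
CUTOFF = 0.20  # items below this discriminative value are rejected


def decide(discriminative_value, cutoff=CUTOFF):
    """Return the decision for one item under the 0.20 rule."""
    return "Retained" if discriminative_value >= cutoff else "Rejected"


# (item number, discriminative value) for a few Sub-test 1 items of Table 4.6.
sample_items = [(1, 0.39), (3, 0.13), (8, 0.20), (23, 0.18)]
decisions = {item: decide(value) for item, value in sample_items}
print(decisions)

# Tally check against Table 4.8: 28 of the 146 items fall at 0.19 and below.
retained = 146 - 28
retained_pct = round(100 * retained / 146, 2)
print(retained, retained_pct)
```

Keeping the cut-off in a single named constant makes it easy to see how many items would survive a stricter or more lenient threshold.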

The selection should also be viewed from the point of view of the sub-tests. The whole test consisted of 5 sub-tests; the 5th one is in 3 sections, as it was in the Pre-Pilot form. Table 4.9 shows the sub-test wise total items and discriminative values.

TABLE 4.9

Sub-test wise Distribution of the Test Items Selected for the Final Form on the basis of their Discriminative Power

SUB      NO. OF ITEMS      ITEMS BELOW THE DISCRIMINATIVE      ITEMS RETAINED FOR
TEST     IN PILOT TEST     VALUE 0.20                          FINAL FORM
                           NO. OF ITEMS    % OF ITEMS          NO. OF ITEMS    % OF ITEMS
I        23                03              10.71               20              16.94
II       33                06              21.42               27              22.88
III      23                03              10.71               20              16.94
IV       22                04              14.28               18              15.25
V        45                12              42.85               33              27.96
TOTAL    146               28              19.17               118             80.82


Table 4.9 indicates that out of the total 23, 33, 23, 22 and 45 test items respectively in sub-tests I, II, III, IV and V of the Pilot test, 20, 27, 20, 18 and 33 test items have been selected for the subsequent Final Form of the Test.

4.2.3 Final Test :

This section discusses the final form of the test.

4.2.3.1 Construction of Final Test

The Final test was constructed on the basis of the analysis of the results of the pilot form. This final form was also prepared in five sub-tests.

As against 23, 33, 23, 22 and 45 test items (146 in total) in sub-tests I, II, III, IV and V of the Pilot test, 20, 27, 20, 18 and 33 test items (118 in total) were selected respectively for the five sub-tests of the Final Form.

Table 4.10 shows the distribution of test items in the final test form according to their discriminative values.

TABLE 4.10

Distribution of Test Items of the Final Test as per Discriminative Value

DISCRIMINATIVE      NO. OF TEST ITEMS IN SUB TESTS          TOTAL
VALUE               I      II     III    IV     V
0.51 to 0.55        1      1      1      1      1           5
0.46 to 0.50        1      1      1      1      2           6
0.41 to 0.45        3      0      2      1      0           6
0.36 to 0.40        0      6      2      2      6           16
0.31 to 0.35        4      2      3      4      6           19
0.26 to 0.30        7      3      6      5      5           26
0.21 to 0.25        4      14     5      4      13          40
TOTAL               20     27     20     18     33          118


Table 4.10 shows that no test item in any sub-test has a discriminative value above 0.55. Only 1 item in each sub-test falls in the range of 0.51 to 0.55. Similarly, only 1 test item in each sub-test falls in the range of 0.46 to 0.50, except in sub-test V, where 2 test items can be found. The ranges 0.36 to 0.40, 0.31 to 0.35 and 0.26 to 0.30 contain 16, 19 and 26 items respectively across all five sub-tests. The highest number of items, 40 in total, falls in the lowest range, 0.21 to 0.25.

4.2.3.2 Final Test and Answer sheet

The researcher had decided in advance to get the Final test printed at a press to make it more user-friendly, attractive and legible. The size of the test booklet was selected after referring to different sizes of test booklets. The matter of each of the five sub-tests was then handed over to the press with the necessary instructions. It was decided to attach to these five sub-tests two more ready-made tests used by the researcher: (1) the Emotional Intelligence test by Dr. Pallavi Patel and Hitesh Patel and (2) an adapted version of Gardner's Intelligence test. In total, one thousand test booklets and answer sheets were prepared for the final data collection. A copy of the Final test booklet with the answer key is given in Appendix – V.

General instructions for respondents were printed on the cover page. Instructions, along with illustrations, for each sub-test were given at the beginning of the concerned sub-test. The respondents had to fill in preliminary information at the top of the answer sheet before marking their answers on it. Each section of the test was paralleled on the answer sheet. Each item was presented in serial order with the numbers of all the alternative responses. The order of the distracters was changed in some of the test items, and the correct choices were placed at random so as to avoid any sequential order in the answers. No change in wording was made in any item; only the order of the distracters was changed as needed. The respondents had to indicate their choice by encircling the right option.

4.2.3.3 Sample for the Final Test

The sample for the final form of the test was drawn as ‘a classified cluster sample’. As many as 35 B.Ed. colleges of Gujarat were contacted for the final administration of the test in 2010-11. As a result, a fairly large number of 1000 respondents with correct responses could be obtained from 17 B.Ed. colleges spread over all four regions of Gujarat: South Gujarat, Middle Gujarat, Saurashtra and North Gujarat. Six universities of the state were taken as different strata and 17 colleges of these universities were considered as clusters. The sample drawn on this basis comes to 1000 B.Ed. students, 325 male and 675 female, as presented in detail in Table 3.2 in the previous chapter – 3.

4.2.3.4 Administration of the Final Test

The researcher administered the test personally at all places, after establishing the necessary rapport with the respondents and taking all precautions. The final test was administered in the months of December, January and February in the year 2010-11 with the prior contact and permission of the principals of the respective colleges.

4.2.3.5 Time Limit

Before giving a test to a large sample for establishing the norms, it is essential to fix an appropriate time limit for answering the test. The time limit to be fixed largely depends upon the purpose of the test. In the case of a power test, the time limit is fixed in such a way that almost all individuals have the opportunity to consider all the items of the test.

In the pilot test, about 40 to 50 seconds were required per item and a total of 120 minutes was calculated for the whole test, but the actual administration of the test took only 95 minutes. The time limit is generally decided by considering the record of the time taken by different individuals at the time of the pilot test. This gives only an approximate time. In order to decide the exact time to be allowed for answering the questions, some definite criteria have to be fixed. There are different views about the time to be allowed to answer the test, as is clear from the following statement of Ross.⁹

“Lindquist suggests that in general achievement tests, the

time allowance should be so adjusted that 75 percent of the pupils

will have time at least to consider all items in each section. Ruth

seemed to favour time limits so that 90% can attempt all items

within their power.”


From this it was decided that the time allowance should be fixed in such a way that 75% of the B.Ed. students would have time at least to consider all items. The time estimated for taking the entire final test was computed to be 120 minutes. There were 118 test items in the final form, and these would take about 80 minutes at the rate of about 40 seconds per item experienced in the pilot administration. Adding 05 minutes for distribution of the test booklets and answer sheets, 05 minutes for collecting them and 10 minutes more for the two additional tests, the total time required would be 100 minutes. The respondents took 90 minutes on average to respond to the whole test.

4.2.3.6 Scoring the Test

After administering the test, the next major task was scoring. The same scoring method as was followed in the pilot test was followed here also; no change was necessary in the scoring pattern. The scoring pattern is thus a standardised one for the present test.

No help was taken from any other person for scoring, so differences between scorers could not arise. The researcher personally assessed all the 1000 answer sheets. It took almost 1 month to compute the scores.

The maximum score that a respondent can obtain on this test is 118 and the minimum score is naturally zero. The highest score obtained on the test was 94 and the lowest score was 38.5.


o SETTING-UP DIRECTIONS FOR ADMINISTRATION AND SCORING; THE ESTABLISHMENT OF NORMS

4.3 Establishment of Norms

Before establishing the norms for the test, it was first essential to prepare a frequency distribution table of the teaching aptitude scores. Table 4.11 shows the frequency distribution of the scores of male and female respondents, separately and taken together, with their measures of central tendency and variability.

TABLE 4.11

Frequency Distribution of Teaching Aptitude Scores of Male and Female respondents with their Mean, Median, S.D. and other Measures of Variability

CLASS         MALE              FEMALE            TOTAL
INTERVAL      F       CF        F       CF        F       CF
91-95         1       325       9       675       10      1000
86-90         11      317       43      666       54      983
81-85         46      276       98      623       144     899
76-80         81      192       147     525       228     717
71-75         64      133       146     378       210     511
66-70         70      58        118     232       188     290
61-65         26      28        69      114       95      142
56-60         16      13        27      45        43      58
51-55         8       5         10      18        18      23
46-50         2       2         6       8         8       10
41-45         0       1         1       2         1       3
36-40         0       0         1       1         1       1
N             325               675               1000
Mean          73.10             73.73             73.52
S. D.         8.22              8.63              8.50
Mdn           73.50             74.25             74.00
Q             5.38              5.63              5.50
Q3            78.50             79.50             79.00
Q1            67.75             68.25             68.00


Table 4.11 indicates that there is a difference in the mean performance of male and female candidates. Therefore, before establishing sex norms, percentiles and standard scores, it was decided to test the significance of the mean difference. The t-test was used for this purpose. The summary of the t-ratio is given in Table 4.12.

4.3.1 Difference between the Mean Teaching Aptitude Scores of

Male and Female B.Ed. Students:

The t-value was calculated to check objective no. 8 and null hypothesis no. 12 of this study. The values are tabulated in Table 4.12.

Table 4.12

The Mean, SD and t-values of Teaching Aptitude Scores of Male and Female

SAMPLE     N      MEAN     SD      MEAN DIFFERENCE    t-VALUE    SIG. LEVEL
Male       325    73.10    8.22    0.63               1.13*      N.S.
Female     675    73.73    8.63

* Not Significant at 0.05 level

It is observed from Table 4.12 that the mean difference in performance between male and female B.Ed. students is 0.63 and the t-value is 1.13. The table values are t0.05 = 1.96 and t0.01 = 2.58 with df = 1000 as per Table C. The obtained t-value of 1.13 does not exceed the table value of ‘t’ at the 0.05 level of significance. Hence, the mean difference between the two groups is not significant. Consequently, the hypothesis “H12 : There is no significant difference between the mean Teaching Aptitude Score of Male and Female B.Ed. Students” is accepted. That means the observed difference between the mean teaching aptitude scores of male and female B.Ed. students is accidental and not a real one. Hence, male and female students are assumed to be homogeneous with respect to their mean teaching aptitude scores.
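The significance test can be re-run from the summary values in Table 4.12. The sketch below assumes the large-sample standard error of the difference between two independent means, sqrt(SD1²/N1 + SD2²/N2); the text does not state which variant of the t-test was used, so this choice and the function name are assumptions. With the rounded means and SDs the result comes out close to, though not exactly at, the reported 1.13.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t (critical ratio) for the difference between two independent means."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return abs(mean1 - mean2) / se_diff


# Male and female summary statistics from Table 4.12.
t_value = t_from_summary(73.10, 8.22, 325, 73.73, 8.63, 675)
print(f"t = {t_value:.2f}")
print("significant at 0.05" if t_value > 1.96 else "not significant at 0.05")
```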

The Mean teaching aptitude scores of male and female are

presented in the Graph 4.1

Graph 4.1

The Mean Teaching Aptitude Scores of Male and Female

From Graph 4.1 it can be clearly seen that there is only a small (not significant) difference in the mean teaching aptitude scores of male and female B.Ed. students. The difference is accidental and not a real one. Hence, it can be said that there is no gender difference in teaching aptitude scores.

Therefore, the investigator has not presented separate “Sex Norms” for B.Ed. students, as the two groups are assumed to be homogeneous.


Ordinarily, a frequency distribution conveys a picture of a situation only in numbers. For a quick and easy grasp, however, a pictorial presentation is helpful. A frequency distribution is presented pictorially by graphs, which are also useful in comparing the results with the normal curve. The frequency curve of the whole sample is presented in Graph 4.2.

GRAPH : 4.2

HISTOGRAM, FREQUENCY CURVE

FOR WHOLE SAMPLE (N = 1000)

Graph 4.2 shows the histogram and the frequency polygon of the whole sample of 1000 respondents on the same axes. The data have been shown in Table 4.11. Graph 4.2 shows that the frequency polygon is more skewed than the smoothed frequency polygon. The distribution towards both ends is normal. At the mean, the height of the frequency polygon is greater than that of the smoothed one. The frequency curve of the sample for males is presented in Graph 4.3.

[Graph 4.2: histogram with frequency curve for the whole sample. X-axis: Teaching Aptitude Test Scores (30 to 100); Y-axis: Frequency (0 to 100). Mean = 73.522, Std. Dev. = 8.4981, N = 1,000]

GRAPH : 4.3

HISTOGRAM, FREQUENCY CURVE FOR MALE

[Graph 4.3: histogram with frequency curve for male respondents. X-axis: Teaching Aptitude Test Scores (40 to 100); Y-axis: Frequency (0 to 50). Mean = 73.0954, Std. Dev. = 8.21697, N = 325]

The frequency curve of the sample for females is presented in Graph 4.4.

GRAPH : 4.4

HISTOGRAM, FREQUENCY CURVE FOR FEMALE

[Graph 4.4: histogram with frequency curve for female respondents. X-axis: Teaching Aptitude Test Scores (30 to 100); Y-axis: Frequency (0 to 100). Mean = 73.7274, Std. Dev. = 8.62866, N = 675]

Graph 4.3 and Graph 4.4 show the frequency polygons for the two sexes. Both show negative skewness. The frequency polygon for the female group is higher than that of the male group and is also more peaked.

The cumulative frequency distribution of the scores was obtained by cumulating the frequencies one by one for each class interval. The cumulative frequencies were then expressed as percentages for the whole sample and for each sex. The cumulative frequencies and cumulative percentage frequencies are shown in Table 4.13.
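The cumulation described above can be sketched as follows. The example uses the female frequencies of Table 4.13, entered from the lowest class interval upwards; the variable names are illustrative.

```python
from itertools import accumulate

# Female frequencies from Table 4.13, listed from the lowest class interval
# (36-40) to the highest (91-95).
intervals = ["36-40", "41-45", "46-50", "51-55", "56-60", "61-65",
             "66-70", "71-75", "76-80", "81-85", "86-90", "91-95"]
female_f = [1, 1, 6, 10, 27, 69, 118, 146, 147, 98, 43, 9]

cum_f = list(accumulate(female_f))  # running total from the bottom
n = cum_f[-1]                       # total number of cases
cum_pct = [round(100 * cf / n, 2) for cf in cum_f]

for interval, f, cf, pct in zip(intervals, female_f, cum_f, cum_pct):
    print(f"{interval}  F={f:3d}  CF={cf:3d}  %CF={pct:6.2f}")
```

Reading the printed rows from bottom to top gives the same order as the table, which lists the highest class interval first.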

TABLE 4.13

Frequency, Cumulated Frequency and Cumulated Percentage Frequency of Teaching Aptitude test scores for the Whole Sample and Sex

CLASS       MID-     WHOLE SAMPLE             MALE                     FEMALE
INTERVAL    POINT    F      CF      % CF      F      CF     % CF       F      CF     % CF
91-95       93       10     1000    100.00    1      325    100.00     9      675    100.00
86-90       88       54     983     98.30     11     317    97.54      43     666    98.67
81-85       83       144    899     89.90     46     276    84.92      98     623    92.30
76-80       78       228    717     71.70     81     192    59.08      147    525    77.78
71-75       73       210    511     51.10     64     133    40.92      146    378    56.00
66-70       68       188    290     29.00     70     58     17.85      118    232    34.37
61-65       63       95     142     14.20     26     28     8.62       69     114    16.89
56-60       58       43     58      5.80      16     13     4.00       27     45     6.67
51-55       53       18     23      2.30      8      5      1.54       10     18     2.67
46-50       48       8      10      1.00      2      2      0.62       6      8      1.19
41-45       43       1      3       0.30      0      1      0.31       1      2      0.30
36-40       38       1      1       0.10      0      0      0.00       1      1      0.15
TOTAL                1000                     325                      675


From Table 4.13 the differences in cumulative percentage frequencies between the two sexes can be seen. These values were used in the further calculations for establishing norms.

The cumulative percentage curve is shown in Graph 4.5 as follows :

GRAPH : 4.5

CUMULATIVE FREQUENCY GRAPH FOR WHOLE SAMPLE OF

TEACHING APTITUDE TEST

Graph 4.5 shows the cumulative frequency curve of the teaching aptitude scores for the whole sample. It is an ‘S’ shaped curve. The frequency, cumulative frequency and cumulative percentage frequency of the teaching aptitude scores for the whole sample and for each sex have been shown in Table 4.13.


A cumulative percentage curve has also been drawn for each sex by using the data given in Table 4.13 and is presented in Graph 4.6 as follows :

GRAPH : 4.6

CUMULATIVE FREQUENCY GRAPH FOR SEXES OF

TEACHING APTITUDE TEST

Graph 4.6 shows the cumulative frequency curves for the two sexes, i.e. male and female. These are also ‘S’ shaped curves, plotted from the data in Table 4.13.

Smoothed frequencies were calculated from the frequencies and are presented in Table 4.14 as follows :


TABLE 4.14

Frequency and Smoothed Frequency of Teaching Aptitude test scores for the Whole Sample and Sex

CLASS       MID-     WHOLE SAMPLE       MALE              FEMALE
INTERVAL    POINT    F      SF          F      SF         F      SF
96-100      -        0      3.33        0      0.33       0      3.00
91-95       93       10     21.33       1      4.00       9      17.33
86-90       88       54     69.33       11     19.33      43     50.00
81-85       83       144    142.00      46     46.00      98     96.00
76-80       78       228    194.00      81     63.67      147    130.33
71-75       73       210    208.67      64     71.67      146    137.00
66-70       68       188    164.33      70     53.33      118    111.00
61-65       63       95     108.67      26     37.33      69     71.33
56-60       58       43     52.00       16     16.67      27     35.33
51-55       53       18     23.00       8      8.67       10     14.33
46-50       48       8      9.00        2      3.33       6      5.67
41-45       43       1      3.33        0      0.67       1      2.67
36-40       38       1      0.67        0      0.00       1      0.67
31-35       -        0      0.33        0      0.00       0      0.33
TOTAL                1000               325               675

From Table 4.14 the differences between the frequencies and the smoothed frequencies for the two sexes and for the whole sample can be seen. The smoothed frequencies were computed for the further analysis of the data.
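The smoothed frequencies in Table 4.14 are consistent with a running average of three adjacent frequencies, with one empty class interval added at each end of the scale. The sketch below reproduces the whole-sample column under that assumption; the function name is illustrative.

```python
def smooth(frequencies):
    """Running average of three adjacent frequencies.

    Returns two more values than the input, because one empty interval is
    added below the lowest and above the highest class interval.
    """
    padded = [0, 0] + list(frequencies) + [0, 0]
    return [sum(padded[k:k + 3]) / 3 for k in range(len(frequencies) + 2)]


# Whole-sample frequencies from Table 4.14, lowest interval (36-40) first.
whole_f = [1, 1, 8, 18, 43, 95, 188, 210, 228, 144, 54, 10]
smoothed = [round(sf, 2) for sf in smooth(whole_f)]
print(smoothed)  # runs from interval 31-35 up to interval 96-100
```

Because every original frequency enters exactly three averages, the smoothed frequencies still add up to the sample size.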

After smoothing the frequencies, the smoothed frequency polygon for the whole sample was plotted and is shown in Graph 4.7 as follows :


GRAPH : 4.7

SMOOTHED FREQUENCY POLYGON FOR WHOLE SAMPLE

Graph 4.7 shows that the height of the frequency polygon is greater than that of the smoothed frequency polygon. Slightly more area is covered to the right of the mean in the graph. However, the distribution appears to be normal, as it tapers to the base line at both ends, i.e. at the low scores and the high scores.

After smoothing the frequencies, the smoothed frequency polygons for the two sexes were plotted and are shown in Graph 4.8 and Graph 4.9 respectively as follows :


GRAPH : 4.8

SMOOTHED FREQUENCY POLYGON FOR MALE

GRAPH : 4.9

SMOOTHED FREQUENCY POLYGON FOR FEMALE

Graph 4.8 and Graph 4.9 show that the height of the frequency polygon is greater than that of the smoothed frequency polygon for both sexes. A slight V shape can be seen in the frequency polygon for males near the mean, but after smoothing the shape became normal. The distribution for both sexes appears to be normal, as it tapers to the base line at both ends, i.e. at the low scores and the high scores.

In order to decide the nature of the distribution of test scores, it was essential to find out the percentile norms of the teaching aptitude scores. The percentile norms are given in Table 4.15.

The percentiles were computed by using the following formula¹⁰:

$$P_p = l + \left(\frac{pN - F}{f_p}\right) \times i$$

Where,

P_p = percentage of the distribution wanted,

l = exact lower limit of the class interval upon which P_p lies,

pN = part of N to be counted off in order to reach P_p,

F = sum of all scores upon intervals below l,

f_p = number of scores within the interval upon which P_p falls,

i = length of the class interval.
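The grouped-data percentile computation defined by these symbols can be sketched as below. The function name and argument layout are hypothetical; the frequencies are the whole-sample values of Table 4.11 with exact lower limits 35.5, 40.5, and so on. Under these assumptions the 10th, 25th, 75th and 90th percentiles agree with the whole-sample entries of Table 4.15.

```python
def grouped_percentile(p, lower_limits, frequencies, width):
    """Score below which p per cent of the cases lie, from grouped data.

    lower_limits are the exact lower limits of the class intervals in
    ascending order; frequencies are the matching counts.
    """
    n = sum(frequencies)
    target = p / 100 * n  # pN: number of cases to be counted off
    below = 0             # F: cases in the intervals below l
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and below + f >= target:
            return lower + (target - below) / f * width
        below += f
    return lower_limits[-1] + width  # fallback: top of the last interval


# Whole-sample frequencies from Table 4.11, intervals 36-40 ... 91-95.
lower_limits = [35.5 + 5 * k for k in range(12)]
whole_f = [1, 1, 8, 18, 43, 95, 188, 210, 228, 144, 54, 10]

for p in (10, 25, 75, 90):
    print(p, round(grouped_percentile(p, lower_limits, whole_f, 5), 2))
```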


TABLE 4.15

Percentile Norms for the Teaching Aptitude Test Scores

PERCENTILE    WHOLE SAMPLE    MALE    FEMALE

1 2 3 4

1 50.50 51.28 49.46

2 53.28 53.31 53.25

3 55.73 55.34 55.92

4 56.90 56.44 57.17

5 58.06 57.45 58.42

6 59.22 58.47 59.67

7 60.38 59.48 60.66

8 60.97 60.50 61.15

9 61.50 61.13 61.64

10 62.03 61.75 62.13

11 62.55 62.38 62.62

12 63.08 63.00 63.11

13 63.61 63.63 63.60

14 64.13 64.25 64.09

15 64.66 64.88 64.58

16 65.18 65.50 65.07

17 65.61 65.73 65.53

18 65.87 65.96 65.82

19 66.14 66.20 66.10

20 66.40 66.43 66.39

21 66.67 66.66 66.68

22 66.94 66.89 66.96

23 67.20 67.13 67.25

24 67.47 67.36 67.53

25 67.73 67.59 67.82

26 68.00 67.82 68.11

27 68.27 68.05 68.39

28 68.53 68.29 68.68

29 68.80 68.52 68.96

30 69.06 68.75 69.25


31 69.33 68.98 69.54

32 69.60 69.21 69.82

33 69.86 69.45 70.11

34 70.13 69.68 70.39

35 70.39 69.91 70.65

36 70.62 70.14 70.88

37 70.86 70.38 71.11

38 71.10 70.62 71.34

39 71.33 70.87 71.57

40 71.57 71.13 71.80

41 71.81 71.38 72.03

42 72.05 71.63 72.26

43 72.29 71.89 72.49

44 72.52 72.14 72.73

45 72.76 72.39 72.96

46 73.00 72.65 73.19

47 73.24 72.90 73.42

48 73.48 73.16 73.65

49 73.71 73.41 73.88

50 73.95 73.66 74.11

51 74.19 73.92 74.34

52 74.43 74.17 74.58

53 74.67 74.43 74.81

54 74.90 74.68 75.04

55 75.14 74.93 75.27

56 75.38 75.19 75.50

57 75.63 75.44 75.73

58 75.85 75.65 75.96

59 76.07 75.85 76.19

60 76.29 76.06 76.42

61 76.51 76.26 76.65

62 76.73 76.46 76.88

63 76.95 76.66 77.11

64 77.17 76.86 77.34

65 77.39 77.06 77.57

66 77.61 77.26 77.80


67 77.82 77.46 78.03

68 78.04 77.66 78.26

69 78.26 77.86 78.48

70 78.48 78.06 78.71

71 78.70 78.26 78.94

72 78.92 78.46 79.17

73 79.14 78.66 79.40

74 79.36 78.86 79.63

75 79.58 79.06 79.86

76 79.80 79.27 80.09

77 80.02 79.47 80.32

78 80.24 79.67 80.58

79 80.46 79.87 80.92

80 80.78 80.07 81.27

81 81.13 80.27 81.61

82 81.47 80.47 81.95

83 81.82 80.80 82.30

84 82.17 81.15 82.64

85 82.51 81.51 82.99

86 82.86 81.86 83.33

87 83.21 82.21 83.68

88 83.56 82.57 84.02

89 83.90 82.92 84.36

90 84.25 83.27 84.71

91 84.60 83.63 85.05

92 84.94 83.98 85.40

93 85.29 84.33 86.05

94 85.87 84.68 86.84

95 86.80 85.04 87.62

96 87.72 85.39 88.41

97 88.65 86.52 89.19

98 89.57 88.00 89.98

99 90.50 89.48 91.75

100 95.50 95.50 95.50


4.3.2 Deciding the Nature of the Distribution of Test Scores

If the test scores are distributed normally, we can assume that the

tool is satisfactory.

The following two procedures were used to study the distribution

of the aptitude test scores.

o Calculation of ‘Skewness’ and

o Calculation of ‘Kurtosis’.

o Calculation of Skewness of the distribution :

There are two different formulas for the calculation of skewness. Skewness was calculated using both these formulas. The formulae are as follows:¹¹

(i) $SK = \frac{3(\text{Mean} - \text{Median})}{\sigma}$ … (A)

(ii) $SK = \frac{P_{90} + P_{10}}{2} - P_{50}$ … (B)

o Calculation of SK by formula ‘A’ :

Mean = 73.52

Median = 74.00 * from Table 4.11

SD = 8.50

∴ $SK = \frac{3(73.52 - 74.00)}{8.50} = \frac{3(-0.48)}{8.50} = -0.169$


The value obtained indicates a slight negative skewness.

o Calculation of SK by formula ‘B’ :

P90 = 84.71

P10 = 62.13 * from Table 4.15 (FUNCTION: PERCENTILE)

P50 = 74.11

∴ $SK = \frac{84.71 + 62.13}{2} - 74.11 = 73.42 - 74.11 = -0.69$

The value of skewness obtained by this formula also

indicated a negative skewness.

o Significance of Skewness :

To determine whether the obtained skewness is significant, the standard error of skewness should be known. The formula¹² used for the calculation of the SE of Sk is given below:

$\sigma_{SK} = \frac{0.5185\,D}{\sqrt{N}}$ * where D = P90 – P10

$= \frac{0.5185}{\sqrt{1000}} \times (84.71 - 62.13)$

= 0.016 x 22.58

= 0.361


The deviation of our measure of skewness from ‘0’ skewness is -0.69.

$CR = \frac{SK}{\sigma_{SK}} = \frac{-0.69}{0.361} = -1.91$

The CR (-1.91) does not exceed, in absolute value, the table value of 1.96 at the 0.05 level or 2.58 at the 0.01 level of significance. Hence, the obtained skewness of -0.69 represents no real deviation of this frequency distribution from normality.
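The skewness calculations can be retraced in a few lines. The sketch below uses the values quoted in the text and assumes the percentile-based standard error 0.5185 × (P90 - P10) / √N, which is consistent with the factor 0.016 used above for N = 1000. Carrying full precision instead of the rounded factor gives a slightly different standard error and critical ratio than the hand calculation, but the conclusion at the 0.05 level is the same.

```python
import math

mean, median, sd, n = 73.52, 74.00, 8.50, 1000  # Table 4.11
p10, p50, p90 = 62.13, 74.11, 84.71             # values used in the text

sk_a = 3 * (mean - median) / sd  # formula (A)
sk_b = (p90 + p10) / 2 - p50     # formula (B)

se_sk = 0.5185 * (p90 - p10) / math.sqrt(n)  # SE of the formula (B) skewness
cr = sk_b / se_sk                            # critical ratio

print(f"SK(A) = {sk_a:.3f}, SK(B) = {sk_b:.2f}")
print(f"SE = {se_sk:.3f}, CR = {cr:.2f}")
print("within +/-1.96" if abs(cr) < 1.96 else "outside +/-1.96")
```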

o Calculation of Kurtosis of the Distribution

The following formula¹³ was used for the calculation of Kurtosis:

$Ku = \frac{Q}{P_{90} - P_{10}}$ where Q = quartile deviation (5.50, from Table 4.11)

= 5.50/22.58

= 0.244

The Kurtosis of the frequency distribution is, thus, 0.244.

The Ku value deviates by −0.019 from 0.263, the Ku value of the

normal distribution. The negative direction of the deviation

indicates that the distribution is leptokurtic.

o Significance of Kurtosis

To estimate the significance of the deviation of the Ku thus obtained from the Ku of the normal curve, the SE of Ku is calculated by the following formula¹⁴:

$\sigma_{Ku} = \frac{0.27779}{\sqrt{N}} = \frac{0.27779}{\sqrt{1000}} = 0.009$

And the $CR = \frac{D}{\sigma_{Ku}} = \frac{-0.019}{0.009} = -2.11$, where D = deviation of the Ku of the obtained distribution from the Ku (0.263) of the normal distribution.

The CR (-2.11) falls outside the ± 1.96 limits which determine the 0.05 level of significance, but within the ± 2.58 limits which determine the 0.01 level. Hence, at the 0.01 level, 0.244 represents no real deviation of this frequency distribution from normality.
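The kurtosis check can be retraced in the same way. The sketch assumes the percentile measure Ku = Q / (P90 - P10) and the standard error 0.27779 / √N, which match the figures 0.244 and 0.009 reported above. Because it keeps full precision, the critical ratio differs slightly from the hand calculation.

```python
import math

q = 5.50                 # quartile deviation, Table 4.11
p10, p90 = 62.13, 84.71  # percentiles used in the text
n = 1000

ku = q / (p90 - p10)            # percentile measure of kurtosis
deviation = ku - 0.263          # 0.263 is the Ku of the normal curve
se_ku = 0.27779 / math.sqrt(n)  # standard error of Ku
cr = deviation / se_ku          # critical ratio

print(f"Ku = {ku:.3f}, deviation = {deviation:.3f}")
print(f"SE = {se_ku:.4f}, CR = {cr:.2f}")
```

A Ku below 0.263 gives a negative deviation, which is the leptokurtic direction described in the text.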

4.3.3 Standard Score and T scores

For aptitude tests, percentile norms, standard scores and T-score norms are generally established. Therefore, percentile norms, standard scores and T-scores have also been established for the present test and are reported in this chapter.

The percentile norms for male and female respondents are presented in Table 4.15.

(A) Standard Score Norms

The raw scores obtained on the test were converted into standard scores with the help of the following formula¹⁰. The shift from raw to standard scores requires a linear transformation. This transformation does not change the shape of the distribution in any way.

The formula¹⁵ for conversion is,

$X' = \frac{\sigma'}{\sigma}(X - M) + M'$

Where,

X' = Standard Score

X = Raw Score

M and M' = Means of the raw score and standard score distributions

σ and σ' = Standard deviations of the raw score and standard score distributions

The raw scores on the present aptitude test are expressed as standard scores in a distribution with M = 100 and σ = 20 as well as in a distribution with M = 50 and σ = 10. The standard scores obtained are given in Table 4.16 with their corresponding raw scores.
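The linear conversion of raw scores to standard scores can be sketched as follows. The function name is hypothetical, and the raw-score mean and SD are the whole-sample values of Table 4.11; the printed entries of Table 4.16 may differ slightly if other rounded values of the mean and SD were used when the table was prepared.

```python
def to_standard(raw, raw_mean, raw_sd, new_mean, new_sd):
    """Linear conversion of a raw score to a scale with new_mean and new_sd."""
    return new_sd / raw_sd * (raw - raw_mean) + new_mean


RAW_MEAN, RAW_SD = 73.52, 8.50  # whole-sample values from Table 4.11

for raw in (65, 73.52, 82):
    s100 = to_standard(raw, RAW_MEAN, RAW_SD, 100, 20)
    s50 = to_standard(raw, RAW_MEAN, RAW_SD, 50, 10)
    print(f"raw {raw}: M=100 scale {s100:.1f}, M=50 scale {s50:.1f}")
```

A raw score equal to the mean always maps to the new mean, and a raw score one SD above the mean maps to the new mean plus the new SD, which is a quick way to check any such conversion.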

(B) The T-score Norms

The procedure suggested by Garrett [11] (1969) for calculating T-scores has been followed in toto. A model worksheet table was prepared for the calculation of T-scores. The calculation of T-scores has been made on the basis of data arranged in the form given below. The worksheet illustrates only how the T-scores have been calculated and not the details.

Columns: (1) Test score; (2) Frequency; (3) Cum. f; (4) Cum. f below score + 1/2 on given score; (5) Col. (4) in %; (6) T-scores

1 2 3 4 5 6

The T-scores are given in Table 4.16 along with their corresponding raw scores and standard scores.
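The worksheet columns above translate directly into a small computation. The following Python sketch assumes the usual normalized T-scale (mean 50, S.D. 10) and uses a hypothetical frequency table; the function name `t_scores` is illustrative and not part of the original study.

```python
from statistics import NormalDist

def t_scores(freq):
    """freq maps each raw score to its frequency; returns raw score -> T-score."""
    n = sum(freq.values())
    result = {}
    cum_below = 0  # cumulative frequency below the current score (col. 3)
    for score in sorted(freq):
        f = freq[score]
        midpoint_cum = cum_below + f / 2          # col. 4: below + half on score
        proportion = midpoint_cum / n             # col. 5 as a proportion
        z = NormalDist().inv_cdf(proportion)      # normal deviate for that area
        result[score] = round(50 + 10 * z)        # col. 6: T = 50 + 10z
        cum_below += f
    return result

# Hypothetical frequency distribution of four examinees.
example = t_scores({1: 1, 2: 2, 3: 1})
print(example)
```

Scores with no cases below or above them still receive a finite T-score, because half of the frequency on the score itself is always counted.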

TABLE 4.16

Raw Scores and their Corresponding Standard Scores and T-Scores

Columns: (1) Raw scores; (2) Standard scores, M=100, SD=20; (3) Standard scores, M=50, SD=10; (4) T-scores

1 2 3 4

1 -81 -41 -

2 -79 -39 -

3 -76 -38 -

4 -74 -37 -

5 -71 -36 -

6 -69 -34 -

7 -66 -33 -

8 -64 -32 -

9 -61 -31 -

10 -59 -29 -

11 -56 -28 -

12 -54 -27 -

13 -51 -26 -

14 -49 -24 -

15 -46 -23 -

16 -44 -22 -

17 -41 -21 -

18 -39 -19 -

19 -36 -18 -

20 -34 -17 -

21 -31 -16 -

22 -29 -14 -

23 -26 -13 -

24 -24 -12 -

25 -21 -11 -

26 -19 -9 -

27 -16 -8 -

28 -14 -7 -

29 -11 -6 -

30 -9 -4 -

31 -6 -3 -

32 -4 -2 -

33 -1 -1 -

34 1 1 -

35 4 2 -

36 6 3 -


37 9 4 -

38 11 6 -

39 14 7 12

40 16 8 19

41 19 9 19

42 21 11 19

43 24 12 21

44 26 13 22

45 29 14 22

46 31 16 22

47 34 17 23

48 36 18 24

49 39 19 25

50 41 21 27

51 44 22 28

52 46 23 29

53 49 24 30

54 51 26 30

55 54 27 31

56 56 28 32

57 59 29 33

58 61 31 34

59 64 32 35

60 66 33 36

61 69 34 36

62 71 36 37

63 74 37 38

64 76 38 39

65 79 39 40

66 81 41 41

67 84 42 42

68 86 43 43

69 89 44 45

70 91 46 46

71 94 47 47

72 96 48 48

73 99 49 49

74 101 51 50

75 104 52 51

76 106 53 52

77 109 54 53

78 111 56 55

79 114 57 56


80 116 58 57

81 119 59 58

82 121 61 60

83 124 62 61

84 126 63 62

85 129 64 64

86 131 66 65

87 134 67 67

88 136 68 68

89 139 69 70

90 141 71 72

91 144 72 73

92 146 73 74

93 149 74 75

94 151 76 78

95 154 77 -

96 156 78 -

97 159 79 -

98 161 81 -

99 164 82 -

100 166 83 -

101 169 84 -

102 171 86 -

103 174 87 -

104 176 88 -

105 179 89 -

106 181 91 -

107 184 92 -

108 186 93 -

109 189 94 -

110 191 96 -

111 194 97 -

112 196 98 -

113 199 99 -

114 201 101 -

115 204 102 -

116 206 103 -

117 209 104 -

118 211 106 -


4.3.4 Percentile Rank :

The procedure followed in computing a percentile rank is the reverse of the procedure for calculating a percentile. In calculating a percentile, we start with a certain percent of N and count into the distribution the given percent; the point reached is the required percentile. Here, we begin with an individual score and determine the percentage of scores which lie below it. Percentile ranks corresponding to the raw scores have been calculated by the procedure suggested by Garrett (1969) [16]. These percentile ranks are shown in Table 4.17.
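As an illustration of the percentile-rank procedure for grouped data, the sketch below interpolates within the class interval that contains the score. The class intervals and frequencies are hypothetical, and the function name `percentile_rank` is an assumption made for this example.

```python
def percentile_rank(score, intervals):
    """intervals: ascending list of (lower, upper, frequency) with integer limits."""
    n = sum(f for _, _, f in intervals)
    cum_below = 0  # cases below the exact lower limit of the current interval
    for lower, upper, f in intervals:
        exact_lower = lower - 0.5
        exact_upper = upper + 0.5
        if exact_lower <= score < exact_upper:
            width = exact_upper - exact_lower
            # Share of the interval's cases assumed to lie below the score.
            within = (score - exact_lower) / width * f
            return 100 * (cum_below + within) / n
        cum_below += f
    raise ValueError("score lies outside the tabulated intervals")

# Hypothetical distribution of 10 cases in three class intervals.
table = [(1, 5, 2), (6, 10, 6), (11, 15, 2)]
pr_8 = percentile_rank(8, table)
pr_15 = percentile_rank(15, table)
print(round(pr_8, 2), round(pr_15, 2))
```

The interpolation assumes that cases are spread evenly across each class interval, which is the usual assumption for grouped frequency distributions.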

TABLE 4.17

Percentile Rank of the Raw Scores for Teaching Aptitude Test

RAW SCORE PERCENTILE RANK RAW SCORE PERCENTILE RANK

95 99.90 65 15.65

94 99.70 64 13.75

93 99.50 63 11.85

92 99.30 62 09.95

91 99.10 61 08.05

90 98.46 60 06.67

89 97.38 59 05.81

88 96.30 58 04.95

87 95.22 57 04.09

86 94.14 56 03.23

85 92.16 55 02.62

84 89.28 54 02.26

83 86.40 53 01.90

82 83.52 52 01.54

81 80.64 51 01.18

80 76.92 50 0.92

79 72.36 49 0.76

78 67.80 48 0.60

77 63.24 47 0.44

76 58.68 46 0.28

75 54.30 45 0.19

74 50.10 44 0.17

73 45.90 43 0.15

72 41.70 42 0.13

71 37.50 41 0.11

70 33.52 40 0.09

69 29.76 39 0.07

68 26.00 38 0.05

67 22.24 37 0.03

66 18.48 36 0.01

o FOLLOW-UP STUDIES TO DETERMINE THE PREDICTIVE VALUE OF THE TEST BATTERY IN SELECTION AND IN VOCATIONAL GUIDANCE

4.4 Reliability of the Test :

4.4.1 The Concept of Reliability :

No matter how carefully the test has been planned and

prepared, its merits should be established. Reliability and validity

are the essential qualities of a good test. It is, therefore, necessary

as a final check to study reliability and validity of the test.

According to Anastasi, any measuring device whatsoever must

fulfil conditions like reliability and validity if it is to be of any

service. In assessing value of a test, one should consider its

validity, reliability and usability. After estimating the norms on the

basis of the final results, it is essential for the test constructor to

obtain final evidence of reliability and validity of the test, to

establish its merits.

The concept of reliability has been defined by several authors in the field. Their definitions are given below :

According to Robert Lado [17],

“If the scores of the students are stable the test is reliable, if the scores tend to fluctuate for no apparent reason the test is unreliable.”

Mehrens [18] says,

“Reliability can be defined as the degree of consistency between two measures of the same thing.”

Anastasi [19] states,

“Reliability refers to the consistency of scores obtained by the same individuals when re-examined with the same test on different occasions,”

According to Freeman [20],

“Scores for the same individuals obtained on repeated testing are not completely stable. Not only are there to be some different chance determinants in operation at different time, but it is quite normal for a human being to vary in performance, generally within fairly narrow limit, from one occasion to another.”

4.4.2 Methods of Establishing Reliability :

Reliability is purely a statistical concept. For establishing the reliability of the present test, the following methods have been used:

o Test-Retest method

o Split-half Method

o Kuder and Richardson Method

4.4.2.1 Test-Retest Method :

This method involves,

o Obtaining repeated measures of the same ability for the same individuals, who are given the same test twice.

o Computation of the correlation between the first and the second set of scores.

The correlation coefficient thus obtained reflects the extent to which factors like practice, confidence, growth and physical facilities play a role on the two different occasions of test administration. To counteract the effects of these variables, a fairly large sample is needed. Besides, the time interval should be adequate.

For the present study, the reliability sample selected for this method consisted of 100 B.Ed. students. They were retested after an interval of about 5 to 6 months from the date of the first test. The scores obtained on the two administrations of the same test were used as two sets of scores for finding the correlation coefficient. The correlation coefficient was computed by Pearson’s product-moment method on the basis of the data furnished in the scatter diagram in Table 4.18.
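For readers who wish to reproduce a test-retest coefficient from ungrouped scores, a minimal Python sketch of the product-moment correlation is given here. It works on paired raw scores rather than on a scatter diagram, and the five pairs of scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of five students on two administrations of a test.
first = [62, 70, 75, 81, 90]
second = [65, 68, 78, 80, 88]
r = pearson_r(first, second)
print(round(r, 3))
```

With grouped data, as in a scatter diagram, the same coefficient is obtained from the class midpoints weighted by the cell frequencies, apart from a small grouping error.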

TABLE 4.18

The Scatter Diagram of Scores obtained by the B.Ed. Students for Teaching Aptitude Test on Two Successive Administrations of the Test

Columns: SECOND ADMINISTRATION; rows: FIRST ADMINISTRATION

Class 36-40 41-45 46-50 51-55 56-60 61-65 66-70 71-75 76-80 81-85 86-90 91-95 FY

91-95 1 1 2

86-90 1 4 5

81-85 3 4 10 17

76-80 5 5 9 19

71-75 1 2 3 19 25

66-70 1 1 5 5 5 17

61-65 1 2 3 5 11

56-60 2 2

51-55 1 1

46-50 0

41-45 1 1

36-40 0

FX 0 0 1 0 3 6 10 16 38 20 5 1 100

r = 0.7717; test-retest reliability ‘r’ = 0.77

SEmean = 7.6733

SEr = 0.0407

4.4.2.2 Split-half Method :

This is the most widely used method of establishing the reliability of a test, because the parallel form method as well as the test-retest method have certain limitations. To overcome the limitations of these methods, the split-half method [21] is popularly used.

This method involves splitting the whole test into two reasonably equivalent halves. To make two equivalent halves, the odd-numbered items are usually pooled for one set of scores and the even-numbered items for the other. This method is preferred because it reasonably controls such factors as practice, fatigue, distraction and mental set.

For this purpose, a sample of 100 B.Ed. students out of the total sample of 1000 respondents was selected for applying the ‘split-half’ method to estimate the reliability of the whole test. The scores made by the B.Ed. students on the odd-numbered and even-numbered items were found and split into two parts. The correlation between the scores on odd- and even-numbered items was then computed by using Pearson’s product-moment method.

Table 4.19 shows the scatter diagram and the split-half reliability of the test.

TABLE 4.19

The Scatter Diagram of Scores obtained by the B.Ed. Students on Odd and Even Numbered Items of Teaching Aptitude Test

Columns: SCORES ON EVEN NUMBERED ITEMS; rows: SCORES ON ODD NUMBERED ITEMS

CLASS 15-19 20-24 25-29 30-34 35-39 40-44 45-49 fy

45-49 0 1 1 2

40-44 18 10 28

35-39 2 7 20 11 40

30-34 4 7 11 5 27

25-29 1 1 2

20-24 1 1

15-19 0

FX 1 1 6 15 49 27 1 100

r = 0.763 Half test reliability r = 0.76

SEmean = 4.336

SEr = 0.0296

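From the half-test reliability, the reliability of the whole test was computed by using the Spearman-Brown formula [22], in which $r_{1.2}$ is the correlation between the two halves:

$$r_{tt} = \frac{2 r_{1.2}}{1 + r_{1.2}} = \frac{2 \times 0.763}{1 + 0.763} = \frac{1.526}{1.763} = 0.866$$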
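The odd-even split and the Spearman-Brown step-up can be sketched as follows. The helper names are hypothetical, and the response matrix is assumed to hold one list of 0/1 item scores per examinee; only the half-test coefficient 0.763 comes from the text.

```python
def odd_even_totals(responses):
    """Return (odd-item totals, even-item totals) for each examinee."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return odd, even

def spearman_brown(r_half):
    """Step up a half-test correlation to the reliability of the whole test."""
    return 2 * r_half / (1 + r_half)

# Half-test reliability reported in the text.
whole_test = spearman_brown(0.763)
print(round(whole_test, 3))

# Hypothetical responses of two examinees to a four-item test.
odd, even = odd_even_totals([[1, 0, 1, 1], [0, 1, 1, 0]])
print(odd, even)
```

The two lists returned by `odd_even_totals` are the two sets of scores that are correlated to obtain the half-test coefficient.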

The reliability of the whole test is high. This indicates that the test is quite a reliable tool for measuring the Teaching Aptitude of B.Ed. students.

The P.E. of the ‘r’ (0.87) was found as under,

SEmean = 4.336

P.E. ‘r’ = 0.042

4.4.2.3 Method of rational Equivalence :

This method was developed by Kuder and Richardson. It is also known as the K-R method. This method is useful for estimating the internal consistency or homogeneity of a test/scale. The Kuder-Richardson formula was developed because of dissatisfaction with split-half methods. A scale can be split into two equal halves in a great many ways, and each split might yield a somewhat different estimate of rtt. The use of item statistics gets away from such biases as may arise from arbitrary splitting into halves. Finally, the most accurate and practical formula was developed as follows [23]:

$$r_{tt} = \frac{n}{n - 1} \times \frac{\sigma_t^2 - \sum pq}{\sigma_t^2}$$

in which $r_{tt}$ = reliability coefficient of the whole test,

$n$ = number of items in the test,

$\sigma_t$ = the S.D. of the test scores,

$p$ = the proportion of the group answering a test item correctly,

$q$ = (1 - p) = the proportion of the group answering a test item incorrectly.

This formula is called Kuder and Richardson’s formula 20. It is applicable only to tests in which the items are scored by giving one point if answered correctly and nothing if not answered correctly.
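A compact Python sketch of K-R formula 20 is given below for illustration. It assumes a matrix of 0/1 item scores (one row per examinee) and computes the variance of total scores with N in the denominator; the function name `kr20` and the small data set are hypothetical.

```python
def kr20(responses):
    """responses: one list of 0/1 item scores per examinee."""
    n_people = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    # Variance of the total scores (sigma_t squared), with N in the denominator.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        sum_pq += p * (1 - p)                            # p times q for the item
    return (n_items / (n_items - 1)) * (var_total - sum_pq) / var_total

# Hypothetical responses of four examinees to a three-item test.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(data)
print(round(reliability, 2))
```

In this small example the total-score variance is 1.25 and the sum of pq is 0.625, so the coefficient is 1.5 × 0.5 = 0.75.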

With regard to the above discussion, formula K-R 20 was applied to estimate the reliability of the present test. For this, 100 answer sheets were randomly selected, maintaining all the major characteristics of the whole sample.

The proportion of the group answering a test item correctly, ‘p’, was found for each of the 118 test items. From these values of ‘p’, the corresponding values of ‘q’ were also found.

In Table 4.20, the value of ‘pq’ for each item is given. The sum of all pq values is equal to 21.176.

TABLE 4.20

Showing ‘pq’ Values of 118 Test Items

ITEM NO. ‘PQ’ ITEM NO. ‘PQ’ ITEM NO. ‘PQ’

1 0.188 41 0.230 81 0.113

2 0.148 42 0.248 82 0.148

3 0.148 43 0.202 83 0.192

4 0.082 44 0.245 84 0.245

5 0.074 45 0.250 85 0.248

6 0.166 46 0.236 86 0.236

7 0.074 47 0.245 87 0.248

8 0.202 48 0.240 88 0.250

9 0.233 49 0.236 89 0.248

10 0.182 50 0.245 90 0.236

11 0.166 51 0.228 91 0.228

12 0.134 52 0.192 92 0.224

13 0.148 53 0.160 93 0.240

14 0.221 54 0.236 94 0.248

15 0.210 55 0.192 95 0.249

16 0.177 56 0.240 96 0.242

17 0.250 57 0.250 97 0.245

18 0.141 58 0.206 98 0.249

19 0.182 59 0.218 99 0.249

20 0.202 60 0.210 100 0.242

21 0.238 61 0.206 101 0.246

22 0.242 62 0.230 102 0.228

23 0.134 63 0.233 103 0.090

24 0.230 64 0.197 104 0.141

25 0.210 65 0.246 105 0.192

26 0.210 66 0.210 106 0.246

27 0.233 67 0.230 107 0.236

28 0.228 68 0.202 108 0.245

29 0.233 69 0.248 109 0.218

30 0.246 70 0.248 110 0.188

31 0.214 71 0.224 111 0.233

32 0.188 72 0.233 112 0.250

33 0.128 73 0.202 113 0.202

34 0.166 74 0.172 114 0.236

35 0.233 75 0.218 115 0.250

36 0.233 76 0.221 116 0.236

37 0.228 77 0.221 117 0.238

38 0.250 78 0.245 118 0.236

39 0.218 79 0.230

40 0.177 80 0.210 ‘pq’ Total : 21.176

After analysing the responses of the B.Ed. students, the sum of ‘pq’ for the whole test was found. This value was substituted in the following formula:

$$r_{tt} = \frac{118}{117} \times \frac{\sigma_t^2 - 21.176}{\sigma_t^2} = 1.008 \times \frac{\sigma_t^2 - 21.176}{\sigma_t^2} = 0.81$$

The reliability coefficient of the present aptitude test as measured by the K-R method is, therefore, 0.81. Hence, it is concluded that the test is highly reliable.

TABLE 4.21

A Comparison of the Reliability Coefficient of the Present Test with Four Other Aptitude Tests

METHOD APPLIED    SHAH ‘r’    SHRIVASTAV ‘r’    PANDYA ‘r’    UPADHYAYA ‘r’    PRESENT TEST ‘r’
Test-Retest       -           0.90              0.50          0.77             0.77
Split-half        0.88        -                 0.63          0.81             0.76
K-R Formula-20    0.80        -                 -             -                0.81

Table 4.21 above compares the reliability coefficients of the present test with those of other aptitude tests. Comparatively high reliability can be seen in the test standardized by the researcher.

4.4.2.4 Reliability in terms of True Scores and Measurement

Errors :

(a) Reliability co-efficient as a measure of true variance :

The variance of the obtained scores can be divided into two parts: the variance of the true scores and the variance of chance errors.

The reliability of the present test (K-R method) is 0.81. Therefore, 81 percent of the variance of the test scores is true variance and only 19 percent is error variance.

(b) Estimating true scores by way of the regression equation and the reliability coefficient

The regression equation [24] which estimates the true score is given below :

$$\bar{X}_\infty = r_{tt} X_1 + (1 - r_{tt}) M_1$$

where $\bar{X}_\infty$ = estimated true score on the test,

$X_1$ = obtained score on the test,

$M_1$ = mean of the test distribution (73.36),

$r_{tt}$ = reliability coefficient of the test (0.81).

The regression equation for estimating the true score on the present test, worked out for an obtained score of 94, is as under :

$$\bar{X}_\infty = (0.81 \times 94) + (0.19 \times 73.36) = 76.14 + 13.94 = 90.08$$

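The standard error of an estimated true score is given by the following formula [25]:

$$SE_\infty = \sigma_t \sqrt{r_{tt} - r_{tt}^2}$$

where $\sigma_t$ = 10.67 (the S.D. of the test scores) and $r_{tt}$ = 0.81.

The $SE_\infty$ of the true score on the present test is calculated below :

$$SE_\infty = 10.67 \times \sqrt{0.81 - 0.6561} = 10.67 \times 0.3923 = 4.19$$

The 0.95 confidence interval is

$$\bar{X}_\infty \pm 1.96 \times 4.19, \text{ i.e. } \bar{X}_\infty \pm 8.2$$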

(c) The index of reliability

The correlation between a set of obtained scores and their corresponding true counterparts was found by computing the index of reliability. For this, the following formula [26] was employed:

$$r_{1\infty} = \sqrt{r_{tt}}$$

where $r_{1\infty}$ = the correlation between obtained and true scores,

$r_{tt}$ = the reliability coefficient of the test.

The coefficient $r_{1\infty}$ is called the index of reliability.

The index of reliability for the present test is :

$$r_{1\infty} = \sqrt{0.81} = 0.90$$

Thus, 0.90 is the maximum correlation which the test is capable of yielding in its present form.
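The three quantities discussed in (b) and (c) can be illustrated with a short Python sketch. It assumes the standard classical-test-theory forms (estimated true score = r·X + (1 − r)·M, standard error = σ·√(r − r²), index of reliability = √r). The function names are hypothetical, and the example values (reliability 0.75, mean 80, S.D. 8, obtained score 86) are illustrative rather than taken from the present test.

```python
from math import sqrt

def estimated_true_score(obtained, mean, r_tt):
    """Regression estimate of the true score from an obtained score."""
    return r_tt * obtained + (1 - r_tt) * mean

def se_true_score(sd, r_tt):
    """Standard error of an estimated true score (assumed standard form)."""
    return sd * sqrt(r_tt - r_tt ** 2)

def index_of_reliability(r_tt):
    """Correlation between obtained and true scores."""
    return sqrt(r_tt)

# Hypothetical test: reliability 0.75, mean 80, S.D. 8, obtained score 86.
true_estimate = estimated_true_score(86, 80, 0.75)
se = se_true_score(8, 0.75)
low, high = true_estimate - 1.96 * se, true_estimate + 1.96 * se

print(true_estimate, round(se, 2), round(low, 1), round(high, 1))
print(round(index_of_reliability(0.81), 2))
```

The estimated true score always lies between the obtained score and the group mean, and it moves closer to the mean as the reliability coefficient falls.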

4.5 Validity of the Test :

Tests should be held suspect until worth is proved, because the

tests that are supposed to measure intelligence, teaching aptitude,

adjustment etc. may not measure those characteristics at all. Edward E.

Cureton says, “The essential question of test validity is how well a test

does the job it is employed to do. The same test may be used for several

different purposes, and its validity may be high for one, moderate for

another and low for the third. Hence, we cannot label the validity of a test

as ‘high’ or ‘moderate’ or ‘low’ except for some particular purpose.”

Therefore, before a test can be used it is necessary to make certain that the purpose of the test is justified. This leads us to the subject of test validation. Validation of a test is the most essential part and the crux of the process of standardization of any test, and it is also the most important criterion for judging whether the test is good or poor. To justify the validity of the test, it is necessary to clarify the concept of the term ‘validity’.

4.5.1 The Concept of Validity :

Validity is an important characteristic of a test. The validity of a test depends upon the efficiency with which it measures what it attempts to measure. It is also defined as the accuracy with which the test measures what it claims to measure. The validity and the purpose of the test are closely associated. A test is valid when it fulfils the purpose for which it was designed. A test designed to test intelligence should measure only intelligence and not any other thing. Similarly, a test of teaching aptitude should measure only teaching aptitude and not any other thing such as intelligence or expression. Therefore, in the case of a valid teaching aptitude test, the B.Ed. students who are more capable for the teaching profession should get higher scores than those who are weak. Freeman defined it as an essential index: “An index of validity shows the degree to which a test measures what it proposes to measure when compared with accepted criteria.” [27]

This suggests that for validating the test, it must be compared with some accepted standards or other criteria which are regarded by experts as the best evidence of the traits or ability to be measured by the test. Therefore, the selection of validation criteria is of prime importance in the process of test validation.

4.5.2 Methods for Determining Validity :

Fundamentally, all procedures for determining test validity are concerned with the relationship between performance on the test and other independently observable facts about the behaviour characteristic under consideration. The techniques that are employed for investigating these relationships are numerous and have been described by various names. “The APA Technical Recommendations (i) classified these procedures under four categories, defined as content, predictive, concurrent and construct validity.” [28]

Of these four categories of validity, two, namely content and construct (or concept) validity, are described under the heading of rational validity by many authors. Similarly, concurrent, predictive and congruent validity are described under the heading of empirical or statistical validity. In these methods the validity is estimated by means of statistical techniques.

The validity that has been established for the present test is predictive validity.

(A) Content Validity :

Content validity has been decided on two grounds: expert opinion and validity index. The construction of the test and the selection of items have already been discussed at length in Chapter 3 of this report. As for the validity index, items showing a validity index of more than 0.20 have been selected and those below that have been rejected.

Concept validity has not been taken into account, since this test does not aspire to ascertain or test any concept. Since there is no other aptitude test of this type in Gujarati, congruent validity of the test is out of the question. Concurrent validity has also not been taken into account.

(B) Predictive Validity :

The coefficient of validity of a test is the coefficient of correlation between test scores and criterion scores. The external criterion for this test has been the scores achieved at the University examination of the B.Ed. degree.

The B.Ed. examination is held in two parts: Part I – Theory (700 marks) and Part II – Practical including Annual Lesson (500 marks). For the present study the final percentage, i.e. the total of Part I and Part II, is considered as the Academic Achievement Score.

It is possible that not all the B.Ed. students who had taken the test under report took their University examination, or that some of them passed it only partly. With this possibility in view, 20 percent of the B.Ed. students who had fully passed the final examination and had taken the test under report were selected from each institution as the validity sample.

The raw scores obtained by different respondents from different colleges were converted into percentages before they were correlated with the test scores.

The two sets of scores were arranged in the form of a scatter diagram as shown in Table 4.22, and the product-moment coefficient of correlation was calculated.

TABLE 4.22

The Scatter Diagram of Scores obtained by the B.Ed. Students on Academic Achievement Scores and Teaching Aptitude Test Scores

Columns: ACADEMIC ACHIEVEMENT SCORES (X-VARIABLE); rows: TEACHING APTITUDE TEST SCORES (Y-VARIABLE)

Class 36-40 41-45 46-50 51-55 56-60 61-65 66-70 71-75 76-80 81-85 86-90 91-95 FY

91-95 0

86-90 5 5

81-85 4 1 5

76-80 10 4 4 18

71-75 4 10 5 19

66-70 10 5 6 21

61-65 9 3 1 13

56-60 4 4 4 12

51-55 2 1 3

46-50 1 1 2

41-45 1 1

36-40 1 1

FX 0 0 0 0 1 8 29 32 20 10 0 0 100

Pearson’s Product moment ‘r’ = 0.778

P.E.r = ± 0.026

The Teaching Aptitude Test predicts the criterion significantly well, because the correlation between the test scores and the academic achievement scores is 0.778. This means the present test is a valid instrument for predicting Teaching Aptitude.

4.6 Conclusion :

In this chapter the researcher has presented the process of standardization of the tool, the Teaching Aptitude Test. The reliability and validity calculated for the standardized tool were high. The establishment of norms was the second most important part of this chapter. To standardize the test, the researcher administered the tool in three steps: Pre-Pilot, Pilot and Final test. The administration and results of the Pre-Pilot Test, Pilot Test and Final Test of the Teaching Aptitude Test were also discussed in the beginning of the chapter. The analysis and interpretation of the data, discussed in the following chapter, would support the norms established for the standardized tool.

REFERENCES

1. Garrett, H.E. (1968). General Psychology (Second Edition). New Delhi : Eurasia Publishing House (Pvt.) Ltd., Ram Nagar. PP. 486-488

2. Anne Anastasi. (1970). Psychological Testing. New Delhi :

Macmillan Pub.

3. Guilford, J.P. (1956). Fundamental Statistics in Psychology

and Education (Fourth Edition). New York : McGraw

Hill Book Co.

4. Green J. Gerberich. (1957). Measurement and Evaluation in

Secondary School. New York : Longmans Green and

Co.P.93

5. Ibid. P.90

6. Summer, S.P. (1954). Statistics in Education. London : Basil Blackwell and Co.

7. Garrett, H.E. (1968). General Psychology (Second Edition). New Delhi : Eurasia Publishing House (Pvt.) Ltd., Ram Nagar.

8. Walker, H.M. and Lev, J. (1965). Statistical Inference. Calcutta : Oxford & IBH Publishing Co. PP. 472-475

9. Ross, C.C. and Stanley, J.C. (1963). Measurement in Today’s School. New Jersey : Prentice Hall, Inc. P.122

10. Garrett, H.E. (1981). Statistics in Psychology and Education (10th Edition). Bombay : Vakils, Feffer and Simons Ltd. P. 100

11. Ibid, P.241

12. Ibid, P.100

13. Ibid, P.242

14. Ibid, PP. 312-313

15. Ibid, PP. 315-317

16. Ibid, PP. 67-68

17. Lado, Robert (1962). Language Teaching – The Construction and Use of

Foreign Language Tests : A Teacher’s Book. London : Longman

Green and Co. Ltd. P.330

18. Mehrens, W.R. and Lehmann, I.J. (1969). Standardized Tests in Education. New York : Holt, Rinehart and Winston Inc. P.32

19. Anne Anastasi, Op.Cit. P.71

20. Freeman, Frank S. (1967). Theory and Practice of Psychological Testing. New York : Harper and Row. P.67

21. Rulon, P.J. (1939). A Simplified Procedure for Determining the Reliability of a Test by Split-halves. Harvard Educational Review. Vol-IX.

22. Ibid, P.339

23. Ibid, P.341

24. Ibid, PP. 347-348

25. Ibid, PP. 347-348

26. Ibid, P. 349

27. Freeman, F.S. Op. Cit. P.26

28. Anne Anastasi, Op.Cit. P.135
