
A Cross-Cultural Analysis of the Fairness of the Cattell Culture Fair Intelligence Test Using the Rasch Model

H. Johnson Nenty
Cleveland Public Schools

Thomas E. Dinero
Kent State University

Logistic models can be used to estimate item parameters of a unifactor test that are free of the examinee groups used. The Rasch model was used to identify items in the Cattell Culture Fair Intelligence Test that did not conform to this model for a group of Nigerian high school students and for a group of American students, groups believed to be different with respect to race, culture, and type of schooling. For both groups a factor analysis yielded a single factor accounting for 90% of the test's variance. Although all items conformed to the Rasch model for both groups, 13 of the 46 items had significant between-score-group fit in either the American or the Nigerian sample or both. These were removed from further analyses. Bias was defined as a difference in the estimation of item difficulties. There were six items biased in "favor" of the American group and five in "favor" of the Nigerian group; the remaining 22 items were not identified as biased. The American group appeared to perform better on classification of geometric forms, while the Nigerians did better on progressive matrices. It was suggested that the replicability of these findings be tested, especially across other types of stimuli.

APPLIED PSYCHOLOGICAL MEASUREMENT
Vol. 5, No. 3, Summer 1981, pp. 355-368
© Copyright 1981 Applied Psychological Measurement Inc.
0146-6216/81/030355-14$1.70

Studies of heredity, environment, and intelligence since the early 1930s have shown that environmental conditions do significantly influence intelligence test scores (Hunt, 1972; Jordan, 1933; Leahy, 1935; Neff, 1938; Wellman, 1934, 1937). Any test is assumed to be a measure of a "phenotype" (Jensen, 1980) and, indeed, must be influenced by the environment to some extent. Davis (1948), Davis and Havighurst (1948), and Eells, Davis, Havighurst, Herrick, and Tyler (1951), however, have maintained that all intelligence tests fail to measure general ability directly because they discriminate between social classes, cultures, and school and home training and are influenced by motivation, emotional reaction, test sophistication, and speed. Wellman (1937), and later Hunt (1972), provided evidence to show that improvements in the early childhood environment, such as attendance at a preschool clinic, are influential in raising and possibly maintaining a child's IQ. As a result it could be concluded that there exists a large set of factors on which IQ scores depend and from which a child's IQ could be fairly predicted without the child ever being tested. Many of these authors also imply that many of these characteristics are inextricably confounded with intelligence so that, if indeed there are valid group differences in intelligence, these differences would never be measured directly.

Prior to Cattell's (1940) work, attempts to measure intelligence appear to have been based mostly on Spearman's single factor theory of general ability. Cattell, on the other hand, concluded that second-order analyses of primary abilities result in two main factors (see also Cattell, 1968, p. 56): one referred to as the fluid and the other the crystallized intelligence factor. Fluid intelligence is said to be the "major measurable outcome of the influence of biological factors on intellectual development-that is heredity" (Horn & Cattell, 1966, p. 254); and crystallized intelligence can be accounted for by the investment of fluid intelligence in environmental facilities such as early exposure to intellectually stimulating experiences, school, and so forth (Cattell, 1979, p. 5). Crystallized general intelligence shows itself in skills that have been acquired by cultural experiences, skills that traditional intelligence tests are, in part, seen to measure. By differentiating between two essentially different factors of general ability, Cattell not only provided a parsimonious model of an obscure construct, intelligence, but also provided a basis for a more meaningful measure of that construct. From this, it seems that crystallized intelligence should be heavily dependent on culture and fluid intelligence should not; and even though differences might exist within a culture among individuals' fluid intelligence, there are not necessarily any differences among cultures.

Downloaded from the Digital Conservancy at the University of Minnesota, http://purl.umn.edu/93227. May be reproduced with no cost by students and faculty for academic use. Non-academic reproduction requires payment of royalties through the Copyright Clearance Center, http://www.copyright.com/

Recognizing fluid intelligence as a distinct factor, Cattell and his colleagues began to search for a more "saturated" measure to define it operationally. The results turned out "almost as a by-product, to have promising properties as culture-fair tests" (Cattell, 1971, p. 82). Since then, "culture-fair intelligence tests ... have become the practical test of expression of fluid general ability factor" (Cattell, 1971, p. 78). Though culture-fair tests are found to measure mostly fluid intelligence, the converse is not true; that is, measures of fluid intelligence are found only among noncultural performances (Cattell, 1979, p. 6). Measures of fluid intelligence include tests of judgments and reasoning involving classification, analogies, matrices, and topology (Cattell, 1968, p. 58); and these are, indeed, the types of stimuli that Jensen (1980) refers to as "culturally reduced." Cattell's culture-fair test was designed to "avoid relationships depending on anything more than immediately given fundaments that are equally familiar or equally unfamiliar to all" (Cattell & Horn, 1978, p. 141).

The Cattell Culture Fair Intelligence Test (CCFIT) is assumed to be indifferent to cultural experiences that might differentially influence examinees' responses to its items (Cattell, 1968, p. 61) and also is assumed to be saturated with only one single second-order intelligence factor, fluid intelligence (Cattell, 1971, p. 78). Given these assumptions, it could be said that it is a unifactor or unidimensional measure of fluid intelligence, which (according to Cattell's theory) does not depend on cultural experiences.

Though many experts in cross-cultural psychology generally deny the possibility of a culture-free test (Dague, 1972; Manaster & Havighurst, 1972), it is believed that with modern measurement techniques, tests could be designed and administered to minimize the influence of cultural experiences and to maximize the influence of the trait that the test is designed to measure.

Measurement models have been proposed whereby objective measurement of a single intrinsic factor (Biesheuvel, 1952) can be achieved. The logistic model attributed to Rasch (1960) specifies a probabilistic relationship between the observable item and person scores and the unobservable item and ability parameters assumed to underlie the result of a person-by-item interaction in a testing situation. Specifically, the model holds that, given a unidimensional test, the probability of any examinee succeeding on any item is wholly determined by his/her ability on the trait being measured and by the difficulty parameter of the item. For this model "the estimate of the item difficulty parameter will not vary significantly over different samples ... [but the] item parameter will not be sample-invariant when there is culture bias which differentially affects the item probability and alters the [item characteristic curve] slopes" (Whitely & Dawis, 1974, p. 175). This implies that "if a test includes items which differ in cultural loadings, the special conditions required for item parameter invariance may be difficult to obtain" (Whitely & Dawis, 1974, p. 176). Some cross-cultural psychologists have strongly recommended the use of this model in refining psychological tests (Eckensberger, 1972, p. 103; Fisher & Pendl, 1980; Massad, 1978). To the extent that a test fails to meet the theoretical requirements of culture-fairness, data from such a test will not fit the Rasch model. Indeed, Mellenbergh (1972) was not able to fit the model to data from Dutch and Indonesian junior high students. He suggested that local influences, such as whether or not specific material is taught, can randomly influence data.

Latent trait models, such as the Rasch, are indeed well suited to the study of cultural bias. Since the estimation procedures separately estimate item difficulties and person abilities, items can be selected that have difficulties estimated to be the same across groups (and therefore to be unbiased; see Ironson, 1978, for example), and then these items can be used to study individual or group ability differences.

The problem of the present study was stimulated by the fact that "developing countries show an increasing need for tests adapted to their own cultural conditions and requirements. As a common rule Western tests are used as the starting point but require modifications" (Drenth & Flier, 1976, p. 137), the extent of which would depend on the cultural differences.

In Western cultures it has been observed that "testing has contributed to more effective use of manpower, to more equal distribution of educational and professional opportunities, and to identification of talents that might otherwise remain hidden" (Drenth, 1977, p. 23). It is generally accepted that "tests should be utilized for the same purposes in the developing nations whose traditional cultures are evolving towards that in which those ability tests are developed" (Iwuji, 1978, p. 16). In most of these developing nations, Nigeria, for example, decisions have to be made for large heterogeneous populations, and if the test scores on which such decisions are based are observed from tests that are culturally biased, then such decisions are not impartial.

Most Western ability tests are culture-dependent measures of crystallized intelligence. Using these in developing nations would not correct for the bias due to differences in the cultures, quality of schools, social status, sex, and other environmental factors. There were, then, two research questions for the present study. The first was concerned with how well each item of the CCFIT defines the construct of fluid intelligence for each of two disparate cultures, and secondarily with whether the total test is measuring the same construct for both of the cultures. In Rasch terminology, this concern translates to showing how well each CCFIT item's data fit the Rasch model for each of the two groups and then for both samples combined: The extent to which an item's data fit the Rasch model indicates the accuracy with which that item defines that single trait which the model assumes the test is measuring.

The second research hypothesis was concerned with item bias. Rasch item parameters, as with all latent trait models, are assumed to be sample invariant, and for a given test the estimates of this parameter for different samples of examinees should not be significantly different from each other; but the "item parameter will not be sample-invariant when there is cultural bias which differentially affects the item probabilities" for the different samples of examinees (Whitely & Dawis, 1974, p. 175).

Within the Rasch model, the hypothesis of equal item parameter estimates could also be stated with regard to each item's characteristic curves derived from the data from each of the two groups. This is because, unlike other latent trait models, the Rasch model assigns only one parameter, item difficulty, to the item; that is, only item difficulties determine differences among item characteristic curves.
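In standard notation (supplied here for reference; the paper states the model verbally), the Rasch probability that person $n$ with ability $\theta_n$ answers item $i$ of difficulty $b_i$ correctly is

```latex
P(x_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```

Because $b_i$ only shifts this logistic curve along the ability scale, all Rasch item characteristic curves share one shape, so two groups' curves for an item can differ only through the difficulty estimate.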

Method

The primary aim of the study was to cross-validate the CCFIT with two mutually remote racial groups that are, in addition, culturally disparate. The first sample was made up of 600 American junior high school students from four Portage County, Ohio, schools. The second sample was made up of 803 Nigerian students from seven secondary schools in Lagos and Cross River States.

The two samples, composed mostly of 13- to 15-year-old boys and girls, were comparable with respect to the number of years they had spent in formal schooling. Subjects were, by necessity, volunteers, but random sampling was not seen to be necessary because the Rasch model is sample free in its estimation of item parameters.

Culture fairness in testing cannot be achieved through test construction only but through a combination of test construction, test instruction, and test administration. Whereas test construction for culture fairness would attempt to equalize the influences of extraneous cognitive factors, such as language, test content, and materials, test instruction and administration for culture fairness would try to equalize, as much as possible, the influence of all noncognitive factors such as speed, test sophistication, motivation, rapport with the examiners, understanding of instructions, and understanding of what one is expected to do. The main aim was to rule out, or to hold constant, parameters on which cultures might differ, except for the trait of interest.

Through construction the CCFIT purports to rule out or hold constant the influence of contaminating cognitive factors. In this study, then, an attempt was made to control additionally for possible noncognitive extraneous factors. Instructions and directions supplementary to the standardized set for the administration of the test were given to each examiner. These allowed for a motivation session, a practice session for each subtest, and other specific instructions to the examiner on the conduct of the testing. The normal time allowed for the testing was also doubled.

The CCFIT is a paper-and-pencil perceptual test that consists of abstract geometric forms. It consists of four subtests selected because they correlate well with Spearman's general mental capacity. In the first, the task of the examinee is to select from five choices the one that best completes a progressive series. In the second subtest the examinee is required to identify from among five figures the one that is different from the others. The third subtest requires the examinee to complete a matrix of geometric figures by choosing from one of five figures presented. In the last subtest the examinee's task is to select from five choices the one that duplicates a topological relationship among geometric forms that is the same as the one given.

Scale 2 Form A of the test was used. The test is reported to have a Spearman-Brown reliability coefficient of .79, a Cronbach's alpha of .77, and a KR21 of .81 (Institute of Personality and Ability Testing, 1973, p. 10).

The Rasch model is based on very stringent theoretical assumptions, the consequences of which include such requirements as absolute independence of examinees' responses to each item, a maximum level of motivation by the examinee so as to apply all his/her related ability in answering each item of the test, no time constraints in the testing, equality of item discriminations, and the absence of guessing. In other words, the theory considers item difficulty and person ability as the only dynamics that underlie the result of any person-by-item interaction. Such requirements could rarely be met in practice; thus, working on the assumption that they had been met constitutes a limitation of the study. Some of the changes made to the normal instructions for the administration of the CCFIT were done in an attempt to make the testing conditions compatible with the requirements of the Rasch model and culture fairness in test instruction and administration.

Latent trait theory assumes unidimensionality of the test space, and since Stelzl (1979) has indicated that violation of this assumption can still result in artificial fit to the data, a first screening of the item data's fit to a Rasch model was a factor analysis. Assuming that the trait of intelligence is normally distributed and that phi coefficients would lead to spurious results (Lord & Novick, 1968), the matrix of tetrachoric correlations among items was calculated and used as input in a principal axis factor analysis. Then, based on a rationale proposed by Wright (1977a) and also by Wright, Mead, and Draba (1976), Rasch test and item calibrations were performed for each of the cultural groups, the two groups combined, and a random division of the combined sample.

Estimating the parameters of the model can be done using either a procedure where the persons' abilities are estimated and the test's item difficulties are estimated conditional on these (Anderson, 1972) or by reciprocally estimating one set, then the other, until adjustments yield the desired precision. While Anderson had indicated that this latter (unconditional) procedure leads to inconsistent estimates, Wright and Douglas (1977) have suggested that the conditional procedure is impractical for calibrating more than, say, 15 items; and, indeed, if the mean item difficulty is zero, the two procedures yield no discernible differences. The present data conformed to this criterion; hence, the present calibrations were performed using the unconditional maximum likelihood procedure as devised by Wright and Panchapakesan (1969) and developed and used by Wright and Mead (1978) in their computer program BICAL.
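As a sketch of what such an unconditional (joint) calibration does, the following alternates Newton-style updates of person abilities and item difficulties until both stabilize. This is a didactic approximation under simplifying assumptions (no bias correction, a crude numerical safeguard), not the Wright-Panchapakesan algorithm as implemented in BICAL, and all names are illustrative:

```python
import numpy as np

def rasch_jmle(X, n_iter=50):
    """Unconditional (joint) maximum likelihood calibration of the
    Rasch model, sketched as alternating Newton-style updates.

    X is a persons-by-items matrix of 0/1 responses.  Persons with
    perfect or zero scores are dropped because their abilities have
    no finite ML estimate.  Difficulties are centered so that the
    mean item difficulty is zero, the anchoring the text refers to.
    """
    k = X.shape[1]
    score = X.sum(axis=1)
    X = X[(score > 0) & (score < k)]
    theta = np.zeros(X.shape[0])     # person abilities
    b = np.zeros(k)                  # item difficulties
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
        # Ability update: score residual divided by test information.
        theta += (X - p).sum(axis=1) / (p * (1 - p)).sum(axis=1)
        theta = np.clip(theta, -6.0, 6.0)   # crude numerical safeguard
        p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
        # Difficulty update (sign flipped: more successes, easier item).
        b -= (X - p).sum(axis=0) / (p * (1 - p)).sum(axis=0)
        b -= b.mean()                # anchor: mean item difficulty = 0
    return theta, b
```

On simulated data the recovered difficulties track the generating values closely; BICAL's actual procedure adds refinements (such as a correction for the bias of unconditional estimates) that are omitted here.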

Results

Selected subgroups' total test raw score means and standard deviations are presented in Table 1. For both cultural groups, there appears to be minimal variation across subgroups. The factor analysis yielded one factor for each group, with an eigenvalue of 41.4 (accounting for 90% of the item pool's variance) in the American sample and an eigenvalue of 41.6 (also accounting for 90% of the variance) in the Nigerian sample. It was thus assumed reasonable to proceed with the item calibration.
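The 90% figures follow directly from dividing the first eigenvalue by the number of items, since each of the 46 standardized items contributes one unit of variance to the correlation matrix's total; a quick check (function name illustrative):

```python
def variance_proportion(eigenvalue, n_items):
    """Proportion of total variance captured by a factor when each
    of n_items standardized variables contributes unit variance."""
    return eigenvalue / n_items

american = variance_proportion(41.4, 46)  # 0.90 exactly
nigerian = variance_proportion(41.6, 46)  # about 0.904, reported as 90%
```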

Table 1

Descriptive Statistics for Both Groups’ CCFIT Scores by Sex, Age, and Grade


Item Calibration

For the American sample, after the initial parameter estimation, seven persons with significant fit statistics (t > 2.00) were omitted from the final calibration in order to achieve more stable item parameter estimates for the group. For the Nigerian group, 10 such persons were observed, and these were removed from the final calibration. For each of the two groups, the item parameter estimates, along with selected fit statistics and the discrimination index of each item in each of the four subtests, are given in Tables 2 through 5. The item discriminations are assumed to be the same for all items within any test used to define a variable. Although in most Rasch analyses they are not discussed, they are presented here as an indication of how well individual items met this assumption.

Table 2

CCFIT Item Parameter Estimation for Subtest 1 for Both Groups

*p < .01

a Weighted

Table 3
CCFIT Item Parameter Estimation for Subtest 2 for Both Groups

*p < .01

a Weighted

Test of Fit

Table 4
CCFIT Item Parameter Estimation for Subtest 3 for Both Groups

*p < .01

a Weighted

Two important test-of-fit statistics are associated with each Rasch item parameter estimate. One is the total fit mean square with its total fit t statistic, and the other is the between-group fit statistic. The total fit mean square is the Rasch index of disagreement between the variable measured by a given item and the one that the model assumes. The expected value of this index is 1, and a significant total fit t statistic for an item signifies that there is a significant discrepancy between the data and the estimated model. On the other hand, a significant between-group fit statistic for an item indicates that statistically nonequivalent estimates of that item difficulty resulted from using different score groups from within the same sample (Wright & Mead, 1978, p. 77).
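The total fit mean square can be written in a standard Rasch form (reconstructed here from common practice with these statistics, not quoted from the paper). With $x_{ni}$ the observed 0/1 response of person $n$ to item $i$ and $p_{ni}$ the probability the model assigns,

```latex
z_{ni} = \frac{x_{ni} - p_{ni}}{\sqrt{p_{ni}\,(1 - p_{ni})}},
\qquad
\mathrm{MS}_i = \frac{1}{N} \sum_{n=1}^{N} z_{ni}^{2}
```

Under the model each squared standardized residual has expectation 1, which is why the index's expected value is 1; the between-group statistic applies the same comparison within separate score groups and tests whether they agree on the item's difficulty.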


Based on the first fit criterion, no differences between the expected and the observed responses for each of the items were observed, at an alpha level of .01, for either cultural group or for all the subjects combined. That is, all the CCFIT items fit the Rasch model in all three analyses. But for the two groups separately, a total of 13 of these "fitting" items had significant between-group fit statistics. That is to say, each of these items, although it measured the same construct as the other items in the test, was inconsistent in measuring this single trait across different levels of intelligence within the same cultural group. Since this inconsistency might lead to the assignment of statistically nonequivalent item parameter estimates for two samples that might be markedly different in ability on the construct measured, and this might be erroneously interpreted as bias on the part of the item, it was decided that all items with a significant between-group fit statistic be dropped from further analysis. This procedure might be considered an initial screening for potential bias.

Table 5
CCFIT Item Parameter Estimation for Subtest 4 for Both Groups

*p < .01

a Weighted

Detection of Bias

Once the data sets had been cleared of irregular data, item difficulty estimates for each of the two cultural groups were compared using the rationale proposed by Wright (1977). Since estimates of item parameters now would not vary significantly unless the item differed in "cultural" loadings (Whitely & Dawis, 1974), any significant differences in item parameter estimates would indicate cultural bias in that item. The parameters for 22 items were judged to be similar for the two cultural groups and different (that is, "biased") for 11 items. In a parallel analysis, when similar comparisons were performed with the parameter estimates for two randomly formed groups, no sign of any bias was observed for any item, and this suggested that the cultural division was valid.

Six of the 11 items were seen to be "biased" in favor of the American group, while five were seen to be biased in favor of the Nigerian group. Those biased against the Nigerian group were Item 8 on Subtest 1; Items 2, 3, and 8 on Subtest 2; Item 10 on Subtest 3; and Item 3 on Subtest 4. Those biased against the American group were Item 10 on Subtest 1; Items 4, 7, and 8 on Subtest 3; and Item 1 on Subtest 4. The overall mean difference in difficulty parameter estimates was -.023; and although this was not significant, the difference was in favor of the American group.

For further analysis those 22 items identified as unbiased were combined into a subset and termed "neutral" items, and those that favored the American group were combined into a subset of "American" items. Similarly, a subset of "Nigerian" items was formed.

All the CCFIT items were arranged in order of degree of fit, and the items in the first and fourth quartiles were isolated. With this division, a distribution of those items over the four subtests showed that Subtest 3 was the Rasch best-fitting subtest, whereas Subtest 2 was the Rasch poorest-fitting subtest. The KR20 reliability coefficients for the 22 neutral items for the American and the Nigerian groups, when corrected for test length, were .84 and .81, respectively. This showed a change, especially for the Nigerian group, from the original total test reliabilities of .82 and .75, respectively.

T tests done between the two groups on the set of neutral items, the set of American items, the set of Nigerian items, and the total CCFIT scores showed that the two groups differed significantly on total CCFIT scores and on both the sets of American and Nigerian items but did not differ on the set of neutral items (see Table 6). With the total CCFIT scores, the difference was not substantial (ω² = .011) and might have been the effect of large sample sizes. The American group scored significantly higher on both the total CCFIT scores and the set of American items, while the Nigerian group scored significantly higher on the set of Nigerian items. These results indicate that the Rasch model could be used to select a set of (22) items that correlated less with culture than does the total test (see Table 7).

To give an idea of the expected range of the item difficulties within the present data, both the American and Nigerian samples were randomly divided in half, and item parameters were calculated for all four resulting samples; within each cultural group, differences in item difficulties were calculated for each item. For each item of the test, this within-culture difference was not significant. Table 8 presents the means and standard deviations for these data. The homogeneity of the differences and standard errors suggests that the results isolating biased items (Tables 2 through 5) were legitimate.
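The split-half and between-culture comparisons above rest on standardizing the difference between two independent difficulty estimates by their combined standard error. A sketch of that comparison (the function name, the 2.0 cutoff, and the numbers are illustrative, not values from the paper):

```python
import math

def difficulty_difference_z(b1, se1, b2, se2):
    """Standardized difference between two independent Rasch
    difficulty estimates of the same item, one from each group.
    Values of |z| beyond roughly 2 suggest the item functions
    differently across groups, i.e., is potentially biased."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical estimates for one item in two groups:
z = difficulty_difference_z(0.42, 0.09, 0.10, 0.08)
flagged = abs(z) > 2.0
```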

Discussion and Conclusions

Although a significant difference between the American and Nigerian groups was observed for the Rasch poorest-fitting subtest (Subtest 2) in favor of the American group, no such difference was found for the Rasch best-fitting subtest (Subtest 3). Whether or not a mean difference of .52 points is substantial enough to be concerned about is debatable. For Subtest 3 this means one-half point out of 14 questions. For the total test the significant difference was 1.03 points out of a total of 46. This is some indication that tests of significance, especially overall tests (Stelzl, 1979), may not be sensitive to differences. The data presented here, however internally consistent, might still have items with different characteristics that were undetected. The question that might be raised involves the degree to which a test can ever be perfectly unbiased.

The data do raise several further interesting questions. Cattell and Horn (1978) referred to the test as a perceptual test of intelligence as opposed to a verbal test. Subtest 2, that favoring the American children, is basically classifications ("Which of five stimuli does not belong?");

Downloaded from the Digital Conservancy at the University of Minnesota, http://purl.umn.edu/93227. May be reproduced with no cost by students and faculty for academic use. Non-academic reproduction requires payment of royalties through the Copyright Clearance Center, http://www.copyright.com/


aCulture (U.S.A. = 1, Nigeria = 2)

[Table 6. T-Test of Differences Between the Two Cultures in Total CCFIT Scores, "Neutral" Items, Rasch Best-Fitting Subtest, Rasch Worst-Fitting Subtest, "American" Items, and "Nigerian" Items.]

[Table 7. Correlations Between Culture, "Neutral," "American," and "Nigerian" Items, Subtest 2, Subtest 3, and the Total CCFIT Scores.]


[Table 8. Means and Standard Deviations of the Within-Culture Differences in Item Difficulty Estimates from the Randomly Split American and Nigerian Samples.]

while Subtest 3, that favoring the Nigerian children, was progressive matrices. Of substantive interest is the question of the geometric nature of the stimuli and the classification nature of Subtest 2 compared with the deductive, extrapolative nature of Subtest 3. It might be valuable to determine if these differences replicate and then, if they do, to determine the extent to which they would replicate with other perceptual stimuli. If these differences are attributable to the differences between the English-style and American school systems, the effect of culture on intelligence would be far more subtle and indirect than the obvious but superficial influences of language, school content, and politics.

Piswanger (cited in Fischer & Pendl, 1980) described a similar conclusion: His sample of African students appeared to be identical to a group of Austrian students in progressive matrices tests except that the Africans performed more poorly when the changes involved horizontal movement. The author suggested that this was a result of the stress put on reading in Austria.

Compared to the other items of the CCFIT, the set of those items that showed a better fit to the Rasch model (that is, provided a better measure of fluid intelligence) did not differentiate between the two groups. This observation supported Cattell's contention that uncontaminated fluid intelligence is not culture dependent. But the apparent overall lack of culture bias often observed for the CCFIT might result from interitem neutralization of significant biases, in opposing directions, of most items in the test. The initial factor analysis, which (according to Jensen, 1980) would indicate measurement of a single criterion, did not indicate this. Jensen (1980) did state, however, that such biases that balance themselves out "cannot be claimed to measure one and the same ability on the same scale in both groups" (p. 445); this statement would then suggest that the Rasch item selection procedure, compared to a factor analysis, would yield a test with less culturally loaded items.


References

Andersen, E. B. The numerical solution of a set of conditional estimation equations. Journal of the Royal Statistical Society, Series B, 1972, 34, 42-54.

Andersen, E. B. Discrete statistical models with social science applications. Amsterdam: North-Holland, 1980.

Biesheuvel, S. Adaptability: Its measurement and determinants. In L. J. Cronbach & P. J. D. Drenth (Eds.), Mental tests and cultural adaptation. The Hague: Mouton, 1972.

Cattell, R. B. A culture free intelligence test I. Journal of Educational Psychology, 1940, 31, 161-180.

Cattell, R. B. Are IQ tests intelligent? Psychology Today, 1968, 2, 56-62.

Cattell, R. B. Abilities: Their structure, growth, and action. Boston: Houghton Mifflin, 1971.

Cattell, R. B. Are culture fair intelligence tests possible and necessary? Journal of Research and Development in Education, 1979, 12, 2-13.

Cattell, R. B., & Horn, J. L. A check on the theory of fluid and crystallized intelligence with description of new subtest designs. Journal of Educational Measurement, 1978, 15, 139-164.

Dague, P. Development, application and interpretation of tests for use in French-speaking black Africa and Madagascar. In L. J. Cronbach & P. J. D. Drenth (Eds.), Mental tests and cultural adaptation. The Hague: Mouton, 1972.

Davis, A. Social class influence upon learning. Cambridge, MA: Harvard University Press, 1948.

Davis, W. A., & Havighurst, R. J. The measurement of mental systems. Scientific Monthly, 1948, 66, 301-316.

Drenth, P. J. D. The use of intelligence tests in developing countries. In Y. H. Poortinga (Ed.), Basic problems in cross-cultural psychology. Amsterdam: Swets & Zeitlinger B. V., 1977.

Drenth, P. J. D., & Van der Flier, H. Cultural differences and comparability of test scores. International Review of Applied Psychology, 1976, 25, 137-144.

Eckensberger, L. H. The necessity of a theory for applied cross-cultural research. In L. J. Cronbach & P. J. D. Drenth (Eds.), Mental tests and cultural adaptation. The Hague: Mouton, 1972.

Eells, K., Davis, A., Havighurst, R. J., Herrick, V. E., & Tyler, R. Intelligence and cultural differences. Chicago: University of Chicago Press, 1951.

Fischer, G. H., & Pendl, P. Individualized testing on the basis of the dichotomous Rasch model. In L. Van der Kamp, W. Langerak, & D. de Gruijter (Eds.), Psychometrics for educational debates. New York: Wiley, 1980.

Horn, J. L., & Cattell, R. B. Refinement and test of the theory of fluid and crystallized intelligence. Journal of Educational Psychology, 1966, 57, 253-270.

Hunt, J. McV. Heredity, environment, and class or ethnic differences. In Assessment in a pluralistic society: Proceedings of the 1972 invitational conference on testing problems. Princeton, NJ: Educational Testing Service, 1972.

Institute for Personality and Ability Testing (IPAT). Measuring intelligence with the culture fair tests: Manual for Scales 2 and 3. Champaign, IL: Author, 1973.

Ironson, G. H. A comparative analysis of several methods of assessing item bias. Paper presented at the annual meeting of the American Educational Research Association, Toronto, March 1978.

Iwuji, V. B. C. An exploratory assessment of the effects of specific cultural and environmental factors on the performance of Nigerian students on selected verbal and nonverbal aptitude tests developed for use in Western countries. Unpublished doctoral dissertation, University of Chicago, 1978.

Jensen, A. R. Bias in mental testing. New York: The Free Press, 1980.

Jordan, A. M. Parental occupation and children's intelligence scores. Journal of Applied Psychology, 1933, 17, 103-119.

Leahy, A. M. Nature-nurture and intelligence. Genetic Psychology Monographs, 1935, 17, 236-308.

Lord, F. M., & Novick, M. R. Statistical theories of mental test scores. Reading, MA: Addison-Wesley, 1968.

Manaster, G. J., & Havighurst, R. J. Cross-national research: Social-psychological methods and problems. Boston: Houghton Mifflin, 1972.

Massad, C. E. International scene. Measurement News, 1978, 9, 3.

Mellenbergh, G. J. Applicability of the Rasch model in two cultures. In L. J. Cronbach & P. J. D. Drenth (Eds.), Mental tests and cultural adaptation. The Hague: Mouton, 1972.

Neff, W. S. Socioeconomic status and intelligence: A critical survey. Psychological Bulletin, 1938, 35, 727-757.

Rasch, G. Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research, 1960.

Stelzl, I. Ist der Modelltest des Rasch-Modells geeignet, Homogenitätshypothesen zu prüfen? Ein Bericht über Simulationsstudien mit inhomogenen Daten. [Can tests of significance for the Rasch model be used to test assumptions of homogeneity? A report about simulation studies using nonhomogeneous data.] Zeitschrift für experimentelle und angewandte Psychologie, 1979, 26, 652-672.

Wellman, B. Growth in intelligence under differing school environments. Journal of Experimental Education, 1934, 3, 59-83.

Wellman, B. Mental growth from pre-school to college. Journal of Experimental Education, 1937, 6, 127-138.

Whitely, S. E., & Dawis, R. V. The nature of objectivity with the Rasch model. Journal of Educational Measurement, 1974, 11, 163-178.

Wright, B. D. Solving measurement problems with the Rasch model. Journal of Educational Measurement, 1977, 14, 97-116.

Wright, B. D., Mead, R., & Draba, R. Detecting and correcting test item bias with a logistic response model (Research Memorandum No. 22). Chicago: University of Chicago, Department of Education, Statistical Laboratory, 1976.

Wright, B. D., & Douglas, G. A. Conditional versus unconditional procedures for sample-free item analysis. Educational and Psychological Measurement, 1977, 37, 61-66.

Wright, B. D., & Mead, R. J. BICAL: Calibrating rating scales with the Rasch model (Research Memorandum No. 23A). Chicago: University of Chicago, Department of Education, Statistical Laboratory, 1978.

Wright, B. D., & Panchapakesan, N. A procedure for sample-free item analysis. Educational and Psychological Measurement, 1969, 29, 23-48.

Author's Address

Send requests for reprints and further information to Thomas E. Dinero, Associate Professor, Educational Foundations, 406 White Hall, Kent State University, Kent, OH 44242; or H. Johnson Nenty, Nigerian National Youth Service Corps at Federal Government College, PMB 5126, Port Harcourt, Rivers State, Nigeria.
