THE STANDARDIZATION OF THE WECHSLER INTELLIGENCE SCALE FOR CHILDREN 1



HAROLD SEASHORE, ALEXANDER WESMAN AND JEROME DOPPELT
THE PSYCHOLOGICAL CORPORATION

The Wechsler Intelligence Scale for Children has grown logically out of the Wechsler-Bellevue Intelligence Scales used with adolescents and adults [4]. In fact, most of the items in the WISC are from Form II of the earlier scales, the main additions being new items at the easier end of each test to permit examination of children as young as five years of age.

Even though the materials overlap, the WISC is a distinct test from the Wechsler-Bellevue Scales and is independently standardized. The scales overlap in usefulness since both scales can be used with adolescents. However, it is expected that the WISC will be preferred in testing adolescents up through the age of fifteen years.

This new Children's Scale (as it probably will come to be known in every-day clinical parlance) has been standardized with exceptional care over a five-year period of experimental tryouts, field testing, and statistical analysis. In this paper some of the principal research data are reported.

The WISC consists of twelve tests which, as in the Adult Scale, are divided into two subgroups identified as Verbal and Performance. In the standardization, there were six tests in each of the subgroups:

    Verbal                    Performance
    General Information       Picture Completion
    General Comprehension     Picture Arrangement
    Arithmetic                Block Design
    Similarities              Object Assembly
    Vocabulary                Coding
    (Digit Span)              (Mazes)

In the interest of shortening the time required for examination, the Scale is to be administered ordinarily on the basis of only ten tests. For various statistical and practical reasons, Digit Span is considered an alternate test in the Verbal series and Mazes an alternate in the Performance series. As a matter of fact, the reasons for including Coding or Mazes are about equally good except that the Mazes take a considerably longer time than Coding and will probably, therefore, not be preferred. The conditions for using the alternate tests are described in the manual.

The reader is referred to the manual and to Wechsler's earlier text for a more complete description and discussion of the tests.

1 This report of the standardization is an expansion of technical sections in the test manual: David Wechsler, Wechsler intelligence scale for children. New York: Psychological Corporation, 1949. Pp. 113.

THE STANDARDIZATION SAMPLE

Age of Children and Size of Sample. The WISC was standardized on a sample of 100 boys and 100 girls at each age from five through fifteen years. Each child was tested within one and one-half months of his mid-year; e.g., the five-year-olds were past 5 years, 4 months and 15 days but were not yet 5 years, 7 months and 15 days. The feebleminded cases were exceptions as an adequate sample could not be secured without permitting more variation; nearly all, however, were within two months of their mid-year. There were 1100 boys and 1100 girls in eleven age groups, a total of 2200 cases. Actually more cases were tested, but the final sample includes those who best satisfied the other sampling requirements described below. Only white children were examined.

Variables of Sampling. It was determined at the beginning that for each age (as closely as practicable) and for the total sample, the selected cases should meet certain sampling requirements based on U.S. Census Bureau data for 1940, with some adjustment for the recent shift of population toward the West.


1. Areas. The states were divided into four geographic areas as defined in Table 1.

2. Urban-Rural. The urban-rural proportions of the total U.S. 1940 population were used as shown in Table 2.

3. Parental Occupation. The children's fathers were to be occupationally distributed similarly to all employed white males. The fourteen U.S. Census categories were reduced by combinations into nine, as shown in the footnote of Table 4. The quota for each geographic area was further defined in terms of Census reports on employment within each area.

Drawing the Sample. With these controls, tables of requirements were set up for examiners in each area. It was not expected that the occupational and urban-rural requirements would be exactly satisfied in each area, but that the over-all conditions would be met by the national sample.

Specific directions and worksheets for drawing a sample in a given school were provided so that cases would not be "volunteered" or "thrown in" at the whim of the examiner or school official.

Most of the 55 feebleminded cases were examined at the Illinois State School, Lincoln, Illinois; at Letchworth Village, New York; and at the Wayne County Training School, Michigan; a few selected cases from "special classes" of two public schools were included. The staff psychologists in the institutions aided in the selection of cases of the required ages who were rated as having IQ's under 70 and not below 50. Cases where postnatal disease or accident were considered causative of the deficiency were omitted. The number of feebleminded cases appearing in the regular school sampling was not determined. No examiner assigned to public schools reported that any case was officially labelled as feebleminded. In all, then, 2.5 per cent of the total number of cases in the standardization population is known to be feebleminded.

Analysis of the Obtained Sample. Tables 1, 2, 3, 4 and 5 present the sampling data on the 2200 cases finally included in the standardization group.

In Table 1 it is seen that the midwest sample (Area II) is slightly short of cases; this was deliberate in order to increase the western proportion in accord with wartime and postwar population shifts. All in all, the area sampling is eminently satisfactory.

TABLE 1
SAMPLE BY GEOGRAPHIC AREA

                                                   Per Cent in       Wechsler Sample
    Area                                           U.S. Population      %        N
    I    New England and Middle Atlantic States         29.2           31.0     683
    II   North Central States                           32.7           28.9     636
    III  South Atlantic and South Central States        26.8           26.5     583
    IV   Mountain and Pacific States                    11.3           13.6     299
         Total                                         100.0          100.0    2200

TABLE 2
SAMPLE BY URBAN-RURAL RESIDENCE

                       Per Cent in       Wechsler Sample
                       U.S. Population      %        N
    Urban                   57.9           60.3    1327
    Rural                   42.1           37.2     818
    Institutional*           --             2.5      55
    Total                  100.0          100.0    2200

*The 55 feebleminded cases from institutions were not reported as either rural or urban.

TABLE 3
PROPORTION OF URBAN AND RURAL CASES EXPECTED AND OBTAINED IN EACH AREA

                        URBAN                           RURAL
             Expected  Wechsler  Sample      Expected  Wechsler  Sample
    Area        %         %        N            %         %        N
    I          38.4      34.6     458          16.2      23.4     191
    II         32.9      32.9     437          32.5      21.6     177
    III        17.2      17.5     232          40.0      42.9     351
    IV         11.6      15.1     200          11.3      12.1      99
    Total     100.0     100.0    1327         100.0     100.0     818

Table 2 reveals that in the national standardization an appropriate number of rural children was included. Table 3 presents the urban-rural data in a different analysis. This table shows what percentage of the total sample of urban cases was expected from each of the areas and what percentages were obtained. The rural sources are analyzed similarly. The only large differences are the relatively high rural contributions by Area I and a shortage in Area II.


The values in Table 3 are slightly distorted because the 55 feebleminded cases (2.5 per cent of all cases) were not allocated by residence.

The reader should also be reminded that the rural category includes all cases living on farms and in communities of less than 2500 population. "Bedroom villages" attached to large cities were generally avoided.

The statistical significance of the discrepancies between expected cases and obtained cases could be computed, but this seems inappropriate because the expected percentages themselves are quite imperfect criteria. Allocations based on 1940 Census data are the best criteria the authors could set for control of the sampling. But we cannot readily know whether a discrepancy of, say, five percentage points, is due to obsolete Census data or to faulty sampling. For that reason one must be satisfied with reasonable approximations. There is reason to believe that there has been some trend toward urbanization since 1940, and that the five per cent shortage of rural cases is, therefore, an exaggeration as of 1947-1948.

TABLE 4
OCCUPATION OF FATHERS OF CHILDREN IN STANDARDIZATION SAMPLE

                      Employed                 Wechsler Sample
    Occupational      U.S. Males    All Cases   All Cases    Boys     Girls
    Group*                %             %           N          %        %
    1                    5.9           8.0         176        7.9      8.1
    2                   14.0          10.0         222       10.3      9.9
    3                   10.6          11.6         256       11.9     11.4
    4                   13.9          12.7         280       12.5     12.9
    5                   15.6          17.9         393       18.8     16.9
    6                   18.8          16.5         363       16.6     16.4
    7                    6.0           5.5         122        5.5      5.5
    8                   14.5          13.8         303       12.4     15.2
    9                     .7           1.4          30        1.5      1.2
    (Feebleminded)        --           2.5          55        2.5      2.5
    N                                             2200       1100     1100

*A consolidation of 14 Census groups, 1940:
    1. (I and II)          Professional and semiprofessional workers
    2. (III)               Farmers and farm managers
    3. (IV)                Proprietors, managers and officials
    4. (V)                 Clerical, sales and kindred workers
    5. (VI)                Craftsmen, foremen and kindred workers
    6. (VII)               Operatives and kindred workers
    7. (VIII, IX and X)    Domestic, protective and other service workers
    8. (XI, XII and XIII)  Farm laborers and foremen, and laborers
    9. (XIV)               Occupation not reported

TABLE 5
OCCUPATION OF FATHERS OF CHILDREN IN STANDARDIZATION SAMPLE

                         AREA I               AREA II              AREA III             AREA IV
    Occupational    Expected Obtained*   Expected Obtained    Expected Obtained    Expected Obtained
    Group              %        %           %        %           %        %           %        %
    1                 7.0      8.6         5.3      6.0         4.7     10.6         6.8      7.0
    2                 3.6      3.7        17.2     12.9        22.9     16.5        10.0      7.7
    3                11.1     11.2        10.0     10.3         9.8     13.2        12.0     14.4
    4                16.6     16.9        13.1      9.3        11.7     10.3        14.0     17.7
    5                18.1     18.2        15.4     22.8        12.7     16.1        16.0     13.7
    6                23.1     19.6        18.1     18.7        15.8     13.4        16.4     14.4
    7                 7.7      7.9         5.8      4.1         5.1      3.4         8.8      8.7
    8                11.9     12.0        14.4     14.5        16.6     14.9        15.6     16.4
    9                  .8      1.8          .7      1.5          .7      1.5          .4       --
    N                 640      649         720      614         590      583         250      299
    (Feebleminded)              34                   21                   --                   --

*Obtained percentages computed on N without feebleminded cases.

Tables 4 and 5 describe the sample by occupation of the father. Again, reasonable agreement between the Census expectancy and the actual sample is evident. The greatest shortage is in Occupational Group 2, farmers and farm managers. If one mentally combines the percentages of Occupations 3 and 4, as seems logical from the descriptions, the expected percentage would be 24.5 and the obtained percentage 24.3. Similarly, if one combines Occupational Groups 5 and 6, the expected percentage is 34.4, which is identical to the obtained percentage. A slight error in the obtained percentages occurs because the 2.5 per cent of feebleminded cases were not allocated by parental occupation. In the last columns of Table 4, the analysis shows that the percentages of boys and girls with respect to parental occupation are very similar.

Table 5 presents the occupational sampling by area. The agreements are not as good here as for the national sample, but there are no gross miscarriages of sampling.

The field examiners used utmost care in ascertaining the father's occupation and were asked to write descriptions in considerable detail. The final classifications were made with the detailed Census descriptions at hand. Note that the field examiners were able to secure rather good samples in category 8, farm laborers and other laborers; this is usually a difficult task. Similarly, the excess in Occupation 1 (which is usually a category that one finds difficult to keep small enough) is considerable only for Area III.

The best available base for the selection of this sample was the occupational distribution as reported in the United States Census. However, one should not take percentages derived from those data as being absolute criteria against which to select cases. There obviously have been shifts in occupational percentages because of the war, and also shifts in occupational groups from area to area. Certain parts of the country have become more industrialized. There has also been a considerable shift in the west coast total population.

TABLE 6
CORRELATIONS OF EACH TEST WITH THE VERBAL, PERFORMANCE AND FULL SCALE SCORES
100 BOYS AND 100 GIRLS AT EACH AGE

                                VERBAL*             PERFORMANCE†           FULL SCALE‡
    Age                     7½    10½   13½       7½    10½   13½       7½    10½   13½
    Verbal Score            --    --    --       .60   .68   .56        --    --    --
    Information            .64   .82   .80       .44   .59   .51       .59   .77   .73
    Comprehension          .49   .70   .68       .46   .56   .37       .54   .69   .58
    Arithmetic             .55   .70   .59       .46   .57   .38       .57   .69   .55
    Similarities           .55   .72   .74       .41   .48   .52       .53   .65   .71
    Vocabulary             .66   .82   .75       .47   .68   .51       .63   .83   .70
    Digit Span             .48   .50   .44       .45   .40   .29       .52   .50   .42

    Performance Score      .60   .68   .56        --    --    --        --    --    --
    Picture Completion     .42   .45   .38       .34   .48   .55       .43   .51   .51
    Picture Arrangement    .51   .58   .43       .51   .53   .51       .58   .62   .53
    Block Design           .42   .55   .50       .53   .66   .65       .52   .64   .64
    Object Assembly        .38   .38   .31       .59   .52   .68       .52   .47   .52
    Coding||               .31   .42   .42       .32   .35   .42       .35   .43   .48
    Mazes                  .31   .43   .40       .51   .55   .39       .46   .53   .44

*Sum of 5 tests, Digit Span omitted.
†Sum of 5 tests, Mazes omitted.
‡Sum of 10 tests, Digit Span and Mazes omitted.
||Coding A at age 7½; Coding B at ages 10½, 13½.

Data collection window appears to have been 1947 to 1948.

The sampling requirements were defined in terms of the occupations of the fathers, and of urban-rural residence of workers, but this does not mean that children across the country are distributed in the same percentages. It is generally known that rural families, laboring families, and perhaps southern families have more children than urban, upper middle class, and northern families. The data for making adjustment in quotas are complicated and incomplete. Modification of the percentages in each occupational category, and in the urban and rural categories to account for differential fecundity did not seem feasible.

INTERCORRELATIONS OF THE TESTS WITH VERBAL, PERFORMANCE AND FULL SCALE SCORES

Table 6 is constructed to show the relationship of each test with the three special scores. When a test is correlated with the composite of which it is also a contributing member (e.g., Vocabulary with the Verbal Score) a correction [2] for spuriousness has been applied. The data are presented for three representative ages. In the manual more complete tables are printed showing the intercorrelations of the tests themselves. Table 7 may be helpful in comprehending the mass of coefficients in Table 6.

TABLE 7
MEDIAN CORRELATION COEFFICIENTS BETWEEN TESTS AND VERBAL, PERFORMANCE AND FULL SCALE SCORES

                                      Verbal    Performance    Full Scale
                                      Score     Score          Score
    Median r of Verbal tests           .67        .46            .61
    Median r of Performance tests      .42        .51            .52

The Verbal tests correlate more highly with the Verbal Score than with the Performance Score, and likewise the Performance tests correlate more highly with the Performance Score than with the Verbal Score. This is as one would expect. It does appear that the Verbal tests are somewhat more homogeneous as to abilities measured.

The correlations between the Verbal Score and the Performance Score are sufficiently high (.60, .68, .56 for the three ages) to indicate considerable common variance, yet are low enough to suggest that the abilities included in V and P cannot be readily inferred from each other. Both classes of abilities need to be tapped in an over-all appraisal of abilities.

These data also indicate that the Digit Span is the least like the other Verbal tests; for this reason it was made an alternate. The Coding and Maze tests are about equally eligible to remain in the Performance Scale, with the preference going to Coding on the basis of ease of scoring and brevity.

RELIABILITY

The reliability coefficients of the individual tests and of the Verbal, Performance and Full Scale Scores are presented in Table 8 for ages 7½, 10½ and 13½. These three ages were selected for presentation as being probably most representative of the age range for which the Wechsler Intelligence Scale for Children is designed. Reliability coefficients have been computed by the split-half technique, with appropriate correction for full length of the test by the Spearman-Brown formula.

TABLE 8
RELIABILITY AND STANDARD ERROR OF MEASUREMENT* OF THE WISC TESTS
N = 200 FOR EACH AGE LEVEL

                                             Age 7½         Age 10½        Age 13½
                                             r     SEm      r     SEm      r     SEm
    Information                             .66   1.75     .80   1.34     .82   1.27
    Comprehension                           .59   1.92     .73   1.56     .71   1.62
    Arithmetic                              .63   1.82     .84   1.20     .77   1.44
    Similarities                            .66   1.75     .81   1.31     .79   1.37
    Vocabulary                              .77   1.44     .91    .90     .90    .95
    Digit Span                              .60   2.46     .59   1.92     .50   2.12
    Verbal Score
      (without Digit Span)                  .88   5.19     .96   3.00     .96   3.00

    Picture Completion                      .59   1.92     .66   1.75     .68   1.70
    Picture Arrangement                     .72   1.59     .71   1.62     .72   1.59
    Block Design                            .84   1.20     .87   1.08     .88   1.04
    Object Assembly                         .63   1.82     .63   1.82     .71   1.62
    Coding†                                 .60   1.90      --     --      --     --
    Mazes                                   .79   1.37     .81   1.31     .75   1.50
    Performance Score
      (without Coding and Mazes)            .86   5.61     .89   4.98     .90   4.74

    Full Scale Score
      (without Digit Span, Coding
      and Mazes)                            .92   4.25     .95   3.36     .94   3.68

*SEm is in Scaled Score units for the tests and in IQ units for the Verbal, Performance and Full Scale Scores.
†Based on correlating Coding A and Coding B, 115 cases. See text for explanation. For age 8½ the value is .56 for 91 cases.

This technique could not legitimately be used for estimating the reliability of the Coding test, which is essentially a speed test; nor did the Digit Span test lend itself to such treatment because of its administration as two separate subtests, Digits Forward and Digits Backward. The reliability coefficients reported for Coding were made possible because, for ages 7½ and 8½, many of the children were given both Coding A and Coding B. (See administrative manual for description of these tests.) The reported values thus are based on an alternate test situation. The coefficients presumably would be a little higher if scores on Coding A were correlated with scores on a strict alternate form. The reliability coefficients shown for the Digit Span test are based on the correlation between scores on Digits Forward and scores on Digits Backward corrected according to the Spearman-Brown formula.

For the composite scores (Verbal, Performance and Full Scale Scores) the sum of the scores on odd items in the contributing tests was correlated with the sum of the scores on even items.
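The split-half procedure described above can be sketched as follows. This is a minimal illustration, assuming item-level scores are available as one list per child; the function names and the toy data are hypothetical and are not taken from the manual. Odd and even items are summed separately, the two half scores are correlated, and the Spearman-Brown formula, r_full = 2r / (1 + r), steps the half-test correlation up to full test length.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of item scores per child (hypothetical layout)."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Toy data only: five children, four pass/fail items each.
toy_items = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
toy_reliability = split_half_reliability(toy_items)
```

The same `spearman_brown` step applies to the Digit Span case in the text, where the two "halves" are the Digits Forward and Digits Backward scores rather than odd and even items.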

The reliability coefficients presented in these tables should be carefully considered by the conscientious clinician when interpreting the scores earned on separate tests, or differences between scores. The smaller the reliability of a given score, the less confidence one can have in the judgments made concerning a child's true ability based on that particular test. Judgments with respect to differences between scores on two tests of moderate reliability must be made with considerable caution: the lower the reliability of the scores, the more likelihood there is that the difference between them is due to chance rather than to any real difference in the abilities possessed by the child. As may be seen by reference to the reliability table, this caution is more necessary for some tests than for others. It is least necessary when working with the composite Verbal, Performance and Full Scale Scores, which are highly reliable.

As another statement of the stability of scores on the Wechsler Intelligence Scale for Children, Table 8 presents the standard error of measurement by test and age. This measure indicates the band of error which surrounds the child's test score. Thus, a SEm of 1.75 for 7½-year-olds on Information indicates that the chances are about two out of three that a true score on this test is within 1.75 points of the obtained scaled score. One can be highly certain that the true score is within 5.25 points of the obtained score (5.25 is three times the SEm of 1.75). Note that there are considerable differences in size of SEm from test to test. For example, confidence in the stability of Block Design for 7½-year-olds is permissible within limits of ±1.20 (chances two out of three) and ±3.60 (high certainty). Obviously, the smaller the SEm, the less allowance one needs to make for unreliability of the score. Differences between Block Design and Vocabulary scores are less likely to be due to chance than are differences between scores on Object Assembly and Comprehension. These facts call for special wariness in attempts to compare differences between test profiles.

The reader of Table 8 should not be confused by the discrepancy between the size of the SEm for the individual tests, as contrasted with the SEm for Verbal, Performance and Full Scale IQ's. For individual tests, the SEm is in scaled score units; for the IQ's the SEm is in IQ units, which are the ones in which most test users are interested. Thus, the SEm of 5.19 for the Verbal IQ of 7½-year-olds indicates that the true IQ is probably (chances two out of three) within 5 points of IQ of the obtained IQ.
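The standard errors discussed above can be reproduced with a short calculation. The paper does not print the formula, so the sketch below assumes the conventional definition, SEm = SD × sqrt(1 − r), with SD = 3 for scaled scores and SD = 15 for IQ's; the function names are illustrative only. Under that assumption the formula returns the values quoted in the text for Information and Block Design at age 7½, and comes within rounding of the 5.19 quoted for the Verbal IQ.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Conventional SEm: score SD times sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def error_bands(obtained_score, sem):
    """Bands used in the text: +/- 1 SEm (about two chances in three)
    and +/- 3 SEm (high certainty)."""
    return {
        "two_in_three": (obtained_score - sem, obtained_score + sem),
        "high_certainty": (obtained_score - 3 * sem, obtained_score + 3 * sem),
    }


# Age 7½ reliabilities quoted from Table 8.
sem_information = standard_error_of_measurement(3, 0.66)   # scaled-score units
sem_block_design = standard_error_of_measurement(3, 0.84)  # scaled-score units
sem_verbal_iq = standard_error_of_measurement(15, 0.88)    # IQ units

# Band around a hypothetical obtained scaled score of 10 on Block Design.
bands = error_bands(10, sem_block_design)
```

Because the SEm scales with the SD of the score, the same reliability yields an SEm five times larger in IQ units than in scaled-score units, which is the discrepancy the preceding paragraph warns about.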

SCALED SCORES

The scaled scores have been so derived as to provide, at each age and for each of the separate tests, a mean scaled score of 10 and a standard deviation of 3. This was accomplished by preparing a cumulative frequency distribution of raw scores for each test at each age level and setting each percentile point at its appropriate standard score value on a theoretical normal curve with a mean of 10 and standard deviation of 3. Scores for all ages on a single test were then listed in parallel columns and minor irregularities in the progression of scaled score equivalents from age to age were smoothed. The assumption that these irregularities were chance results of population sampling seemed to be the only tenable position. Few instances of such minor deviations were found, and the Scales are essentially a direct translation from raw scores to a normalized distribution of scaled scores with a mean of 10 points and standard deviation of 3 points.

The manual presents 33 tables for converting raw scores on each test into scaled scores. The tables for the mid-years (age of testing) were first made. Then tables for each four-month span were constructed by interpolation; thus there are tables for 6-0 through 6-3, 6-4 through 6-7, 6-8 through 6-11, etc. After securing raw scores on each test, the examiner converts them to scaled scores by using the appropriate age table for the subject. These scaled scores, then, are the basis for determining the IQ.
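The normalizing step for a single test at a single age can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the mid-interval percentile convention (cases below a score plus half the cases at that score) and the rounding to whole scaled scores are assumptions, the function name is hypothetical, and the age-to-age smoothing and the four-month interpolation described in the text are not included.

```python
from collections import Counter
from statistics import NormalDist


def scaled_score_table(raw_scores, mean=10.0, sd=3.0):
    """Map each observed raw score to a normalized scaled score.

    Each raw score's percentile rank in the cumulative distribution is
    converted to the matching point on a normal curve with the given
    mean and standard deviation.
    """
    n = len(raw_scores)
    counts = Counter(raw_scores)
    unit_normal = NormalDist()
    table = {}
    cases_below = 0
    for raw in sorted(counts):
        # Mid-interval percentile rank (an assumed convention).
        percentile = (cases_below + counts[raw] / 2) / n
        z = unit_normal.inv_cdf(percentile)
        table[raw] = round(mean + sd * z)
        cases_below += counts[raw]
    return table


# Toy raw scores for one test at one age level (not WISC data).
toy_raw = [1, 2, 2, 3, 3, 3, 4, 4, 5]
toy_table = scaled_score_table(toy_raw)
```

A raw score at the middle of the distribution maps to a scaled score of 10, and scores further from the middle map to values spaced according to the normal curve rather than according to raw-score distance.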

THE DEVIATION INTELLIGENCE QUOTIENT

One of the most important innovations in the standardization of the present Scale is that IQ's are obtained by comparing each subject's test performance not with a composite age group but exclusively with the scores earned by individuals in a single (that is, his or her own) age group.2 With one stroke, the deviation IQ method cuts away much of the underbrush which has encumbered the problem of the variability of the individual's IQ. By keeping the standard deviation of IQ's identical from year to year, a child's obtained IQ does not vary unless his actual test performance as compared with his peers varies; if the standard deviations were not made identical, a child's obtained IQ might vary considerably from year to year, even though his relative ability remained constant. Apart from test unreliabilities, IQ's obtained by successive retests with the WISC automatically give the subject's relative position in the age group to which he belongs at each time of testing. If any changes are observed they may be ascribed to changes in the subject and not in the structure of the test nor its standardization, since in IQ units the standard deviations as well as the means of all age groups are identical. It is no longer a matter of discovering how many children test above or below a given IQ in the population, since the deviation IQ is by definition dependent on the normal distribution of the test scores.

2 The deviation IQ concept has been similarly employed in some group tests, notably the Otis Tests and Pintner General Ability Tests.

Each person tested is assigned an IQ which, at his age, represents his relative intelligence rating. This IQ, and all others similarly obtained, are deviation IQ's since they indicate the amount by which a subject deviates above or below the average performance of individuals of his own age group. The IQ of 100 on the WISC is set equal to the mean total score for each age, and the standard deviation is set equal to 15 IQ points. In terms of percentile limits, the highest one per cent will have IQ's of 135 and above, and the lowest one per cent IQ's of 65 and below. The middle fifty per cent of children at each age will have IQ's from 90 to 110.
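The percentile limits quoted above follow directly from a normal distribution with a mean of 100 and a standard deviation of 15. The short check below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Deviation IQ's are scaled to a normal curve with mean 100 and SD 15.
iq_distribution = NormalDist(mu=100, sigma=15)

top_one_per_cent = iq_distribution.inv_cdf(0.99)      # lower limit of the highest 1%
bottom_one_per_cent = iq_distribution.inv_cdf(0.01)   # upper limit of the lowest 1%
middle_fifty_low = iq_distribution.inv_cdf(0.25)      # 25th percentile
middle_fifty_high = iq_distribution.inv_cdf(0.75)     # 75th percentile
```

Rounded to whole IQ points, the four limits agree with the 135, 65, 90 and 110 given in the text.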

The IQ tables were constructed as follows: For each age the five Verbal scaled scores for each subject were summed and a mean and standard deviation of such sums computed. These sums were transformed into a distribution of IQ's with a mean of 100 and a standard deviation of 15. The same process was followed for translating the sums of the five Performance scaled scores to an IQ scale with a mean of 100 and an SD of 15. The IQ's based on a Full Scale Score of ten tests were similarly determined.
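The construction of an IQ table from sums of scaled scores can be sketched as follows. The text states only that the sums were transformed to a mean of 100 and a standard deviation of 15; the sketch assumes a linear (z-score) transformation and a population standard deviation, both of which are assumptions, and the function name and toy sums are hypothetical.

```python
from statistics import mean, pstdev


def deviation_iq_table(sums_of_scaled_scores, iq_mean=100.0, iq_sd=15.0):
    """Build a lookup from sum of scaled scores to deviation IQ.

    Each sum is expressed as a deviation from the group mean in SD units
    and rescaled to the IQ metric (assumed linear transformation).
    """
    group_mean = mean(sums_of_scaled_scores)
    group_sd = pstdev(sums_of_scaled_scores)
    return {
        total: round(iq_mean + iq_sd * (total - group_mean) / group_sd)
        for total in sorted(set(sums_of_scaled_scores))
    }


# Toy sums of five Verbal scaled scores for three children (not WISC data).
toy_sums = [40, 50, 60]
toy_iq_table = deviation_iq_table(toy_sums)
```

Because every test is scaled to a mean of 10 at every age, the mean sum of five tests should fall near 50 at each age, which is why a single table per scale can serve all ages.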

One set of three IQ tables (V, P, and FS) suffices for all ages since the process of scaling each test for each age resulted in similar means and sigmas of the sums of five or ten tests at all ages. This was not a fortuitous result, but is one that should be expected because each of the tests was standardized for each age so that the raw scores are converted into scaled scores with a mean of 10 and a standard deviation of 3.


INTELLIGENCE QUOTIENTS AND AGE

Having carried on all the standardization processes described above, a test of the statistical transformations from the original responses of the child to the final IQ's is whether the mean IQ's and their standard deviations for all ages approximate 100 and 15, respectively, when each subject's scores are now converted to IQ's. Table 9 shows the data for boys and girls and for the three Scales.

TABLE 9
THE MEAN AND SD OF IQ's ON THE THREE SCALES BY AGE AND SEX
100 BOYS AND 100 GIRLS AT EACH AGE

                        VERBAL          PERFORMANCE       FULL SCALE
    Age*             Mean     SD       Mean     SD       Mean     SD
    Boys
    5                99.5    13.8      98.6    16.9      99.0    16.4
    6               100.6    14.8      98.9    16.0      99.7    16.7
    7                99.6    14.2     100.0    16.2      99.8    14.6
    8               101.7    16.6     102.1    16.4     102.1    16.5
    9                99.2    16.2      99.2    16.9      99.1    17.1
    10              102.1    15.2     101.7    15.3     102.1    16.5
    11              102.0    16.4      99.8    16.4     101.1    16.1
    12              102.4    16.7     101.5    16.8     102.1    18.5
    13              101.9    14.7     100.6    15.2     101.4    14.1
    14              101.6    16.6     100.7    16.1     101.3    15.9
    15              102.4    13.5     100.4    14.6     101.6    13.7
    All             101.2    15.S     100.3    16.6     100.8    15.6

    Girls
    5                99.8    13.4     101.0    13.9     100.4    13.2
    6               100.6    13.9     101.4    13.4     101.0    13.6
    7               100.6    11.7     100.9    12.0     100.7    11.4
    8                97.8    15.8      97.9    15.8      97.6    16.8
    9               100.4    13.0     100.4    14.0     100.4    18.1
    10               98.4    16.7      98.4    13.8      98.2    15.3
    11               97.9    14.5      99.6    13.8      98.6    18.8
    12               97.5    14.7     100.0    14.4      98.5    14.B
    13               98.3    16.2      98.7    15.8      98.3    15.9
    14               97.6    14.3     100.2    16.3      98.7    14.4
    15               97.3    16.4      97.9    15.4      97.4    16.5
    All              98.7    14.7      99.7    14.4      99.1    14.4

    Boys and Girls
    (N = 2200)      100.0    15.1     100.0    16.0     100.0    15.0

*Read these ages as 5½, 6½, etc.

The bottom line of values shows that for all cases (2200 in all) the requirement is exactly met. However, for boys and girls separately and for different ages, small discrepancies occur. The age-to-age discrepancies are caused in part by the fact that to secure one IQ table for all ages small discrepancies in means and standard deviations of scaled scores for the eleven ages were eliminated by averaging. It was assumed that these discrepancies were due to sampling and that smoothing of the data in making scaled scores and the IQ tables was justified.

However, the sex differences are not so easy to explain. Scaled score and IQ tables were made by treating boys and girls as members of one sample. The mean IQ's, however, show that boys in the standardization sample generally were slightly superior to girls. The superiority is primarily in the older ages, and the differences are small. On the Verbal Scale, the boys excel the girls by more than three points at ages 8, and 10 through 15. On the Performance Scale, the difference favors the boys by more than three points at ages 8 and 10, and the girls are ahead at ages 5, 6, 7, and 9. On the Full Scale, boys have higher IQ's than girls by 2.5 to 4.5 points at 7 ages, while girls are ahead by smaller amounts at four ages.

TABLE 10
DISTRIBUTION OF IQ's OF RURAL AND URBAN CHILDREN

               VERBAL        PERFORMANCE      FULL SCALE
IQ           Rural   Urban   Rural   Urban   Rural   Urban
150-154                  1
145-149                  3               1               2
140-144                  7               2               2
135-139          2       8       3       6       3       7
130-134          9      16       2      12       2      16
125-129         13      40      14      46      10      41
120-124         22      78      29      82      27      69
115-119         37     116      62     101      46     117
110-114         67     148      92     180      68     167
100-104        110     211     137     193     122     194
95-99          130     170      93     160     119     194
90-94          116     150     103     172     119     134
85-89           94      96      89     108      68      87
80-84           69      67      35      38      66      55
75-79           31      22      50      42      39      28
70-74           15       7      23      14      25      12
65-69           11       1       6      10       6       6
60-64            6       2       8       3       4       1
55-59            1       1       1               2       1

N              818    1327     818    1327     818    1327
Mean          97.3   103.3    98.5   102.5    97.6   103.2
SD            13.5    13.4    13.8    13.6    13.5    12.9

How shall one interpret these sex differences? Three explanations come to mind:

A. The tests are fair to both boys and girls, and boys actually do excel girls, especially at the later ages.

B. Boys and girls are the same in mental ability, but the chosen test items turned out to be slightly biased in favor of the boys.

C. Again, assuming that general ability is not sex differentiated, the sampling of boys was somehow chosen with a slight bias.

The data at hand do not permit a resolution of these three choices. The safest assumption is that factors described in (B) and (C) are involved. Terman and Merrill [1, 3] found the same situation in their 1937 Revision of the Stanford-Binet examination, and likewise could find no definitive answer from their data.

All in all, the preliminary studies leading to inclusion of test items and the sampling itself were fortunate enough to result in mean IQ's of boys and girls which are essentially equal. For all practical purposes the clinical examiner can ignore sex differences. A difference in mean scores of three points, for example, is really a plus and minus difference of 1½ points from the actual norms based on both sexes.
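The three-point example can be checked with a little arithmetic. The sketch below assumes equal numbers of boys and girls (1100 each, as in the standardization sample) and uses group means of 101.5 and 98.5, which are hypothetical values chosen only because they differ by exactly three points; the function name is likewise illustrative.

```python
def offsets_from_pooled_mean(mean_boys, mean_girls, n_boys, n_girls):
    """Return the pooled mean and each group's offset from it."""
    pooled = (n_boys * mean_boys + n_girls * mean_girls) / (n_boys + n_girls)
    return pooled, mean_boys - pooled, mean_girls - pooled


# Hypothetical means three points apart; 1100 cases per sex as in the sample.
pooled, boys_offset, girls_offset = offsets_from_pooled_mean(101.5, 98.5, 1100, 1100)
print(pooled, boys_offset, girls_offset)  # 100.0 1.5 -1.5
```

With equal group sizes the pooled norm sits midway between the two means, so each sex departs from it by half of the difference.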

INTELLIGENCE QUOTIENTS OF RURAL AND URBAN CHILDREN

Table 10 distributes the IQ's of urban and rural children for all ages, the 55 known feebleminded cases being excluded. As has been found in many researches, urban children score higher on mental tests. The Full Scale difference in the standardization sample is 5.6 points. The Verbal Scale difference is 6 points, and the Performance Scale difference is 4 points. Terman and Merrill report an urban-rural differential of 6.5 points for the 1937 Revision of Stanford-Binet.
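As a rough consistency check, the rural and urban Full Scale means can be recombined with the feebleminded group to see whether the norm of 100 is recovered. The sketch assumes group sizes of 818, 1327, and 55 and Full Scale means of 97.6, 103.2, and 56.6 as read from Tables 10 and 13; the urban mean of 103.2 is the value implied by the 5.6-point difference quoted in the text, and the helper name is hypothetical.

```python
def weighted_mean(groups):
    """Combine (n, mean) pairs into one overall mean."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# (N, Full Scale mean) for each group, as read from Tables 10 and 13.
rural = (818, 97.6)
urban = (1327, 103.2)
feebleminded = (55, 56.6)

urban_minus_rural = round(urban[1] - rural[1], 1)
overall = weighted_mean([rural, urban, feebleminded])
print(urban_minus_rural)   # 5.6
print(round(overall, 1))   # 100.0
```

The three groups together account for all 2200 cases, and their weighted mean comes back to the Full Scale norm within rounding.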

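TABLE 11
DISTRIBUTION OF VERBAL IQ's FOR EACH OCCUPATIONAL GROUP, FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Verbal IQ intervals, highest first: 150-154, 145-149, 140-144, 135-139, 130-134, 125-129, 120-124, 115-119, 110-114, 105-109, 100-104, 95-99, 90-94, 85-89, 80-84, 75-79, 70-74, 65-69, 60-64, 55-59, 50-54, 45-49, 44 and below.

Father's
Occupation*     N    Mean     SD   Frequencies in descending order of IQ interval (empty cells omitted)
1             176   110.9   14.0   1, 4, 3, 6, 19, 19, 20, 20, 20, 29, 17, 8, 6, 2, 1, 1
2             222    96.8   14.6   1, 3, 4, 9, 11, 20, 19, 21, 26, 35, 29, 24, 12, 2, 4, 2
3             256   105.9   12.9   1, 1, 4, 4, 4, 23, 24, 41, 39, 40, 25, 21, 17, 9, 3
4             280   105.2   11.7   1, 2, 6, 7, 14, 26, 42, 44, 48, 45, 24, 11, 6, 2, 2
5             393   100.8   12.6   6, 9, 9, 30, 39, 56, 59, 57, 55, 39, 21, 7, 4, 1, 1
6             363    98.8   12.2   7, 15, 20, 27, 44, 57, 62, 60, 31, 23, 9, 4, 2, 2
7             122    97.6   12.8   1, 1, 5, 6, 5, 17, 19, 16, 23, 11, 12, 2, 3, 1
8             303    94.6   12.8   1, 2, 3, 12, 15, 32, 46, 44, 39, 42, 37, 16, 7, 4, 2, 1
9              30   100.6   16.1   1, 3, 3, 1, 3, 2, 8, 1, 4, 2, 1, 1
FM             55    59.6    8.1   2, 8, 9, 8, 14, 9, 4, 1
All          2200   100.0   15.1   1, 3, 7, 10, 25, 53, 100, 152, 210, 274, 321, 300, 266, 190, 136, 55, 30, 21, 16, 16, 9, 4, 1

*See Code in Table 4.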


INTELLIGENCE QUOTIENTS AND FATHER'S OCCUPATION

Distributions of Verbal, Performance, and Full Scale IQ's are given in Tables 11, 12, and 13 for each of the nine occupational categories, and for the known feebleminded cases. Particular interest attaches to the differences in mean scores of these various groups. Table 14 approximates the data for the Full Scale.

These differences in mean IQ's for occupational groups are considerable but not as great as those reported by Terman and Merrill for the 1937 Revision. The categories are not exactly comparable. Group A corresponds fairly closely to Terman and Merrill's groups I and II, for which they give eight mean IQ's for different age groupings, the median of these being about 115 as compared with 110 from the Wechsler data. For several age groupings in their category IV, rural owners, the median of the mean IQ's is about 94 as compared with 97 for Group E. Their group VII and the Wechsler group F are comparable, and yield mean IQ's of about 97 (median for mean IQ's of four ages) and 94, respectively.

For the lower socioeconomic groups, the two tests yield similar mean IQ's of the order of 95. The somewhat higher mean IQ of Terman and Merrill's higher sociometric groups, 115 vs. 110, may be accounted for, in part, by the greater verbal loading in the Stanford-Binet tests. For WISC, it is noted in Tables 11 and 12 that socioeconomic differences are greater on the Verbal Scale than on the Performance Scale.

Whatever other social implications one may consider, these mean differences between groups should not be allowed to overshadow the fact of great overlap in the distributions of IQ's for the various occupational groups.

TABLE 12
DISTRIBUTION OF PERFORMANCE IQ's FOR EACH OCCUPATIONAL GROUP, FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Performance IQ intervals, highest first: 150-154, 145-149, 140-144, 135-139, 130-134, 125-129, 120-124, 115-119, 110-114, 105-109, 100-104, 95-99, 90-94, 85-89, 80-84, 75-79, 70-74, 65-69, 60-64, 55-59, 50-54, 45-49, 44 and below.

Father's
Occupation      N    Mean     SD   Frequencies in descending order of IQ interval (empty cells omitted)
1             176   107.8   13.4   1, 1, 3, 16, 13, 22, 31, 21, 20, 18, 15, 8, 1, 4, 1, 1
2             222    98.6   13.9   1, 1, 2, 11, 12, 26, 21, 41, 22, 24, 23, 11, 17, 8, 1, 1
3             256   105.3   12.9   4, 3, 8, 17, 26, 40, 43, 32, 30, 24, 15, 6, 6, 1, 1
4             280   104.3   12.2   1, 1, 3, 5, 19, 26, 48, 38, 45, 34, 29, 18, 8, 3, 1, 1
5             393   101.6   13.0   1, 3, 10, 17, 27, 60, 45, 69, 44, 48, 36, 14, 10, 4, 2, 3
6             363    99.5   13.5   1, 1, 12, 21, 20, 30, 36, 57, 47, 57, 37, 16, 17, 8, 2, 1
7             122    96.9   13.7   4, 3, 7, 7, 12, 17, 15, 27, 10, 5, 8, 3, 4
8             303    94.9   13.3   2, 9, 11, 24, 22, 43, 42, 47, 47, 11, 28, 8, 3, 5, 1
9              30    98.3   13.6   1, 2, 6, 2, 6, 1, 4, 3, 1, 2, 1, 1
FM             55    61.6   12.2   1, 2, 1, 3, 5, 9, 11, 7, 7, 6, 3
All          2200   100.0   15.0   1, 2, 8, 14, 59, 111, 153, 272, 240, 330, 254, 275, 199, 74, 95, 42, 25, 22, 8, 7, 6, 3

THE FEEBLEMINDED

Wechsler classes IQ's under 70 as evidence of feeblemindedness. On the basis of a mean of 100 and a standard deviation of 15, 2.2 per cent of cases should have IQ's below 70. This is a statistical definition; it says that arbitrarily 2.2 per cent of the cases are feebleminded. The clinical and social significance of feeblemindedness is not defined by these numbers except that if this percentage is very far away from the proportion who are actually classified as feebleminded in clinical practice, the statistical concept will have to be changed. Roughly, the test author considers that about 3 per cent of the population could well be classified as feebleminded by a test; this seems to be reasonable in the light of actual practice.
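The share of cases expected below an IQ of 70 can be approximated from the normal curve. The sketch below assumes that IQ's follow a continuous normal distribution with mean 100 and standard deviation 15; the function name is illustrative, and only the Python standard library is used.

```python
import math


def normal_cdf(x, mean=100.0, sd=15.0):
    """Proportion of a normal distribution falling below x."""
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


share_below_70 = normal_cdf(70)           # two standard deviations below the mean
expected_cases = share_below_70 * 2200    # in a sample the size of the WISC sample
print(round(100 * share_below_70, 1))     # 2.3
print(round(expected_cases))              # 50
```

Under this assumption a little over 2 per cent of cases, or about 50 of 2200, fall below 70, which is in line with the figure of roughly 2.2 per cent quoted above.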

In the standardization it was decided to let about 2.5 per cent of the sample be made up of known feebleminded cases. The resultant distribution of IQ's of these 55 cases is shown in the columns headed FM in Tables 11, 12, and 13. Some of these 55 institutionalized and school-identified feebleminded cases tested above 70. On the V and P Scales, 10 and 12 cases, respectively, tested over 70; but when V and P scores were made into Full Scale scores for these subjects, only 4 showed up as having IQ's over 70 on the Full Scale.

Some children in the general sample tested below the level set for feeblemindedness; in the Full Scale the number is 19, which, added to the 51 cases (55 − 4), equals 70 cases, or 3.2 per cent of the total sample.
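The count above can be retraced step by step. The variable names in this short sketch are illustrative; the figures are those given in the text.

```python
known_feebleminded = 55      # cases entered in the sample as known feebleminded
fm_testing_above_70 = 4      # of those, Full Scale IQ over 70
general_below_70 = 19        # general-sample children with Full Scale IQ below 70
total_sample = 2200

below_70 = general_below_70 + (known_feebleminded - fm_testing_above_70)
per_cent = 100 * below_70 / total_sample
print(below_70, round(per_cent, 1))  # 70 3.2
```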

In making the IQ scales on a deviation basis it is possible to assign IQ's as low as the sum of the scaled scores obtainable. It was felt that IQ's under 45 would not be discriminatively meaningful, so the IQ tables stop at that point. Persons who have scaled scores yielding IQ's below 45 can be recorded as "44 or below."

TABLE 13
DISTRIBUTION OF FULL SCALE IQ's FOR EACH OCCUPATIONAL GROUP, FOR THE FEEBLEMINDED GROUP, AND FOR ALL CASES

Full Scale IQ intervals, highest first: 150-154, 145-149, 140-144, 135-139, 130-134, 125-129, 120-124, 115-119, 110-114, 105-109, 100-104, 95-99, 90-94, 85-89, 80-84, 75-79, 70-74, 65-69, 60-64, 55-59, 50-54, 45-49, 44 and below.

Father's
Occupation      N    Mean     SD   Frequencies in descending order of IQ interval (empty cells omitted)
1             176   110.3   13.3   1, 2, 7, 15, 23, 23, 26, 18, 24, 18, 8, 6, 2, 1, 1, 1
2             222    97.4   14.0   2, 1, 2, 8, 17, 14, 24, 27, 28, 34, 21, 23, 12, 7, 2
3             256   106.2   12.4   1, 2, 4, 12, 16, 26, 43, 40, 39, 37, 11, 9, 10, 6
4             280   105.2   11.1   1, 1, 2, 9, 15, 27, 35, 61, 47, 35, 29, 9, 6, 2, 1
5             393   101.3   12.4   1, 3, 5, 14, 30, 48, 63, 61, 59, 39, 31, 23, 10, 4, 1, 1
6             363    99.1   12.2   1, 7, 11, 21, 28, 46, 62, 58, 58, 32, 16, 13, 7, 3
7             122    97.0   12.5   1, 1, 8, 12, 14, 11, 22, 22, 11, 12, 1, 7
8             303    94.2   12.8   1, 1, 6, 8, 16, 28, 43, 51, 49, 34, 26, 21, 10, 7, 2
9              30    99.5   15.2   1, 2, 3, 3, 3, 2, 5, 3, 2, 3, 1, 1, 1
FM             55    56.6    9.5   1, 2, 1, 8, 6, 14, 11, 6, 6
All          2200   100.0   15.0   2, 2, 10, 18, 51, 96, 163, 225, 297, 316, 313, 253, 155, 122, 69, 38, 19, 11, 17, 11, 6, 6

TABLE 14
APPROXIMATE WECHSLER IQ's FOR OCCUPATIONAL CATEGORIES

                                                          Approximate
Category                                                  Wechsler IQ's
A. (1) Professional and semi-professional workers             110
B. (3) Proprietors, managers and officials, and
   (4) Clerical, sales and kindred workers                    105+
C. (5) Craftsmen, foremen and kindred workers, and
   (6) Operatives and kindred workers                         100
D. (7) Domestic, protective and other services workers         97
E. (2) Farmers and farm managers                               97
F. (8) Farm laborers and foremen, and laborers                 94

THE SUPERIOR

No attempt was made to isolate a unique group of very superior children as was done with the feebleminded. It was assumed that the superior children were in the school systems and would enter the sample in proper sampling proportions. On the Full Scale, 1.5 per cent tested above an IQ of 130, whereas 2.2 per cent were expected. On the Verbal Scale the percentage was 2.1 per cent, and for the Performance Scale, 1.1 per cent. In the published IQ tables the highest IQ assigned is 156, but children whose scaled scores are higher than those required to attain this IQ can be recorded as being 156 or above. Differentiation above this point probably is not necessary.
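A deviation IQ is a linear rescaling of a child's standing relative to the mean, so the spread of reported IQ's depends on the standard deviation chosen for the scale. The sketch below assumes a normal distribution of standings; the function names are illustrative, and the standard deviation of 20 is included only as a contrast to the 15 actually adopted.

```python
import math


def deviation_iq(z, mean=100.0, sd=15.0):
    """IQ assigned to a standing of z standard deviations from the mean."""
    return mean + sd * z


def share_above(iq, mean=100.0, sd=15.0):
    """Proportion of a normal distribution falling above the given IQ."""
    z = (iq - mean) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))


# Expected share above 130, to set beside the observed 1.5, 2.1, and 1.1 per cent.
print(round(100 * share_above(130), 1))   # 2.3

# The same standings expressed on scales with SD 15 and SD 20.
for z in (-3.0, 3.0):
    print(z, deviation_iq(z, sd=15.0), deviation_iq(z, sd=20.0))
# -3.0 55.0 40.0
# 3.0 145.0 160.0
```

The child's relative standing is identical in both columns; only the numbers attached to it change, which is why the range of IQ's on a deviation scale is a matter of convention.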

To examiners who have been accustomed to secure IQ's on the order of 20 to 25 or 170 to 180, lack of very low and very high IQ's on WISC may at first be a little disturbing. They should be reminded that the range of IQ's on a deviation scale can be quite arbitrary. For example, if the mean were set at 100 and the standard deviation at 20 (instead of 15), lower and higher IQ's would be secured arbitrarily. The reason for setting the standard deviation at 15 is that it approximates the empirical standard deviation of about 16 secured by Terman and Merrill by an age-scale method. With standard deviations so similar, WISC will approximate in meaning (as far as size of the number is concerned) the IQ's secured by the Stanford-Binet Revision.

Received December 6, 1949.
Early publication.

REFERENCES

1. McNEMAR, Q. The revision of the Stanford-Binet scale. Boston: Houghton Mifflin, 1942.
2. McNEMAR, Q. Psychological statistics. New York: Wiley, 1949.
3. TERMAN, L. M., AND MERRILL, MAUDE A. Measuring intelligence. Boston: Houghton Mifflin, 1937.
4. WECHSLER, D. The measurement of adult intelligence. (3rd Ed.) Baltimore: Williams & Wilkins, 1944.