IQ Tests: Their History, Use, Validity, and Intelligent ......CHAPTER 1 IQ TESTS: THEIR HISTORY,...

23
C HAPTER 1 1 C HAPTER 1 IQ Tests: Their History, Use, Validity, and Intelligent Interpretation The field of intelligence, particularly of adoles- cent and adult mental development, has domi- nated the psychological literature for decades, and now encompasses a diversity of domains within cognitive psychology, clinical psychology, psychobiology, behavioral genetics, education, school psychology, sociology, neuropsychology, and everyday life. Excellent handbooks are avail- able with chapters written by experts in many as- pects of intellectual theory, measurement, and development (e.g., Flanagan, Genshaft, & Harri- son, 1997; Groth-Marnat, 2000), and even these texts cover only a portion of the territory and quickly become outdated. Consequently, in writ- ing this text on the assessment of adolescent and adult intelligence, we have had to make several decisions about which areas to include and how thoroughly to cover each topic. First, this book focuses on the clinical assess- ment of intelligence, and every topic must bear, either directly or indirectly, on the clinical aspect of mental measurement. Because clinical assess- ment within the fields of neuropsychology, special education, and clinical, school, and counseling psychology involves individual evaluations, re- search on group-administered tests is subordi- nated to the more pertinent research on individual intelligence tests. The 1990 version of this text covered group-administered intelligence tests to some extent. However, the adolescent and adult assessment scene has changed during this past decade, with clinicians having options beyond Wechsler’s tests. Whereas the Wechsler Adult In- telligence Scale—Third Edition (WAIS-III; Psy- chological Corporation, 1997; Wechsler, 1997) is still the most used test, and is clearly the featured instrument in this revised text, the availability of a variety of new in-depth and brief intelligence tests, and a proliferation of research on these in- struments, has impelled us to focus on individually administered intelligence tests. For example, the monumental efforts of Schaie (1958, 1983b, 1994) and his colleagues (Hertzog & Schaie, 1988; Schaie & Labouvie- Vief, 1974; Schaie & Strother, 1968; Schaie & Willis, 1993) to understand the development of ch01.fm Page 1 Tuesday, September 25, 2001 2:56 PM COPYRIGHTED MATERIAL

Transcript of IQ Tests: Their History, Use, Validity, and Intelligent ......CHAPTER 1 IQ TESTS: THEIR HISTORY,...

  • C H A P T E R 1

    1

    C H A P T E R 1

    IQ Tests: Their History,Use, Validity, and

    Intelligent Interpretation

    The field of intelligence, particularly of adoles-cent and adult mental development, has domi-nated the psychological literature for decades,and now encompasses a diversity of domainswithin cognitive psychology, clinical psychology,psychobiology, behavioral genetics, education,school psychology, sociology, neuropsychology,and everyday life. Excellent handbooks are avail-able with chapters written by experts in many as-pects of intellectual theory, measurement, anddevelopment (e.g., Flanagan, Genshaft, & Harri-son, 1997; Groth-Marnat, 2000), and even thesetexts cover only a portion of the territory andquickly become outdated. Consequently, in writ-ing this text on the assessment of adolescent andadult intelligence, we have had to make severaldecisions about which areas to include and howthoroughly to cover each topic.

    First, this book focuses on the clinical assess-ment of intelligence, and every topic must bear,either directly or indirectly, on the clinical aspectof mental measurement. Because clinical assess-ment within the fields of neuropsychology, special

    education, and clinical, school, and counselingpsychology involves individual evaluations, re-search on group-administered tests is subordi-nated to the more pertinent research on individualintelligence tests. The 1990 version of this textcovered group-administered intelligence tests tosome extent. However, the adolescent and adultassessment scene has changed during this pastdecade, with clinicians having options beyondWechsler’s tests. Whereas the Wechsler Adult In-telligence Scale—Third Edition (WAIS-III; Psy-chological Corporation, 1997; Wechsler, 1997) isstill the most used test, and is clearly the featuredinstrument in this revised text, the availability of avariety of new in-depth and brief intelligencetests, and a proliferation of research on these in-struments, has impelled us to focus on individuallyadministered intelligence tests.

    For example, the monumental efforts ofSchaie (1958, 1983b, 1994) and his colleagues(Hertzog & Schaie, 1988; Schaie & Labouvie-Vief, 1974; Schaie & Strother, 1968; Schaie &Willis, 1993) to understand the development of

    ch01.fm Page 1 Tuesday, September 25, 2001 2:56 PM

    COPY

    RIGH

    TED

    MAT

    ERIA

    L

  • 2 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    adult intelligence have been based on the group-administered Primary Mental Abilities Test. Thekey findings from these innovative cross-sequentialstudies are of interest to psychology in general,but have limited applicability to the work of clin-ical and neuropsychological practitioners. Con-sequently, investigations by Schaie will only bediscussed in the context of aging studies on clin-ical instruments (e.g., Kaufman, 2000b, 2001;Kaufman & Horn, 1996), especially the WAIS-III,WAIS-R (Wechsler, 1981) and Kaufman Adoles-cent and Adult Intelligence Test (KAIT; Kauf-man & Kaufman, 1993).

    Consistent with the focus on clinical tests of in-telligence, we have also eliminated sections andchapters from the first edition on clinical toolsthat are only tangentially related to IQ assess-ment, most notably neuropsychological instru-ments, adaptive behavior surveys, and individualachievement tests.

    OUTLINE OF THE BOOK

    Assessing Adolescent and Adult Intelligence (2nd ed.)has five parts:

    I. Introduction to the Assessment of Adoles-cent and Adult Intelligence (Chapters 1–3)

    II. Individual Differences on Age, Socioeco-nomic Status, and Other Key Variables(Chapters 4–5)

    III. Integration and Application of WAIS-IIIResearch (Chapters 6–9)

    IV. Interpretation of the WAIS-III Profile: IQs,Factor Indexes, and Subtest Scaled Scores(Chapters 10–12)

    V. Additional Measures of Adolescent andAdult IQ (Chapters 13–15)

    Part I includes: Chapter 1, which discussespertinent historical information, issues regardingvalidation of the IQ construct, and our philoso-phy of intelligent testing; Chapter 2, which dis-cusses pressing issues and challenges to the IQconcept (e.g., heritability and malleability of the

    IQ); and Chapter 3, which provides the rationalefor the WAIS-III subtests for adolescents andadults and traces the empirical and logical conti-nuity from the Wechsler-Bellevue to the WAISto the WAIS-R and to the WAIS-III.

    Part II presents research on individual differ-ences in intelligence associated with pertinentbackground variables on the WAIS-III and otherinstruments, notably gender, ethnicity, socioeco-nomic status, and urban–rural residence (all treatedin Chapter 4), and aging across the adult lifespan(Chapter 5).

    Parts III and IV are devoted to the WAIS-IIIand, occasionally, its predecessors (e.g., WAIS-R)or “alternate-form” at age 16 (WISC-III). In PartIII, the focus is on research, delving into topicssuch as administration and scoring (Chapter 6),factor analysis (Chapter 7), and Verbal Perfor-mance (V-P) IQ differences, especially as they per-tain to lateralized brain lesions (Chapter 8) andother clinical disorders (Chapter 9). The threechapters of Part IV (Chapters 10, 11, and 12) are alldevoted to an empirical and clinical approach tointerpretation of the WAIS-III multiscore profile.

    Part V is composed of three chapters; each fo-cuses exclusively on additional (non-Wechsler)measures for adolescent and adult assessmentand integrates them with the WAIS-III: theKAIT (Chapter 13), the Woodcock-Johnson—Third Edition or WJ III (Chapter 14, authoredby McGrew, Woodcock, and Ford), and a varietyof brief tests of intelligence (Chapter 15). Thetests discussed in the latter chapter, for example,the Peabody Picture Vocabulary Test—ThirdEdition (PPVT-III), the Kaufman Brief Intelli-gence Test (K-BIT), and the Wechsler Abbrevi-ated Scale of Intelligence (WASI), may be usedas supplements to the WAIS-III, KAIT, or WJ III,or may be used instead of comprehensive intelli-gence tests in certain circumstances (e.g., screen-ing or research purposes).

    The discussion of non-Wechsler tests in PartIV is essential to round out the cognitive assess-ment scene, but the WAIS-III, like the WAIS-R,WAIS, and Wechsler-Bellevue before it, remainsthe key tool for clinical and neuropsychologicalevaluation of adolescents and adults and, hence,

    ch01.fm Page 2 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 3

    the focus of all sections of the book. The chap-ters on clinical applications of intelligence tests,along with the previous parts of the book, placethe focus of this text squarely on the WAIS-III.

    Wechsler’s ScalesEven a casual observer of the clinical or neuro-psychological assessment scene is aware thatWechsler’s scales are uncontested as the primarycognitive measures of adolescent and adult intelli-gence. Individuals in their teens and adults of allages are invariably administered the Wechsler In-telligence Scale for Children—Third Edition(WISC-III; Wechsler, 1991) or the WAIS-IIIwhen they are referred to a competent profes-sional for a thorough assessment of their intel-lectual abilities, usually as part of a clinical,vocational, neuropsychological, or psychoeduca-tional evaluation. The WISC-III is used for ado-lescents as old as 16 years, while the WAIS-III isused for individuals aged 16 to 89. Therefore,they overlap at age 16, giving clinicians a choiceof Wechsler test for that age group.

    Using the WISC-III as a clinical and psycho-metric tool has been discussed elsewhere in acomprehensive text (Kaufman, 1994a). For practi-cal purposes, then, this book is primarily devotedto the WAIS-III, child of the WAIS-R (Wechsler,1981), grandchild of the WAIS (Wechsler, 1955),and great-grandchild of the Wechsler-BellevueForm I (Wechsler, 1939).

    Clinical Relevance of TheoryTo be included in this book in any depth, a topicneeds to contribute to a psychologist’s understand-ing of intelligence in the clinical arena, not in thelaboratory. For example, the Cattell-Horn-Carroll(CHC; McGrew & Flanagan, 1998) theory—anamalgam of Horn’s (1989) expansion of Horn-Cattell Gf-Gc theory and Carroll’s (1993, 1997)model of intelligence—is treated throughout thebook because it is instrumental in explainingchanges in verbal and nonverbal abilities withadvancing age, and it (or Horn-Cattell theory)underlies three tests of adolescent and adult in-

    telligence: the Woodcock Johnson Psycho-Educa-tional Battery—Third Edition (WJ III; Woodcock,McGrew, & Mather, 2000), the Stanford-BinetIntelligence Scale, Form IV (Thorndike, Hagen,& Sattler, 1986a), and the KAIT (Kaufman &Kaufman, 1993). In contrast, Sternberg’s (1985)three-pronged triarchic theory of intelligence,though popular and widely discussed, is not em-phasized because of its limited application toclinical assessment and the interpretation of theWAIS-III and other individual intelligence tests.Currently the Sternberg Triarchic Abilities Test(Sternberg, 1993), a group-administered mea-sure, is available as an unpublished research in-strument available from its author. However, if itis ever adapted as an individually administered,commercially published, standardized measurethat translates laboratory principles to the domainof the clinical psychologist, neuroclinician, andpsychoeducational diagnostician, the theory maybecome even more popular.

    In addition, other theories of intelligencesuch as Gardner’s (1993a, 1993b) theory of mul-tiple intelligences—which defines intelligence asthe ability to solve problems, or to create prod-ucts, that are valued within one or more culturalsettings—is also not emphasized in this book.The theory of multiple intelligences calls formeasuring intelligences by asking individuals tosolve problems in the contexts in which they nat-urally occur. Although the multiple intelligencestheory has attracted much attention in the fieldsof cognition and education (Kornhaber & Kre-chevsky, 1995), thus far its practical application toclinical assessment and the interpretation of theWAIS-III and other major standardized individ-ual intelligence tests is limited.

    A SHORT HISTORY OF IQ TESTS

    The history of intellectual assessment is largely ahistory of the measurement of the intelligence ofchildren or retarded adults. Sir Francis Galton(1869, 1883) studied adults and was interested in

    ch01.fm Page 3 Tuesday, September 25, 2001 2:56 PM

  • 4 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    giftedness when he developed what is often con-sidered the first comprehensive individual test ofintelligence (Kaufman, 2000a). But despite Gal-ton’s role as the father of the testing movement(Shouksmith, 1970), he did not succeed in con-structing a true intelligence test. His measures ofsimple reaction time, strength of squeeze, orkeenness of sight proved to assess sensory andmotor abilities, skills that relate poorly to mentalability, and that are far removed from the type oftasks that constitute contemporary intelligencetests.

    The Binet-Simon ScalesAlfred Binet and his colleagues (Binet & Henri,1895; Binet & Simon, 1905, 1908) developed thetasks that survive to the present day in most testsof intelligence for children and adults. Binet(1890a, 1890b) mainly studied children; begin-ning with systematic developmental observationsof his two young daughters, Madeleine and Alice,he concluded that simple tasks like those used byGalton did not discriminate between children andadults. In 1904, the Minister of Public Instructionin Paris appointed Binet to a committee to find away to distinguish normal from retarded children.But 15 years of qualitative and quantitative inves-tigation of individual differences in children—along with considerable theorizing about mentalorganization and the development of a specific setof complex, high-level tests to investigate thesedifferences—preceded the “sudden” emergenceof the landmark 1905 Binet-Simon intelligencescale (Murphy, 1968).

    The 1908 scale was the first to include agelevels, spanning the range from III to XIII. Thisimportant modification stemmed from Binet andSimon’s unexpected discovery that their 1905scale was useful for much more than classifying achild at one of the three levels of retardation:moron, imbecile, idiot (Matarazzo, 1972). As-sessment of older adolescents and adults, how-ever, was not built into the Binet-Simon systemuntil the 1911 revision. That scale was extendedto age level XV and included five ungraded adult

    tests (Kite, 1916). This extension was not con-ducted with the rigor that characterized the con-struction of tests for children, and the primaryapplications of the scale were for use with school-age children (Binet, 1911).

    Measuring the intelligence of adults, exceptthose known to be mentally retarded, was almostan afterthought. But the increased applicability ofthe Binet-Simon tests for various child-assessmentpurposes dawned on Binet just prior to his un-timely death in 1911: “By 1911 Binet began toforesee numerous uses for his method in childdevelopment, in education, in medicine, and inlongitudinal studies predicting different occupa-tional histories for children of different intellec-tual potential” (Matarazzo, 1972, p. 42).

    Terman’s Stanford-BinetLewis Terman was one of several people in theUnited States who translated and adapted theBinet-Simon scale for use in the United States,publishing a “tentative” revision (Terman &Childs, 1912) 4 years before releasing his painstak-ingly developed and carefully standardized Stan-ford Revision and Extension of the Binet-SimonIntelligence Scale (Terman, 1916). This land-mark test, soon known simply as the Stanford-Binet, squashed competing tests developed earlierby Goddard, Kuhlmann, Wallin, and Yerkes. Ter-man’s success was undoubtedly due in part toheeding the advice of practitioners whose de-mand “for more and more accurate diagnoses...raised the whole question of the accurate plac-ing of tests in the scale and the accurate evalua-tion of the responses made by the child” (Pintner& Patterson, 1925, p. 11).

    But, like Binet, Terman (1916) saw intelligencetests useful primarily for the detection of mentaldeficiency or superiority in children and for theidentification of “feeblemindedness” in adults.He cited numerous studies of delinquent adoles-cents and adult criminals, all of which pointed tothe high percentage of mentally deficient juveniledelinquents, prisoners, or prostitutes, and con-cluded that “there is no investigator who denies

    ch01.fm Page 4 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 5

    the fearful role played by mental deficiency in theproduction of vice, crime, and delinquency” (p. 9).Terman also saw the potential for using intelli-gence tests with adults for determining “vocationalfitness,” but, again, he emphasized employing “apsychologist...to weed out the unfit” or to “deter-mine the minimum ‘intelligence quotient’ neces-sary for success in each leading occupation”(p. 17).

    Perhaps because of this emphasis on the as-sessment of children or concern with the lowerend of the intelligence distribution, Terman(1916) did not use a rigorous methodology forconstructing his adult-level tasks. Tests belowthe 14-year level were administered to a fairlyrepresentative sample of about 1,000 childrenand early adolescents. To extend the scale abovethat level, data were obtained from 30 business-men, 50 high school students, 150 adolescentdelinquents, and 150 migrating unemployedmen. Based on a frequency distribution of themental ages of a mere 62 adults (the 30 business-men and 32 of the high school students aboveage 16), Terman partitioned the graph into thefollowing MA categories: 13–15 (inferior adults),15–17 (average adults), and above 17 (superioradults).

    The World War I TestsThe infant field of adult assessment grew rapidlywith the onset of World War I, particularly afterU.S. entry into the war in 1917 (Anastasi & Ur-bina, 1997; Vane & Motta, 1984). Psychologistssaw with increasing clarity the applications of in-telligence tests for selecting officers and placingenlisted men in different types of service, apartfrom their generation-old use for identifying thementally unfit. Under the leadership of RobertYerkes and the American Psychological Associa-tion, the most innovative psychologists of theday helped translate Binet’s tests to a group for-mat. Arthur Otis, Terman’s student, was instru-mental in leading the creative team thatdeveloped the Army Alpha, essentially a group-administered Stanford-Binet, and the Army

    Beta, a novel group test composed of nonverbaltasks.

    Yerkes (1917) opposed Binet’s age-scale ap-proach and favored a point-scale methodology,one that advocates selection of tests of specified,important functions rather than a set of tasksthat fluctuates greatly with age level and devel-opmental stage. The Army group tests reflect ablend of Yerkes’s point-scale approach and Bi-net’s notions of the kind of skills that should bemeasured when assessing mental ability. TheArmy Alpha included the Binet-like tests ofDirections or Commands, Practical Judgment,Arithmetical Problems, Synonym-Antonym, Dis-sarranged Sentences, Analogies, and Information.Even the Army Beta had subtests resemblingStanford-Binet tasks: Maze, Cube Analysis, Picto-rial Completion, and Geometrical Construction.The Beta also included novel measures like DigitSymbol, Number Checking, and X-O Series(Yoakum & Yerkes, 1920).

    Never before or since have tests been normedand validated on samples so large; 1,726,966 menwere tested (Vane & Motta, 1984)! Point-scoreson the Army Alpha or Army Beta were convertedto letter grades ranging from A to D- (the Betawas given only to illiterate and non-English-speaking candidates). Validity was demonstratedby examining the percent of A’s obtained by a va-riety of Army ranks, for example, recruits (7.4%),corporals (16.1%), sergeants (24.0%), and majors(64.4%). In perhaps the first empirical demon-stration of the Peter Principle in action, secondlieutenants (59.4% A’s) outperformed their directsuperiors—first lieutenants (51.7%) and captains(53.4%)—while those with ranks above majorperformed slightly worse than majors (Yoakum& Yerkes, 1920, Table 1). Can there be any morecompelling affirmation of the validity of the Armyintelligence tests? Another intelligence scale wasdeveloped during the war, one that became analternative for those who could not be tested val-idly by either the Alpha or Beta. This was theArmy Performance Scale Examination, composedof tasks that would become the tools-of-trade forclinical psychologists, school psychologists, and

    ch01.fm Page 5 Tuesday, September 25, 2001 2:56 PM

  • 6 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    neuropsychologists into the twenty-first century:Picture Completion, Picture Arrangement, DigitSymbol, and Manikin and Feature Profile (Ob-ject Assembly). Except for Block Design (de-veloped by Kohs in 1923), Wechsler’s influentialPerformance Scale was added to the Army bat-tery, “[t]o prove conclusively that a man wasweakminded and not merely indifferent or ma-lingering” (Yoakum & Yerkes, 1920, p. 10).

    Wechsler’s CreativityDavid Wechsler assembled a test battery in themid-1930s that comprised subtests developedprimarily by Binet and World War I psycholo-gists. His Verbal Scale was essentially a Yerkespoint-scale adaptation of Stanford-Binet tasks; hisPerformance Scale, like other similar nonverbalbatteries of the 1920s and 1930s (Cornell & Coxe,1934; Pintner & Patterson, 1925), was a near rep-lica of the tasks and items making up the indi-vidually administered Army Performance ScaleExamination.

    In essence, Wechsler took advantage of tasksdeveloped by others for nonclinical purposes todevelop a clinical test battery. He paired verbaltests that were fine-tuned to discriminate amongchildren of different ages with nonverbal teststhat were created for adult males who had flunkedboth the Alpha and Beta exams—nonverbal teststhat were intended to distinguish between thenonmotivated and the hopelessly deficient. LikeTerman, Wechsler had the same access to theavailable tests as did other psychologists; likeTerman and Binet before him, Wechsler suc-ceeded because he was a visionary, a man able toanticipate the needs of practitioners in the field.

    While others hoped intelligence tests wouldbe psychometric tools to subdivide retarded in-dividuals into whatever number of categories wascurrently in vogue, Wechsler saw the tests as dy-namic clinical instruments. While others lookedconcretely at intelligence tests as predictors ofschool success or guides to occupational choice,Wechsler looked abstractly at the tests as a mir-

    ror to the hidden personality. With the GreatWar over, many psychologists returned to a fo-cus on IQ testing as a means of childhood assess-ment; Wechsler (1939), however, developed thefirst form of the Wechsler-Bellevue exclusivelyfor adolescents and adults.

    Most psychologists saw little need for nonver-bal tests when assessing English-speaking indi-viduals other than illiterates. How could it beworth 2 or 3 minutes to administer a single puz-zle or block-design item when 10 or 15 verbalitems can be given in the same time? Some testdevelopers (e.g., Cornell & Coxe, 1934) felt thatPerformance scales might be useful for normal,English-speaking people to provide “more var-ied situations than are provided by verbal tests”(p. 9), and to “test the hypothesis that there is agroup factor underlying general concrete ability,which is of importance in the concept of generalintelligence” (p. 10).

    Wechsler was less inclined to wait a gen-eration for data to accumulate. He followed hisclinical instincts and not only advocated the ad-ministration of a standard battery of nonverbaltests to everyone but placed the PerformanceScale on an equal footing with the more re-spected Verbal Scale. Both scales would consti-tute a complete Wechsler-Bellevue battery, andeach would contribute equally to the overall in-telligence score.

    Wechsler also had the courage to challengethe Stanford-Binet monopoly, a boldness not un-like Binet’s when the French scientist created hisown forum (the journal L’Année Psychologique) tochallenge the preferred but simplistic Galtonsensorimotor approach to intelligence (Kauf-man, 2000a). Wechsler met the same type of re-sistance as Binet, who had had to wait until theFrench Ministry of Public Instruction “pub-lished” his Binet-Simon Scale. When Wechsler’sinitial efforts to find a publisher for his two-pronged intelligence test met failure, he had nocabinet minister to turn to, so he took mattersinto his own hands. With a small team of col-leagues, he standardized Form I of the Wechsler-Bellevue by himself. Realizing that stratification

    ch01.fm Page 6 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 7

    on socioeconomic background was more crucialthan obtaining regional representation, he man-aged to secure a well-stratified sample fromBrooklyn, New York.

    The Psychological Corporation agreed to pub-lish Wechsler’s battery once it had been stan-dardized, and the rest is history. Although analternative form of the Wechsler-Bellevue (Wech-sler, 1946) was no more successful than Termanand Merrill’s (1937) ill-fated Form M, a subse-quent downward extension of Form II of theWechsler-Bellevue (to cover the age range 5 to15 instead of 10 to 59) produced the wildly suc-cessful WISC (Wechsler, 1949). Although theWechsler scales did not initially surpass theStanford-Binet in popularity, serving an appren-ticeship to the master in the 1940s and 1950s,the WISC and the subsequent revision of theWechsler-Bellevue, Form I (WAIS; Wechsler,1955) triumphed in the 1960s. “With the in-creasing stress on the psychoeducational assess-ment of learning disabilities in the 1960s, and onneuropsychological evaluation in the 1970s, theVerbal-Performance (V-P) IQ discrepancies andsubtest profiles yielded by Wechsler’s scales werewaiting and ready to overtake the one-score Bi-net” (Kaufman, 1983b, p. 107).

    Irony runs throughout the history of testing.Galton developed statistics to study relationshipsbetween variables—statistics that proved to beforerunners of the coefficient of correlation,later perfected by his friend Karl Pearson (Du-Bois, 1970). The ultimate downfall of Galton’ssystem of testing can be traced directly to co-efficients of correlation, which were too low insome crucial (but, ironically, poorly designed)studies of the relationships among intellectualvariables (Sharp, 1898–99; Wissler, 1901). Simi-larly, Terman succeeded with the Stanford-Binetwhile the Goddard-Binet (Goddard, 1911), theHerring-Binet (Herring, 1922), and other Binet-Simon adaptations failed because he was sensi-tive to practitioners’ needs. He patiently with-held a final version of his Stanford revision untilhe was certain that each task was appropriatelyplaced at an age level consistent with the typical

    functioning of representative samples of U.S.children.

    Terman continued his careful test developmentand standardization techniques with the first re-vised version of the Stanford-Binet (Terman &Merrill, 1937). But 4 years after his death in 1956,his legacy was devalued when the next revision ofthe Stanford-Binet comprised a merger of FormsL and M, without a standardization of the newlyformed battery (Terman & Merrill, 1960). Thefollowing version saw a restandardization of theinstrument, but without a revision of the place-ment of tasks at each age level (Terman & Merrill,1973). Unfortunately for the Binet, the abilities ofchildren and adolescents had changed fairly dra-matically in the course of a generation, so the 5-year level of tasks (for example) was now passed bythe average 4 -year-old!

    Terman’s methods had been ignored by hissuccessors. The ironic outcome was that Wech-sler’s approach to assessment triumphed, at leastin part because the editions of the Stanford-Binet in the 1960s and 1970s were beset by thesame type of flaws as Terman’s competitors in the1910s. The newest Stanford-Binet (Thorndike,Hagen, & Sattler, 1986a, 1986b) attempted tocorrect these problems and even adopted Wech-sler’s multisubtest, multiscale format. However,these changes in the Fourth Edition of the Binetwere too little and too late to be much threat tothe popularity of the Wechsler scales, to offermuch contribution to the field of intelligencetesting, or to merit the linkage with the Binettradition.

    SURVEYS OF TESTUSAGE FOR ADULTS

    Surveys of test use in the United States have ap-peared increasingly in the literature in the pastdecade. These surveys are usually based on datafrom clinical agencies and hospitals (Lubin,Larsen, & Matarazzo, 1984; Petrowski & Keller,1989), school systems (Goh, Teslow, & Fuller,

    12

    ch01.fm Page 7 Tuesday, September 25, 2001 2:56 PM

  • 8 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    1981; Hutton, Dubes, & Muir, 1992; Wilson &Reschly, 1996), industry (Swenson & Lindgren,1952), military settings (Lubin, Larsen, Mat-arazzo, & Seever, 1986), forensic settings (Lees-Hayley, Smith, Williams, & Dunn, 1996), or pri-vate practitioners (Archer, Maruish, Imhof, &Piotrowski, 1991; Camara, Nathan, & Puente,2000; Harrison et al., 1988; Lubin et al., 1986;Watkins, Campbell, Nieberding, & Hallmark,1995). Data from such studies of test use are be-coming increasingly important in light of therole that managed-care companies play in reim-bursement for assessment services. Data fromsurveys that help determine which are the typicalinstruments used for various types of assessmentand the amount of time practitioners usuallyspend on an assessment may serve a function insetting standard approved rates for practitionercompensation by managed-care companies.Thus, we reviewed the recent literature to at-tempt to discover which instruments are mostcommonly used by practitioners with a variety ofbackgrounds and find out how much time is typ-ically spent on assessments.

    Has Test Use Changed over the Years?

    Overall, little substantive change has occurred inthe most popular instruments used in the lastseveral decades (Camara et al., 2000). Test usagewas first documented by Louttit and Brown(1947), with data collected spanning the mid-1930s to the mid-1940s. Since that early survey,subsequent surveys have shown that the mostcommonly used tests have not changed muchover the years. The Wechsler family of tests hasremained on the top of the assessment list formost psychologists, across a variety of settings(Ball, Archer, & Imhof, 1994; Brown & McGuire;1976; Camara et al., 2000; Harrison et al., 1988;Lubin et al., 1971). The WAIS and WAIS-Rhave consistently been mentioned in surveys asthe most often used adult intelligence tests byclinical psychologists, school psychologists, neu-

    ropsychologists, and forensic psychologists, andthe WAIS-III will surely follow suit in futuresurveys.

    Many studies of test usage lump together testsfrom all areas of assessment, including intellec-tual assessment, personality assessment, adaptivefunctioning assessment, achievement assessment,and neuropsychological assessment. Nonethe-less, even when considering all these differenttypes of assessment, the Wechsler tests remainranked in the top 10.

    Because the WAIS-III is fairly new, we wereunable to find any published surveys that re-ported on the latest adult Wechsler test. Themost recent survey at the time that this bookwent to press had a 2000 publication date, butthe authors collected their data in late 1994, be-fore the WAIS-R was revised (Camara et al.,2000). However, it is safe to assume that theWAIS-III will maintain the high ranking en-joyed by the WAIS-R.

    Test Usage of1,500 Psychologists

    and NeuropsychologistsCamara et al.’s (2000) collected survey data ontest usage and assessment from 933 clinical psy-chologists and 567 neuropsychologists who wererandomly selected from the American Psycho-logical Association (APA) and the National Asso-ciation of Neuropsychology (NAN). The authorswere interested in data from practitioners whoconducted assessments on a regular basis, so theyultimately conducted their analyses on data fromrespondents who engaged in 5 or more hours perweek of assessment-related services. Thus, thefinal sample used for ranking test usage com-prised 179 clinical psychologists (19% of theclinical psychologist respondents) and 447 neu-ropsychologists (79% of the neuropsychologistrespondents). Table 1.1 displays the hours spentadministering, scoring, and interpreting psycho-logical tests during a typical week, for the totalnumber of respondents to the survey (N = 1,500).

    ch01.fm Page 8 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 9

    Interestingly, the sample of neuropsychologistsspent many more hours per week doing assess-ments than did the sample of clinical psycholo-gists. Among neuropsychologists, almost 80%spent at least 5 assessment hours per week andabout half spent at least 15 hours a week con-ducting assessments. For clinical psychologists,the corresponding values were about 20% and8%.

    According to Camara et al. (2000), of the clini-cal psychologists who performed assessments 5 ormore hours per week, the majority of their assess-ment time was spent conducting intellectual orachievement testing (34%) and personality testing(32%). For neuropsychologists, their assessmenttime was fairly equally divided between neuro-psychological assessment (26%), intellectual orachievement assessment (20%), and personalityassessment (20%). Watkins et al. (1995) reportedthat 8% of a clinical psychologist’s total time prac-ticing was spent on intellectual assessment, and12% of the total time was spent on personality as-sessment (N = 412). In a study examining assess-ment practices of school psychologists (N = 389),respondents reported that they spent about one

    half of their time in assessment-related activities(Mdn = 50%) (Hutton et al., 1992).

    How FrequentlyAre Tests Used?

    As mentioned, the Wechsler tests have held onstrongly to their place at the top of the heap oftests administered by practitioners over the years.In Camara et al.’s (2000) study, clinical psychol-ogists ranked the WAIS-R the number one testadministered and neuropsychologists ranked itnumber two. Other Wechsler tests were also at thetop of the list: clinical psychologists rated theWISC-III number 3 and neuropsychologists ratedthe Wechsler Memory Scale—Revised number 3.Camara et al. (2000) did not separate children’stests from adults’ tests, or measures of intelligencefrom other measures, such as personality func-tioning. Clinical psychologists ranked the Minne-sota Multiphasic Personality Inventory—SecondEdition (MMPI-II) as the number 1 most fre-quently used test and neuropsychologists rankedit as number 2. Other studies report similar find-ings: in a survey tapping tests administered by

    TABLE 1.1 Hours spent administering, scoring, and interpreting psychological tests duringa typical week

    Clinical Psychologists Neuropsychologists

    Hours n (%) Cumulative % n (%) Cumulative %

    0–4 755 (80.9) 100.0 116 (20.5) 100.05–9 62 (6.6) 18.7 62 (10.9) 78.8

    10–14 39 (4.2) 12.1 92 (16.2) 67.915–20 36 (3.9) 7.9 105 (18.5) 51.7More than 20 37 (4.0) 4.0 188 (33.2) 33.2No response 4 (

  • 10 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    psychologists to adolescent clients, the Wechslerscales were the number one most frequently usedtests (Archer, Maruish, Imhof, & Piotrowski,1991); in a survey of tests administered by forensicneuropsychologists, the WAIS-R, MMPI-II, andWMS-R were ranked numbers 1, 2, and 3, re-spectively (Lees-Hayley et al., 1996); and schoolpsychologists also reported the Wechsler scales asthe most frequently used assessment tools (Hut-ton et al., 1992; Wilson & Reschly, 1996).

    Administration Time andImplications for Reimbursement

    Camara et al. (2000) also examined the meantime to administer, score, and interpret a batteryof tests. The median number of minutes report-edly spent by clinical psychologists on theWAIS-R was administration (75), scoring (20),and interpretation (30), for a total time of a littleover 2 hours; similar values were reported by neu-ropsychologists. Considering that the WAIS-R(or WAIS-III now) is only one component of afull battery, the total time to administer, score,and interpret an entire battery is significantlygreater. Clearly, the time varies depending onthe type of testing necessary to answer the refer-ral questions. That being said, Camara and col-leagues found that, on average, psychologistsspent about 3.5 to 4.25 hours on administering,scoring, and interpreting an assessment battery.However, the authors concede that some areas ofassessment take substantially longer than theseaverage times, especially intellectual and neu-ropsychological assessment.

    The results from Camara et al.’s (2000) studyhave implications for the reimbursement of as-sessments by third parties, especially managed-care companies. The authors note that assessmentservices are often limited to 2 hours of reimburs-able time, the approximate time the psychologistsin Camara et al.’s (2000) study spent administer-ing, scoring, and interpreting the WAIS-R.However, because the Camara data demon-strated that trained practitioners require at least 4

    hours to complete a comprehensive assessment, itis clear that clinicians are limited in what types ofassessments they can provide, if they want to bereimbursed for their time. The consequences oflimited reimbursement for assessment may bethat the number of psychologists conducting as-sessments will diminish. Already, Camara andcolleagues note that almost 90% of clinical psy-chologists spend less than 10 hours a week on as-sessments (see Table 1.1).

    For What Purposes AreAdults Given Intelligence Tests?

    It is clear that the WAIS-R and WAIS-III arewidely used in the field of assessment today, butwhy are these and other intelligence tests typi-cally administered to adults? Harrison et al.(1988) asked that question specifically of a groupof 277 clinical psychologists. In a survey, respon-dents were asked to rank seven purposes forwhich they would administer an intelligence test.The number 1 purpose was to measure the po-tential or cognitive capacity of a person. Table 1.2lists the seven purposes and how important re-spondents felt each was. Although nearly 40–50%of psychologists ranked educational and voca-tional placement or interventions as a purpose forassessing adults, very few felt these are the mainreasons for conducting an assessment (6–17%).Clearly, the data show that clinicians think thatthe most important reasons for assessing adultsare to measure cognitive potential, obtain clini-cally relevant information, and assess functionalintegrity of the brain.

    ConclusionsThe Camara et al. (2000) survey results indicatethat the WAIS-R, and, intuitively, the WAIS-III,is supreme among assessment tools used to as-sess adolescent and adult functioning by clinicalpsychologists and neuropsychologists. These re-sults, in combination with results of other stud-ies, show that the Wechsler tests are equally

    ch01.fm Page 10 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 11

    popular in other domains such as forensic psy-chology, school psychology, hospital settings,and outpatient clinics. The percentage of clinicaltime spent conducting assessments varies acrossspecialties within psychology (e.g., clinical,school, neuropsychology). However, the typicalamount of time necessary to conduct an assess-ment is similar across domains, although it fluc-tuates depending on the type of assessmentnecessary. The inconsistency between the amountof time typically allowed to be reimbursed for as-sessment services and the actual amount of timespent in assessment-related services was pointedout by Camara et al. (2000). Such inconsistencymay affect the types and numbers of assessmentsperformed by clinicians. Notwithstanding thefees and reimbursement issues, the popularity ofthe Wechsler scales and the primary reasons forassessing adults remain unchanged. There ap-pears to be a strong need for tools to assess cog-nitive capabilities and obtain related clinicalinformation in adults, and the WAIS-III is thereto meet those needs for those who choose toconduct assessments. However, clinicians wouldbe wise to consider theory-based alternatives to

    Wechsler’s scales, such as the KAIT and WJ IIITests of Cognitive Ability for adolescents andadults, and the Cognitive Assessment System(CAS; Naglieri & Das, 1997a, 1997b) for adoles-cents. Also, in view of time constraints imposedby managed-care criteria, reliable and valid briefintelligence tests may need to be weighed as pos-sible assessment options (see Chapter 15).

    VALIDITY OF THEIQ CONSTRUCT FOR

    ADOLESCENTS AND ADULTS

    Matarazzo (1972, Chapters 6, 7, and 12) devotedmost of three chapters to support the validity ofthe IQ construct, Jensen (1980) addressed the is-sue from both theoretical and empirical perspec-tives (his Chapters 6 and 8, respectively), andBrody (1985) published a thought-provokingchapter on “The Validity of Tests of Intelligence.”These three esteemed psychologists concluded, inessence, that the IQ construct, as measured by

    TABLE 1.2 Purposes for using intelligence tests when assessing adults

    Purpose

    % of Psychologists Who Assess Adults for

    This Purpose

    % of Psychologists Who Rank This Purpose as Very

    Important

    Measure potential of capacity 85.2 58.5Obtain clinically relevant information 85.2 53.1Assess functional integrity of brain 77.6 43.3Determine educational placement 48.4 17.0Determine vocational placement 45.5 12.3Develop educational interventions 44.0 10.8Develop vocational interventions 39.4 5.8

    NOTE: Data are from Harrison et al. (1988), based on 277 respondents asked to list all the purposes forwhich “you generally use a standardized intelligence test in your assessment battery” and “then rank theones you checked in order of their importance with a 1 as the most important.” The “% of psychologists whorank this purpose as very important” equals the percentage of the total group of 2,787 who assigned each pur-pose a ranking of 1 or 2.

    ch01.fm Page 11 Tuesday, September 25, 2001 2:56 PM

  • 12 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    contemporary intelligence tests, is valid whendefined within the societal context and when theIQ’s limitations are kept fully in mind. In a sur-vey of psychologists and educational specialistswith expertise in areas related to intelligencetesting, Snyderman and Rothman (1987) foundthat, overall, experts hold positive attitudesabout the validity and usefulness of intelligenceand aptitude tests. Although the validity of theIQ construct and the tests purported to assess itare important to this text, we treat it cursorilyhere because it has been thoroughly discussedelsewhere. Our focus is on the following aspectsof IQ’s validity: prediction of academic achieve-ment, relationship to educational attainment,relationship to occupational membership, andprediction of job performance.

    Prediction ofAcademic Achievement

    The age-old IQ criterion of prediction of schoolachievement has been explored in thousands ofstudies across the age range, and Matarazzo(1972) concluded a generation ago that a correla-tion of about .50 exists between IQ and schoolperformance. Coefficients are typically a bithigher in elementary school and lower in college(Brody, 1985). The overall value of .50 is highenough to support the validity of the IQ for thepurpose that Binet originally intended it, but lowenough to indicate that about 75% of the variancein school achievement is accounted for by factorsother than IQ. Some more recent studies withnewer, theory-based intelligence tests have re-ported higher coefficients in the .60–.70 range forthe Horn-based WJ-R (McGrew, Werder, &Woodcock, 1991) and for the Luria-based K-ABCand CAS (Naglieri, 1999, Table 5.5) between in-telligence and achievement. In fact, these coeffi-cients for the theory-based tests are similar inmagnitude to the values obtained with the ThirdEditions of the WISC and WAIS, using WIATscores as the criteria (Psychological Corporation,

    1992, Table D.6). Hence, more recent studieswith new and revised instruments suggest that IQmay explain as much as 50% of school achieve-ment; however, even that substantially highervalue still leaves 50% for other variables.

    For adults, the IQ-achievement correlations areillustrated by correlations between the WAIS-IIIand the Wechsler Individual Achievement Test(WIAT; Psychological Corporation, 1992). Over-all the correlations between the WAIS-III IQsand the WIAT Composites (Reading, Math, andLanguage) range between .53 and .81, with mostcorrelations in the .60s and .70s, and a medianvalue of .70 (Psychological Corporation, 1997).The correlations between the WAIS-III Indexesand the WIAT Composites were slightly lowerthan those with the IQs, with rs ranging from .42to .77 with a median value of .61.

    Wechsler’s Verbal IQ consistently correlatesmore strongly with achievement than does the Per-formance IQ. Correlations between the WAIS-IIIand WIAT exemplify that fact: V-IQ correla-tions range from .70 to .81 with the WIAT Com-posites, whereas P-IQ correlations range from.53 to .69 with the WIAT. Data from the WAIS-IIIindexes also mirror the IQ data. In WAIS-Rstudies (e.g., Ryan & Rosenberg, 1983; Spruill &Beck, 1986), mean correlation coefficients were.65 for Verbal and .54 for Performance. In fiveWAIS studies cited by Matarazzo (1972, p. 284),V-IQ correlated higher than P-IQ with highschool rank (.63 versus .43) and college grade-point average (GPA) (.47 versus .24). NumerousWISC-III investigations summarized by Gridleyand Roid (1998) have also shown stronger correla-tions between achievement ability and Verbal IQthan between Performance IQ and achievement.

    In general, the use of the WAIS-III for predict-ing college achievement is likely to produce coef-ficients lower than the values in the .60s observedwhen standardized achievement tests are the cri-teria. Matarazzo (1972, p. 284), for example, citeda coefficient of .44 between WAIS FS-IQ andGPA for 335 college students with a mean IQ of115, and Jensen (1980, p. 330) reported a median

    ch01.fm Page 12 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 13

    correlation of .40 between the General Intelli-gence test of the General Aptitude Test Battery(GATB) and college grades in 48 different sam-ples (comprising 5,561 students).

    Even if correlation coefficients involving theWAIS-R or WAIS-III account for only 15% to20% of the variability in college students’ grades(compared to 25–50% for elementary and highschool), such values nonetheless strongly sup-port the Wechsler scales’ validity for educationalpurposes. Correlations for college students areattenuated substantially, having nothing to dowith the quality of the instrument because of(1) the restricted range of IQs found in highlyselected samples, (2) the questionable reliabilityand validity of the GPA criterion (it, too, is usu-ally restricted to a 5-point scale from A to F, andcollege grading systems fluctuate notoriouslyfrom instructor to instructor), and (3) the in-creasing role played by nonintellective factorssuch as motivation and study habits.

    Relationship of IQ to EducationFor children’s intelligence tests, correlations be-tween IQ and school achievement are among thebest evidences of validity, but those coefficientsare less valuable for adult tests. The best argu-ments for the validity of an adult test are the re-lationships between IQ and formal educationand between IQ and occupational level (a vari-able that correlates substantially with years ofschooling; Kaufman, 1990). Success in school isa key task of children and adolescents; life ac-complishments are the goals of an adult.

    Logically, people who score higher on a so-called intelligence test should advance higherwithin the formal education hierarchy and shouldassume positions within the more prestigious oc-cupations. Which is cause and which is effect isnot relevant to this point. Perhaps individualsscore higher on IQ tests because of what theylearn in school; perhaps they proceed to higherlevels of education because they are smart to be-

    gin with; or perhaps these two variables combinein some way. In any case, a strong relationshipbetween education and IQ supports the con-struct that underlies tests that purport to mea-sure intelligence.

    This relationship is explored in depth for theWAIS-III in Chapter 4, and again in Chapter 8regarding V-P differences and brain damage.The present discussion gives only an overview ofthe relationship between years of schooling andWAIS-III scores in order to illustrate the over-whelming validity support for the WAIS-IIIwhen educational attainment is the criterion.

    Educational data that are available for theWAIS-III Full Scale IQ are age-corrected zscores, predicted by education; these data werekindly provided by Heaton, Manly, Taylor, andTulsky (personal communication, September,2000), with the permission of The PsychologicalCorporation, and are discussed more fully inChapter 4. Briefly, mean Full Scale IQs for 16-to 89-year-olds with different formal educationlevels ranged from 80.5 for individuals with 0–7years of schooling to 116.8 for those with 17 ormore years.

    The two extreme educational groups differ byabout 36 points, more than two standard devia-tions! These differences tend to be larger forVerbal than Performance subtests, but they arenonetheless substantial even for tasks like BlockDesign or Digit Symbol-Coding that are notspecifically taught in the classroom. The meanscaled-score differences for those with 17 ormore years of schooling versus those with 7 orless years of schooling (for ages 20 to 89) on twoselected WAIS-III subtests, one closely relatedto the specific content taught in school (Vocabu-lary) and one unrelated to curriculum (BlockDesign), are 6.60 and 4.47. Specifically, on Vo-cabulary the mean scaled score for those with17+ years of schooling was 13.33, whereas it wasonly 6.83 for those with seven or fewer years ofschooling. In contrast, on Block Design meanscaled score for those with 17+ years of schoolingwas 11.92, yet it was only 7.45 for those with

    ch01.fm Page 13 Tuesday, September 25, 2001 2:56 PM

  • 14 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    seven or fewer years of schooling. Thus, the veryhighly educated adults scored 2.2 SD higher thanrelatively uneducated adults on Vocabulary and1.5 SD higher on Block Design.

    These data show that relatively uneducatedpeople perform poorly on both school-relatedand school-unrelated tasks, and that both typesof tests are substantially related to formal educa-tion. As indicated, however, highly educatedadults have a greater advantage on crystallizedthan on fluid tasks (i.e., on Information or Vo-cabulary than on Block Design). Data from theFels Longitudinal Study (McCall, 1977) revealthat childhood IQs correlate about .50 (±.10)with both adult educational and occupational at-tainment, stabilizing at that relatively high levelat ages 7 to 8 for males and females.

    The strong relationship between IQ and for-mal education should not obscure the consider-able variability of IQs earned by individuals withthe same educational attainment. Fluctuations inWAIS-R IQ by education level were shown byReynolds et al. (1987), and are presented in Ta-ble 4.5. These results indicate that each level ofeducational attainment is accompanied by a widerange of Full Scale IQs. For example, individualswith some college education have a higher meanIQ by about 11 points than those with some highschool, but their IQ ranges are fairly similar: 76–139 for those with 13–15 years of schoolingcompared to 59–146 for those with 9–11 years ofschooling.

    IQ and OccupationFor ages 20 to 54, WAIS-R data provide addi-tional validation evidence for Wechsler’s IQs byexamining mean scores earned by adults activelyengaged in different levels of occupation (Rey-nolds et al., 1987). Adolescents have been elimi-nated from consideration because occupationaldata are based on their parents’ occupation, andthe 55–74-year-olds have been eliminated be-cause two thirds are categorized as “Not in La-bor Force.”

    Occupational data are treated in depth inChapter 4, and are summarized here to illustrate

    the validity of the IQ construct. Mean Full ScaleIQs are shown in Table 1.3 for five categories ofoccupation, listed in order of the average educa-tional level (from high to low) that typifies eachcategory. These values range from about 87 forunskilled workers to 112 for professionals andtechnical workers.

    The 25-point difference between professionalsand unskilled workers, combined with the educa-tional data, gives strong support to the constructunderlying Wechsler’s Full Scale IQs for adultsamples; occupational and educational data pre-sented in Chapter 4 give substantial validity sup-port for the separate Verbal and PerformanceIQs as well.

    In general, the relationship between occupa-tion, education, and WAIS-R IQs for persons 75years and older was similar to that found by Rey-nolds et al. (1987) for persons 16 to 74 (Ryan,Paolo, & Dunn, 1995). When past occupationwas measured in an elderly sample (ages 75+), in-dividuals who were retired professionals or man-agers earned WAIS-R Full Scale IQs that were15.78 points higher than those who were retired

    TABLE 1.3 Mean WAIS-R Full Scale IQs for 20- to 54-year-olds employed in different levels of occupation

    Occupational Group

    MeanWAIS-R

    Full Scale IQ

    1. Professional and technical 112.42. Managers and administrators,

    clerical workers, and sales workers

    103.6

    3. Skilled workers (craftsmen and foremen)

    100.7

    4. Semiskilled workers (operatives, service workers—including private household—farmers, and farm managers)

    92.3

    5. Unskilled workers (laborers, farm laborers, farm foremen)

    87.1

    NOTE: Data are from Reynolds et al. (1987).

    ch01.fm Page 14 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 15

    laborers or operatives. Education was also an im-portant variable in this elderly sample, as it ac-counted for 30% to 43% of the variance in theWAIS-R IQs. Similar to results with youngeradults, there were substantial differences (17.05points) between those with the most education(12 or more years) and those with less formalschooling (0 to 11 years). As the relationship be-tween education and occupation is known to bequite strong, Ryan et al. (1995) performed analy-ses to determine whether preretirement occupa-tion would explain an additional amount ofvariance in IQ over and above age and education.Occupation did, in fact, contribute significantlyto all WAIS-R IQs, explaining an additional 3%to 6% of the variance in the Verbal, Perfor-mance, and Full Scale IQs, beyond that of ageand education.

    When IQs are provided for specific jobs in-stead of general categories, even wider discrep-ancies emerge between diverse occupations. Forexample, Matarazzo (1972, pp. 178–180) citesnumerous studies and his own considerable clin-ical experience to show that physicians, medicalstudents, dentists, university professors, psychia-trists, executives in industry, scientists, and attor-neys have consistently averaged IQs of 125 onthe Wechsler-Bellevue and WAIS. In a study of 35medical students, Mitchell, Grandy, and Lupo(1986) reported mean Full Scale IQs in the samerange on both the WAIS (124.5) and WAIS-R(120.8).

    The wide range of mean scores by people indifferent occupations is further illustrated by acomprehensive (N = 39,600) 1970 U.S. Depart-ment of Labor study cited by Jensen (1980,pp. 341–342). Mean IQs on the GATB GeneralIntelligence scale were provided for 444 specificoccupations, and ranged from 55 for TomatoPeeler to 143 for Mathematician. Although theGATB General Intelligence score correlated .89with the WAIS (Jensen, 1980), the two scaleshave different standard deviations. When theGATB scores for Tomato Peelers and Mathema-ticians are converted to the Wechsler metric, themeans become 66 and 132, respectively. Thisdiscrepancy is not as impressive as the 88-point

    difference on the GATB scale (mean of 100, SDof 20), but it nonetheless provides additional ev-idence of the IQ construct’s validity.

    Figure 1.1, adapted from Matarazzo (1972,p. 178) and Jensen (1980, p. 113) and modifiedbased on WAIS-R data reported by Reynolds etal. (1987), presents graphically the educationalor occupational referents of different IQ levels.However, these values are just the averages fordifferent jobs or educational accomplishments.As Matarazzo (1972) and Jensen (1980) stress,adults in each occupation or educational cate-gory vary considerably in IQ range. Table 4.5presents pertinent data that reveal the fairly widerange of IQs for individuals from the same occu-pational category (as mentioned previously, thissame table shows the wide IQ ranges for peoplewith different levels of education). For occupa-tional groups, the range is relatively small forpeople employed in routine, menial jobs usuallyreserved for the mentally retarded, but substan-tial IQ ranges characterize members in jobs asdiverse as physicians or policemen or even un-skilled construction workers.

    The strong relationship depicted here be-tween IQ and occupation may be an artifact ofthe even stronger relationship described previ-ously between IQ and educational attainment.Occupation and education correlate substantially,particularly because advanced formal education isfrequently a prerequisite for many high-prestigeoccupations. Gottfredson and Brown (1981) ob-served an interesting age-related finding in theoccupation–education relationship in their large-scale longitudinal study. Occupational status cor-related a modest .17–.20 with years of schoolingat ages 18–20 years, but increased at age 22 (.45)and age 24 (.60) before plateauing in the mid-.60sfor 26- and 28-year-olds. Gottfredson and Browninterpreted these age-related findings as a func-tion of the facts that (1) the later entrants into thework force are brighter and better educated, and(2) among those already employed, the smarterand more educated adults advance from low-levelto high-level positions.

    Crawford and Allan (1997), studying a groupof 200 adults ages 16 to 83 (M = 44.3 years) from

    ch01.fm Page 15 Tuesday, September 25, 2001 2:56 PM

  • 16 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    the United Kingdom, found that occupation wasa slightly stronger predictor of WAIS-R IQ thanwas education. The correlations for this samplebetween occupation and FS-IQ (.65), educationand FS-IQ (.58), and education and occupation(.65) are within the ranges of what has been pre-viously reported. However, Crawford and Allanfound that occupation was the single best predic-tor of IQ for all three scales in a stepwise regres-sion. Occupational classification accounted for42%, 43%, and 25% of the variance in FS-IQ,V-IQ, and P-IQ, respectively. Education and agesignificantly increased the variance predicted,with final models predicting 32% to 53% of thevariance in the IQs. Thus, it appears that occu-pation in and of itself is an important demo-graphic variable contributing to IQ.

    Regardless, years of schooling “is the singlemost important determinant of occupational sta-tus in United States society” (Brody, 1985,p. 361). Brody states further that the results ofpath analysis in several studies indicate that IQhas “a large influence on educational attainment

    and relatively little indirect influence on occupationalstatus” (pp. 361–362, italics ours)—that is, sepa-rate from the IQ–education relationship.

    Prediction of Job PerformanceAverage correlations between general intelligenceand job proficiency are traditionally in the .20s(Ghiselli, 1966, 1973). However, because thepredictors and criteria are typically restricted invariability due to selection factors and otherpractical limitations of test validation in indus-trial settings, some have argued that such coeffi-cients require statistical correction to reflectmore accurately the “true” relationship betweenIQ and job success (Hunter & Hunter, 1984).For the purpose here (i.e., to determine the va-lidity of the theoretical construct underlying in-telligence tests), the corrected values seem moreappropriate.

    In an ambitious meta-analysis of hundreds ofstudies relating intelligence to job performance,Hunter (1986) concluded that “general cognitive

    FIGURE 1.1Mean Wechsler adult IQs that correspond to different educational andoccupational accomplishments (based on data on Table 7–3 of Matarazzo, 1972,p. 178; data in Table 4.5 in Jensen, 1980, p. 113; WAIS-R standardization datareported by Reynolds et al., 1987).

    Ph.D or M.D.

    College Graduate

    Clerical/Sales

    High School Graduate

    Semiskilled Work

    50% Reach High School

    Harvest Vegetables

    Perform Simple Carpentry

    Mow Lawns

    0 20 40 60Mean IQ

    80 100 120

    ch01.fm Page 16 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 17

    ability has high validity predicting performanceratings and training success in all jobs” (p. 359).He organized data from three major sources,correcting coefficients for restriction of range inall cases, and for attenuation (imperfect test reli-ability) in the first two sets of studies: (1) Ghis-elli’s lifework, involving several summaries of aquarter-century’s worth of validity studies in in-dustry on the prediction of job proficiency andsuccess in training programs; (2) 515 validationstudies conducted by the U.S. Employment Ser-vice with the GATB, 425 on job proficiency (N =32,124) and 90 on training success (N = 6,496);and (3) U.S. military studies of training successin mechanical, clerical, electronic, and generaltechnical fields (828 studies totalling 472,539subjects).

    Coefficients of correlation between intelli-gence and job proficiency (performance ratings)were consistently higher for complex jobs thanfor those demanding less complexity. The Ghis-elli studies produced substantial corrected corre-lations for the complex jobs of manager (.53),clerk (.54), and salesperson (.61). Coefficients inthe mid-.40s were obtained for jobs of mediumcomplexity (e.g., crafts and trades), while valuesin the high .20s and .30s were typical of low com-plexity jobs like vehicle operator. Similar averagesemerged when Hunter (1986) grouped the U.S.Employment Service studies by complexity: highcomplexity (r = .58), medium (.51), and low (.40).Gottfredson (1997) suggested that general intel-ligence (g) has pervasive utility in work settingsbecause it is related to one’s ability to deal withcognitive complexity. She noted that the morecomplex a work task, the greater the advantagesthat higher g confers in performing it well.

    Intelligence correlated even more impres-sively with success in training than it did with jobperformance. Further, the coefficients obtainedfor various training programs were about equallygood, regardless of job complexity. The averagecorrected coefficient for the 828 studies of train-ing success conducted by the U.S. military was.62, with values hovering around that overallvalue for each of the four job families (i.e., me-

    chanical, clerical, electronic, and general techni-cal). Coefficients from the Ghiselli summariesranged from .37 (vehicle operator) to .87 (pro-tective professions) with a median correlation of.65 across seven categories of jobs. The 90 train-ing studies carried out by the U.S. EmploymentService yielded average values of .50 to .65 (me-dian = .56) for jobs grouped into four categories.

    Hunter showed further that validity coeffi-cients are even higher when objective work sam-ples of job performance are used instead ofsubjective supervisor ratings. Based on a handfulof particularly well-designed investigations thatused objective criteria to evaluate job profi-ciency, corrected correlations were .75 in civiliandata and .53 in military data.

    In a more recent synthesis of the vocationaldata, Schmidt and Hunter (1998) reviewed 85years of research in personnel selection, focusingon the results of the best meta-analyses, includ-ing much of the data reviewed in the precedingparagraphs. They concluded once again that IQ(referred to as general mental ability or GMA)had strong validity, and that the validity could beincreased substantially when other predictors areconsidered as well: .63 (GMA + work sample orGMA + structured interview) or .65 (GMA + in-tegrity test). Based on their review, Schmidt andHunter concluded: (1) “of all procedures thatcan be used for all jobs, whether entry level oradvanced, [GMA] has the highest validity andlowest application cost” (p. 264); (2) “the re-search evidence for the validity of GMA mea-sures for predicting job performance is strongerthan that for any other measure” (p. 264); and(3) “GMA has been shown to be the best avail-able predictor of job-related learning” (p. 264).

    Jensen’s (1980) analysis of some of the same datasummarized by Hunter (1986) presents a more so-bering view of the ability of intelligence tests topredict job performance and training success. Co-efficients reported by Hunter were corrected forrestriction of range and, usually, for attenuation aswell; these corrections inflate the correlations byestimating their magnitude in “what-if” situations.The correction for attenuation (test unreliability)

    ch01.fm Page 17 Tuesday, September 25, 2001 2:56 PM

  • 18 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    is particularly questionable, however, because, bydefinition, tests are not perfectly reliable. Jensen(1980, pp. 347–350) notes that Ghiselli’s actual co-efficients were in the .20 to .25 range, on the aver-age, and that the median coefficient for the GATBGeneral Intelligence score for 537 U.S. Employ-ment Service studies was .27.

    Similarly, Jensen demonstrates that correla-tions are greater for more complexions but thatthe values for jobs with high complexity are inthe .35 to .47 range. Jensen also notes that theaverage correlation between IQ and success intraining programs is close to .50, not the valuesof about .60 reported by Hunter. These criti-cisms apply as well to the more recent review bySchmidt and Hunter (1998).

    Data from both Hunter (1986) and Jensen(1980) support the IQ construct as reasonablyvalid in its role as predictor of job success,although the claims made by Hunter may be ex-aggerated by his incautious and, perhaps, over-zealous correction of obtained coefficients. Froma theoretical perspective, the data set evaluated byHunter and Schmidt and Hunter (1998) give ex-cellent support of the construct validity of IQ invocational settings. In a practical sense, however,the obtained correlations are often the most perti-nent. In all instances, readers are wise to heed thecautions of two expert statisticians and psychome-tricians, Lloyd Humphreys and Robert Linn,regarding Hunter’s correction procedures. Hum-phreys (1986), in his commentary on Hunter’s ar-ticle (and other papers as well) in a special issue ofthe Journal of Vocational Behavior, wrote, “Giventhe heterogeneity among the many studies to beaggregated, corrections for measurement errorand restriction of range of talent are rough esti-mates at best” (p. 427). In a similar commentary,Linn (1986) asserted that “adjustments for rangerestriction and attenuation are nontrivial[;]...correlations that are changed dramatically by ad-justments should always be viewed with caution”(pp. 440–441).

    Although IQ seems to be a valid predictor ofjob performance, the general findings from thisline of research indicate that a relatively smallamount of the variance in job performance is ac-

    counted for by IQ. At worst, the average validitycoefficient between measures of cognitive abilityand measures of cognitive ability is .20 (Ghiselli,1966; Wigdor & Garner, 1982), accounting foronly 4% of the variance, and, at best, the averagevalidity coefficient is about .5 (Hunter & Hunter,1984; Schmidt & Hunter, 1998), accounting for25% of the variance in job performance. AsSternberg, Wagner, Williams, and Horvath(1995) point out, these values leave at least three-quarters of the variance unexplained. Sternberget al. suggest that practical intelligence (commonsense) is a variable that may contribute to the pre-diction of job performance, above and beyondwhat traditional IQ contributes. Practical intelli-gence, or “tacit knowledge,” has only a small re-lationship to general intelligence (Sternberg etal., 1995). When tasks of tacit knowledge areused to predict managerial performance, tacitknowledge accounts for substantial and signifi-cant increases in variance above and beyond IQ(Wagner & Sternberg, 1990). Using measures oftraditional intelligence in conjunction with mea-sures of tacit knowledge may more effectivelypredict job performance than reliance on one ofthese measures alone (Sternberg et al., 1995), al-though reliable and construct-valid measures oftacit knowledge are not yet available.

    THE INTELLIGENTTESTING PHILOSOPHY

    One’s philosophy regarding the interpretation ofindividually administered clinical tests should bean intelligent one. The approach we will be de-scribing has been spelled out in detail for variousWechsler tests (Kaufman, 1979a, 1994a; Kauf-man & Lichtenberger, 1999, 2000), applied tothe K-ABC (Kamphaus & Reynolds, 1987), andapplied to a variety of other clinical and neuro-psychological instruments (Reynolds & Fletcher-Janzen, 1989). Consequently, our goal here isonly to summarize the assumptions underlyingthe approach and the basic methodology thatcharacterizes it. The essential method is the

    ch01.fm Page 18 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 19

    same, whether applied to tests for children, ado-lescents, or adults. Intelligent testing rests onfive assumptions, discussed in the sections below:

    1. IQ tasks measure what the individual haslearned.

    2. IQ tasks are samples of behavior and are notexhaustive.

    3. IQ tests like the WAIS-III, KAIT, and WJ IIIassess mental functioning under fixed experi-mental conditions.

    4. IQ tests are optimally useful when they are in-terpreted from an information-processingmodel.

    5. Hypotheses generated from IQ test profilesshould be supported with data from multiplesources.

    IQ Tasks Measure Whatthe Individual Has Learned

    This concept comes directly from Wesman’s(1968) introduction of the intelligent testing ap-proach. The content of all tasks, whether verbalor nonverbal, is learned within a culture. Thelearning may take place formally in the school, ca-sually in the home, or incidentally through every-day life. As a measure of past learning, the IQ testis best thought of as a kind of achievement test,not as a simple measure of aptitude. Like the SAT,IQ tests assess “developed abilities, broadly applica-ble intellectual skills and knowledge that developslowly over time through the individual’s experi-ences both in and out of school...[that are] nottied to the content of any specific course or fieldof study” (Anastasi, 1988, p. 330).

    The interaction between learning potentialand availability of learning experiences is toocomplex to ponder for any given person, makingthe whole genetics–environment issue of theo-retical value, but impractical and irrelevant forthe interpretation of that person’s test profile.Even the sophisticated scientific challenges tothe IQ construct issued by Lezak (1988a) andSiegel (1999) or the emotional, less informed in-

    dictments of IQ tests handed out by members ofthe public, become almost a side issue when thetests are viewed and interpreted simply as mea-sures of accomplishment. The term achievementimplies a societal responsibility to upgrade thelevel of those who have not attained it; the termaptitude implies something inborn and personaland can justify a withdrawal of educational re-sources (Flaugher, 1978).

    Issues of heredity versus environment and thevalidity of the IQ construct are meaningful forunderstanding the multifaceted intelligence con-struct; the accumulating research helps test devel-opers, practitioners, and theoreticians appreciatethe foundation of the tests used to measure intel-ligence; and the IQ tests, as vehicles for the re-search, are essential sources of group data for usein scientific study of these topics. But all of thecontroversy loses meaning for each specific per-son referred for evaluation when the clinician ad-ministers an IQ test to study and interpret justwhat the person has or has not learned and to helpanswer the practical referral questions.

    IQ Tasks Are Samples of Behavior

    and Are Not ExhaustiveThe individual Wechsler subtests, or the subteststhat compose the KAIT or WJ III, do not reflectthe essential ingredients of intelligence whosemastery implies some type of ultimate lifeachievement. They, like tasks developed by Binetand other test constructors, are more or less ar-bitrary samples of behavior. Teaching peoplehow to solve similarities, assemble blocks tomatch abstract designs, or repeat digits backwardwill not make them smarter in any broad or gen-eralizable way. What we are able to infer fromthe person’s success on the tasks and style of re-sponding to them is important; the specific,unique aspect of intellect that each subtest mea-sures is of minimal consequence.

    Limitations in the selection of tasks necessar-ily mean that one should be cautious in generaliz-ing the results to circumstances that are removed

    ch01.fm Page 19 Tuesday, September 25, 2001 2:56 PM

  • 20 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    from the one-on-one assessment of a finite num-ber of skills and processing strategies. Intelligencetests should, therefore, be routinely supplementedby other formal and informal measures of cogni-tive, clinical, and neuropsychological functioningto facilitate the assessment of mental functioningas part of psychodiagnosis. The global IQ on anytest, no matter how comprehensive, does notequal a person’s total capacity for intellectualaccomplishment.

    IQ Tests Like the WAIS-III, KAIT, and WJ III Assess

    Mental Functioning under Fixed Experimental Conditions

    Standardized administration and scoring meansconducting an experiment with N = 1 every timean examiner tests someone on an intelligence test.For the results of this experiment to be meaning-ful, the experimenter–examiner must adhereprecisely to the wording in the manual, give ap-propriate probes as defined in the instructions,time each relevant response diligently, and scoreeach item exactly the way comparable responseswere scored during the normative procedure. Fol-lowing these rules prevents examiners from ap-plying a flexible clinical investigatory procedureduring the administration (like Piaget’s semistruc-tured méthode clinique), from teaching the task orgiving feedback to a person who urgently desiresthis intervention, or from cleverly dislodgingfrom the crevices of a person’s brain his or hermaximum response to each test item.

    It is necessary to be an exceptional clinician toestablish and maintain rapport and to weave thestandardized administration into a natural, pleas-ant interchange between examiner and subject.Clinical skills are also essential when observingand interpreting a person’s myriad behaviorsduring the examination and during interpreta-tion of all available information and data wheninterpreting the profile of test scores. But it is vi-tal for an examiner to follow the standardizedprocedures to the letter while administering the

    test; otherwise, the standard scores yielded forthe person will be invalid and meaningless. Toviolate the rules is to negate the value of the me-ticulous set of norms obtained under experimen-tal conditions by most major test-publishingcompanies for their tests.

    The testing situation has a certain built-in ar-tificiality by virtue of the stopwatch, the precisewords to be spoken, and the recording of almosteverything spoken by the examinee. A personwith excellent visual–spatial and manipulativeskills might perform slowly and ineffectively onObject Assembly because of anxiety caused bythe time pressure; or a person with an impressivestore of general knowledge and a good common-sense understanding of social situations may failseveral Information and Comprehension itemsbecause of failure to understand some of thequestions. It is tempting to give credit to a puzzlesolved “just 2 or 3 seconds overtime” or to sim-plify the wording of a question that the person“certainly knows the answer to.” But the goodexaminer will resist these temptations, knowingthat the people in the reference group did not re-ceive such help. Testing the limits on a subtestcan often give valuable insight into the reasonsfor failure or confusion, so long as this flexible,supplemental testing occurs after the score hasbeen recorded under appropriate conditions.

    In an experiment, the empirical results are oflimited value until they are interpreted and dis-cussed in the context of pertinent research andtheory by a knowledgeable researcher. By thesame token, the empirical outcomes of an IQ testare often meaningless until put into context bythe examiner. That is the time for a clinician’sacumen and flexibility to be displayed.

    IQ Tests Are Optimally Useful When

    They Are Interpreted from an Information-Processing Model

    One of the examiner’s jobs in an assessment is toidentify specific areas of dysfunction. One model

    ch01.fm Page 20 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 21

    that has been particularly useful to clinicians in de-lineating areas of dysfunction is the information-processing model (Silver, 1993). The information-processing model is applicable to the learningprocess in general and any given cognitive task.The four components of the model are shown inFigure 1.2.

    The information-processing model can beused as a conceptual framework for interpretingIQs, Factor Indexes, and scaled scores that ex-tends beyond the specific areas obtained (Kauf-man, 1994a). With the help of this model, scores

    can be reorganized and translated into funda-mental areas of strength and weakness within thecognitive profile.

    Generally, the input of WAIS-III Verbal sub-tests tends to be auditory, while that of the Perfor-mance subtests is visual. Although it is perhapssimplistic to reduce the input of WAIS-III sub-tests into a verbal–visual dichotomy, in a rudi-mentary way, all subtests can be categorized ashaving one or the other types of input. For theKAIT and WJ III, there is no simple relationshipbetween scales and modalities. For example, theKAIT Logical Steps subtest is on the Fluid Scale(akin to Performance Scale), but it requires goodverbal comprehension for success.

    Hypotheses Generated from IQ Test Profiles

    Should Be Supported withData from Multiple Sources

    Test score profiles are optimally meaningful wheninterpreted in the context of known backgroundinformation, observed behaviors, and approachto each problem-solving task. Virtually any ex-aminer can deduce that WAIS-III Verbal IQ,KAIT Crystallized IQ, or WJ III Comprehension-Knowledge standard score is not a very good mea-sure of the crystallized intelligence of a personraised in a foreign culture, a person who under-stands Spanish or Vietnamese far better than En-glish, or a person with a hearing impairment, andthat Wechsler’s Performance IQ or KAIT Mem-ory for Block Designs does not measure nonverbalintelligence very well for a person with cripplingarthritis or a visual handicap. The goal of the in-telligent tester is to deduce when one or more sub-tests may be an invalid measure of a person’sintellectual functioning for more subtle reasons:distractibility, poor arithmetic achievement inschool, subcultural differences in language or cus-tom, emotional content of the items, suspected orknown lesions in specific regions of the brain, fa-tigue, boredom, extreme shyness, bizarre thoughtprocesses, inconsistent effort, and the like.

    FIGURE 1.2Information-Processing Model

    InputHow information

    from sensesenters the brain

    IntegrationInterpreting andprocessing the

    information

    StorageStoring the

    information forlater retrieval

    OutputExpressing

    information vialanguage or

    muscle activity

    ch01.fm Page 21 Tuesday, September 25, 2001 2:56 PM

  • 22 PART I INTRODUCTION TO THE ASSESSMENT OF ADOLESCENT AND ADULT INTELLIGENCE

    Being a great detective, able to follow up leadsand hunches about peaks and valleys in a profile,is the hallmark of an intelligent tester. Such atester will integrate IQ test profiles with back-ground information, clinical observations of be-haviors, and other tests administered in order tomore fully understand the examinee’s profile.

    Tying Together theTenets of Intelligent Testing

    The principles discussed in the preceding sec-tions direct our attention to one importantpoint: the focus of any assessment is the personbeing assessed, not the test. Many psychologicalreports stress what the scales or subtests measureinstead of what aspects of the person are particu-larly well developed or in need of improvement;many reports are so number-oriented that thereader loses sight of the person’s uniqueness.Current IQ tests for adolescents and adults en-able psychologists to better understand a per-son’s cognitive functioning, but other facets of anindividual are also revealed during an assessmentand should be fully integrated to represent thatperson as a whole. Although the section of an as-sessment report that systematically reports andinterprets the IQs, cluster scores, and subtestscores is valuable, the behavioral observationssection of a case report is often more revealing,and ultimately of more value, if it helps to ex-plain how or why examinees arrived at the scoresthat they did. The content of the responses andthe person’s style of responding to various typesof tasks can be more important as a determinerof developmental level and intellectual maturitythan the scores assigned to the items or tasks.

    When several tests are administered to a per-son (intelligence, language, achievement, per-sonality, visual–motor), the results must beintegrated from one test battery to the other. In-telligent testing does not apply only to the inter-pretation of intelligence tests. The examiner’smain role is to generate hypotheses that pertainmostly to assets and deficits within the informa-

    tion-processing model, and then confirm ordeny these hypotheses by exploring multiplesources of evidence. This integrative, flexible,clinical–empirical methodology and philosophy,as outlined in the preceding tenets, representsthe approach taken in this book for the interpre-tation of the WAIS-III, KAIT, WJ III, and othertests for adolescents and adults. The guidelinesfor interpreting IQ test profiles and the illustra-tive case reports throughout this book rest sol-idly on the intelligent testing framework.

    SUMMARY

    This chapter first delineates the goal of this bookto serve as a text on individual, clinical assessmentof intelligence and then outlines the five sectionsthat make up the book: (1) introduction to theassessment of adolescent and adult intelligence;(2) individual differences on age, socioeconomicstatus, and other key variables; (3) integration andapplication of WAIS-III research; (4) interpreta-tion of the WAIS-III profile; and (5) additionalmeasures of adolescent and adult IQ. The re-mainder of the chapter sketches a brief history ofthe IQ, gives survey data of test usage, presentsevidence for the validity of the IQ construct, andintroduces the intelligent testing philosophy.

    Alfred Binet was truly the pioneer of IQ test-ing. His concepts and approach dominated thefield for years, and Terman’s adaptation, the Stan-ford-Binet, became the criterion of intelligence inthe United States. The nonverbal Performancetests developed during World War I to assess non-English-speaking recruits, low-functioning indi-viduals, and suspected malingerers joined withthe verbal-oriented Binet tradition to pave theway for David Wechsler’s creative contribution ofa dual Verbal and Performance approach to intel-lectual assessment. Wechsler went on to become aproponent of clinical, not just psychometric, as-sessment. The need for multiscore measurementthat accompanied the learning disabilities move-ment in the 1960s catapulted the Wechsler series

    ch01.fm Page 22 Tuesday, September 25, 2001 2:56 PM

  • CHAPTER 1 IQ TESTS: THEIR HISTORY, USE, VALIDITY, AND INTELLIGENT INTERPRETATION 23

    of scales ahead of the Binet as the most popularintelligence test.

    The results of recent surveys on test usageshow that the Wechsler tests still are stronglypopular in clinical psychology, neuropsychology,forensic psychology, school psychology, hospitalsettings, and outpatient clinics. The percentageof clinical time spent conducting assessmentsvaries across specialties within psychology (e.g.,clinical, school, neuropsychology), with fluctua-tions depending on the type of assessment neces-sary. The inconsistency between the amount oftime typically allowed to be reimbursed for as-sessment services and the actual amount of timespent in assessment-related services may affectthe types and numbers of assessments performedby clinicians. Notwithstanding the fees and re-imbursement issues, the popularity of the Wech-sler scales and the primary reasons for assessingadults remain unchanged. A strong need fortools to assess cognitive capabilities and obtainrelated clinical information in adults will un-doubtedly keep the WAIS-III in its place at thetop of the heap of assessment measures.

    The validity of the IQ construct was exploredfor adolescents and adults. Empirical evidence

    supports the IQ as a good predictor of academicachievement for college students and clinical re-ferrals, and as a strong correlate of educationalattainment; IQ also relates substantially to thestatus of an occupation and correlates signifi-cantly with job performance, especially with suc-cess in training programs. In general, validityevidence is provided for both verbal and nonver-bal measures of intelligence.

    The intelligent testing philosophy, which con-siders the clinician’s expertise and training to bemore important an aspect of the assessment pro-cess than the specific instruments administeredor the scores obtained, embodies the followingprinciples: (1) IQ tasks measure what the indi-vidual has learned; (2) IQ tasks are samples ofbehavior and are not exhaustive; (3) IQ tests likethe WAIS-III, KAIT, and WJ III assess mentalfunctioning under fixed experimental condi-tions; (4) IQ tests are optimally useful when theyare interpreted from an information-processingmodel; and (5) hypotheses generated from IQtest profiles should be supported with data frommultiple sources.

    ch01.fm Page 23 Tuesday, September 25, 2001 2:56 PM