The Behavioral Measurement Letter

Behavioral Measurement Database Services

Enriching the health and behavioral sciences by broadening instrument access

Vol. 7, No. 2, Winter 2002

Introduction to This Issue

This issue of The Behavioral Measurement Letter has three featured articles: one describes a process through which measurement instruments may be improved in various ways and outlines a model for revising instruments; another discusses self-estimated intelligence, gender differences in overestimations and underestimations of intelligence, and possible explanations for and consequences of these differences; the third piece presents ten very practical guidelines -- a list of "do's and don'ts" -- for selecting measurement instruments that will best meet one's needs.

In "Refinements to the Lubben Social Network Scale: The LSNS-R," James Lubben, Melanie Gironda, and Alex Lee describe in detail the process of successive changes and analyses thereof through which they modified the Lubben Social Network Scale to improve it in various ways. The process involved performing principal component analysis to determine which items in the instrument contribute to underlying factors being measured by the instrument, and thus which items best measure what they want the instrument to measure. This analytical process resulted in a "cleaner and meaner" version of the LSNS, the LSNS-R. In addition, by detailing the process of successive modifications and mathematical analyses that produced the LSNS-R, the authors outline a model for instrument revision that is broadly applicable.

In a piece titled "Self-Estimated Intelligence," Adrian Furnham from London University reviews literature pertaining to estimation of intelligence by oneself and others, and then compares, contrasts, and offers explanations for the reported findings. The literature he reviewed shows that there is a gender difference in intelligence estimation -- females tend to underestimate their IQ and that of other females, while males, on the other hand, tend to overestimate their IQ and that of other males, and that this gender difference exists across cultures and nationalities, across socio-economic classes, and across age groups. He then explores possible explanations for the gender difference and suggests possible life consequences of over- and under-estimating one's abilities.

Our regular contributor, Fred Bryant, presents a set of guidelines for choosing measurement tools, "Ten Commandments for Selecting Self-Report Instruments." Some of the guidelines may seem to be common sense. Many readers will be familiar with their basic content. Those experienced in instrument selection may see mistakes they've made in the past (and vowed never to make again). In any case, all of our readers, from neophytes to seasoned instrument users, will find value in the piece -- if not something they hadn't known previously, then, certainly, practical reminders of the numerous subtleties, cautions, and pitfalls that should be attended to while selecting a measurement instrument.

Address comments and suggestions to The Editor, The Behavioral Measurement Letter, Behavioral Measurement Database Services, PO Box 110287, Pittsburgh, PA 15232-0787. If warranted and as space permits, your communication may appear as a letter to the editor. Whether published or not, your feedback will be attended to and appreciated.

HaPI reading . . .

We also accept short manuscripts for The BML. Submit, at any time, a brief article, opinion piece or book review on a BML-relevant topic to The Editor at the above address. Each submission will be given careful consideration for possible publication.

Al K. DeRoy, Editor

Staff

Director: Evelyn Perloff, PhD
Newsletter Editor: Al K. DeRoy, PhD
Database Coordinator: Barbara L. Brooks, MLS
Indexer/Reviewer: Jacqueline A. Reynolds, BA
Analyst: Janelle Price-Kelly, BA
Computer Consultant: Alfred A. Cecchetti, MS
Measurement Consultant: Fred B. Bryant, PhD
Information Specialist Consultant: Ellen Detlefson, PhD
Business Services Manager: Diane Cadwell
Copy Assesser: Tracey Handlovic, BA
File Processor: Betty Hubbard

Phone: (412) 687-6850 Fax: (412) 687-5213

E-mail: [email protected]

Since the measuring device has been constructed by the observer ... we have to remember that what we observe is not nature in itself but nature exposed to our method of questioning.

Werner Karl Heisenberg


Refinements to the Lubben Social Network Scale: The LSNS-R

James Lubben, Melanie Gironda, and Alex Lee

Increased interest in social support networks during the past decade spawned the development of many new social network scales. However, psychometric properties of these instruments are generally inadequately reported and there are few reported attempts to refine them. The work presented here addresses both of these concerns in examining the psychometric properties of the Lubben Social Network Scale (Lubben, 1988) and creating a revised version, the LSNS-R. In addition, the analytic plan of procedure used in this work is a model that could be employed to evaluate and refine other social and behavioral measures.

O'Reilly (1988) lamented that most social support network assessment instruments have inadequate clarity of definition and also lack reported reliability and validity statistics. Similarly, a common criticism of many studies that examine social support networks is that they employ instruments with unknown or unreported psychometric properties (Winemiller, Mitchell, Sutliff, & Cline, 1993). The work discussed below addresses these concerns by examining psychometric properties of the Lubben Social Network Scale (Lubben, 1988) and putting forth a refinement of the LSNS, the LSNS-R (i.e., the "revised" version of the LSNS). Given growing consensus on the importance of social support networks to health and well-being, developing and using consistent tools to conduct assessments of such networks are becoming ever more crucial to gerontological research and geriatric practice (House, Landis, & Umberson, 1988; Steiner, Raube, Stuck, Aronow, Draper, Rubenstein, & Beck, 1996; Glass, Mendes de Leon, Seeman, & Berkman, 1997).

Lubben Social Network Scale

The Lubben Social Network Scale (LSNS; Lubben, 1988) has been used in a wide array of studies, and in both research and practice settings, since it was first reported more than a decade ago (Lubben; Lubben, Weiler, & Chi, 1989; Siegel, 1990; Mor-Barak & Miller, 1991; Mor-Barak, Miller, & Syme, 1991; Potts, Hurwicz, Goldstein, & Berkanovic, 1992; Hurwicz & Berkanovic, 1993; Rubenstein, L. et al., 1994; Rubenstein, R., Lubben, & Mintzer, 1994; Dorfman, Walters, Burke, Hardin, & Karanik, 1995; Luggen & Rini, 1995; Lubben & Gironda, 1997; Okwumabua, Baker, Wong, & Pilgrim, 1997; Mor-Barak, 1997; Gironda, Lubben, & Atchison, 1998; Chou & Chi, 1999; Martire, Schulz, Mittelmark, & Newsom, 1999; Mistry, Rosansky, McGuire, McDermott, & Jarvik, 2001). Further, the LSNS has been employed in a variety of ways, including use as a control variable as well as an outcome variable in health and social science studies. It also has been used as a screening tool for health risk appraisals and as a "gold" standard by which to evaluate other social network assessment instruments.

The LSNS was developed as an adaptation of the Berkman-Syme Social Network Index (BSSNI; Berkman & Syme, 1979). Whereas the LSNS was developed specifically for use among elderly populations, the BSSNI was initially developed for a study of an adult population that purposefully excluded older persons. The LSNS is based on items borrowed from questionnaires used in the original epidemiological study for which the BSSNI was constructed. However, the LSNS excluded BSSNI items dealing with secondary social relationships (viz., group and church membership) because these organizational participation items showed limited variance when used with older populations, especially those having large numbers of frail elderly persons (Lubben, 1988). In contrast, the LSNS elaborated on an array of items dealing with the nature of relationships with family and friends, in view of the growing body of empirical data suggesting that the structure and functions of kinship and friendship networks are particularly salient to the health and well-being of older persons.

The LSNS total scale score is computed by summing ten equally weighted items that quantify structural and functional aspects of primary social relationships. Scores for each LSNS item range from zero to five, with lower scores indicating smaller networks. The scale has been found to have relatively good internal consistency among a widely diverse set of study populations (α = 0.70). Factor analyses on the LSNS suggest that it measures three different types of social networks: family networks, friendship networks, and interdependent relationships (Lubben, 1988; Lubben & Gironda, 1997).
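The scoring rule just described can be sketched in a few lines. This is only an illustration of the arithmetic stated in the text (ten items, each scored 0 to 5, summed with equal weight); the function name `lsns_total` and the example answers are hypothetical and are not part of the published instrument.

```python
def lsns_total(item_scores):
    """Sum ten equally weighted LSNS item scores (each 0-5) into a total score."""
    if len(item_scores) != 10:
        raise ValueError("the original LSNS has exactly ten items")
    for score in item_scores:
        if not 0 <= score <= 5:
            raise ValueError(f"item score {score} is outside the 0-5 range")
    return sum(item_scores)


# Hypothetical respondent: made-up answers, for illustration only.
example_scores = [4, 5, 3, 2, 3, 4, 5, 4, 1, 0]
print(lsns_total(example_scores))  # 31
```

Under this rule the total can range from 0 to 50, with lower totals indicating smaller networks.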

Methodology

The main purpose of the work presented here is to address deficits that became apparent in the original LSNS as it was used with diverse populations over the past decade. But because the LSNS was found to have relatively stable reliability and validity across this wide array of settings, any proposed modifications were not to jeopardize the relatively strong psychometric properties of the original LSNS.

Reliability is a fundamental issue in psychological measurement (Nunnally, 1978). One important type of measurement reliability is internal consistency, i.e., the extent to which items within a scale relate to the latent variable being measured (DeVellis, 1991; Streiner & Norman, 1995). Cronbach's (1951) coefficient alpha was chosen to examine the internal consistency of the LSNS and modifications designed to improve upon the original version. The acceptable range of coefficient alpha values employed here was 0.70 to 0.90 (Nunnally; DeVellis) because assessment instruments with reliability scores higher than 0.90 are likely to suffer from excessive redundancy, whereas those with alpha less than 0.70 are likely to be unreliable (Streiner & Norman). A further test of item homogeneity used was the item-total test score correlation (DeVellis; Streiner & Norman). Here acceptable values of the item-total score correlation were 0.20 and greater (Streiner & Norman).
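For readers who want to see the two statistics computed, the following is a minimal sketch using only the Python standard library. It uses the usual formula for coefficient alpha, k/(k-1) times (1 minus the sum of item variances over the variance of the total score). The text does not say whether the item-total correlation was "corrected" (item removed from the total), so the corrected form used here is an assumption, as are the function names and the made-up response data.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Coefficient alpha; rows are respondents, columns are item scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cross / (sx * sy)


def corrected_item_total(rows):
    """Correlate each item with the sum of the remaining items (assumed form)."""
    k = len(rows[0])
    correlations = []
    for i in range(k):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical responses (5 respondents x 4 items), for illustration only.
responses = [
    [4, 5, 3, 4],
    [2, 1, 2, 1],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(responses), 2))
print([round(r, 2) for r in corrected_item_total(responses)])
```

With real data, alpha would be compared against the 0.70 to 0.90 range and each item-total correlation against the 0.20 floor described above.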

Principal component analysis looks for underlying (latent) components that account for most of the variance of a scale (Stevens, 1992). Principal component analysis with varimax rotation was used here to explore the component structure of various versions of the LSNS to see if the modified versions conformed in actuality to the hypothesized structure. Although more sophisticated methods exist to examine factor or latent variable structures, such as maximum likelihood factor analysis and confirmatory factor analysis, many scholars contend that principal component analysis is both adequate and yet more practical than more sophisticated techniques, for principal component analysis is mathematically easier to manage, easier to interpret, and yields results similar to those from maximum likelihood factor analysis (Nunnally, 1978; Stevens). The size of the sample used in the analyses discussed below is adequate to conduct principal component analysis according to general sample size guidelines (Stevens; Guadagnoli & Velicer, 1988).
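To make the procedure concrete, here is a rough sketch of principal component loadings followed by varimax rotation, written with NumPy. It is not the authors' software or code: the function names, the synthetic two-factor data, and the particular SVD-based varimax iteration are assumptions chosen for illustration, and the number of components is passed in by hand rather than derived.

```python
import numpy as np


def pca_loadings(data, n_components):
    """Unrotated principal component loadings from the item correlation matrix."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
    eigenvalues = eigenvalues[order][:n_components]
    eigenvectors = eigenvectors[:, order][:, :n_components]
    return eigenvectors * np.sqrt(eigenvalues)


def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation via the common SVD-based iteration."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        target = rotated ** 3 - rotated @ column_ss / n_items
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break  # criterion stopped improving
        criterion = new_criterion
    return loadings @ rotation


# Synthetic data: three items driven by one latent factor, three by another.
rng = np.random.default_rng(0)
family = rng.normal(size=(500, 1))
friends = rng.normal(size=(500, 1))
noise = 0.5 * rng.normal(size=(500, 6))
data = np.hstack([family.repeat(3, axis=1), friends.repeat(3, axis=1)]) + noise

unrotated = pca_loadings(data, n_components=2)
rotated = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each item's communality (the sum of its squared loadings) is the same before and after rotation; only the distribution of loadings across components changes.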

Four Objectives Used in Refining the LSNS

The work to refine the LSNS had four principal objectives. One was to distinguish between and better specify the nature of family and friendship social networks. A second was to replace, where feasible, items in the original LSNS that have small statistical variance. A third objective was to disaggregate "double-barreled" questions. The fourth was to produce a parsimonious instrument to encourage and facilitate its use in research and practice settings where time constraints or other issues preclude using longer social support network instruments.

With regard to the first two objectives above, it should be noted that social and behavioral measures are purposely designed to discriminate among groups for a certain construct, and so lack of variation within a given item limits a scale's ability to identify and discriminate among variations (DeVellis, 1991; Streiner & Norman, 1995). Thus, eliminating items with limited statistical variance generally increases a scale's overall sensitivity and specificity, and that in turn improves its effectiveness in measuring constructs of interest (McDowell & Newell, 1987; DeVellis).

Double-barreled items are those in which two different questions are contained in one item. Such items often confuse respondents because they are not sure as to which aspect of the double-barreled question they should respond to (DeVellis, 1991; Streiner & Norman, 1995). Disaggregating double-barreled questions as per the third objective not only helps respondents in answering, it allows researchers to determine the extent to which each part of the original question helps to define a particular construct.

Plan and Procedures

Production of the LSNS-R progressed along a series of four analytical steps to address the objectives stated above. In each step, alpha reliability coefficients and the results of a series of principal component factor analyses were examined to determine whether and the extent to which items corresponded to the latent structural components of family networks and friendship networks.

The four steps are summarized in Figure 1. In the first step, reliability statistics were obtained for the original LSNS administered to the sample described below. These values then served as reference points for comparison with values obtained for subsequent modifications.

In the second analytical step, two items from the original LSNS scale - L9 ("Helps others with various tasks") and L10 ("Living arrangements") - were dropped because they demonstrated limited response variation among a number of sample groups including the present one. Furthermore, neither of these items helps to distinguish between family networks and friendship networks better than the other items in the original LSNS.

The L9 item was originally included in the LSNS in part because social exchange theory suggests that a reciprocal social relationship is stronger than one that is unidirectional (Jung, 1990; Burgess & Huston, 1979). Thus, rather than only capturing what others do for the older person being assessed, it is desirable to include items that also assess what the older person does for other people, i.e., items should be included to assess reciprocity of social support within kinship and friendship networks. However, in past studies L9 generally demonstrated insufficient item variance and thus was a good candidate for elimination or replacement.


The L10 item on living arrangements also had not worked out well over time. When the original LSNS was constructed, both living arrangements and marital status were common items included in measures of social support networks. It therefore seemed entirely appropriate to include an item merging these two related constructs. However, the L10 item has been the worst performing item on the LSNS across different settings. Part of the problem has been scoring it, which is constrained by the limited number of response options available as well as by disagreements among scorers in assigning ordinal weights to specific response options. Perhaps most important, marital status and living arrangements are generally neither malleable nor appropriate for intervention, so items concerning them should not be included in any case.

In the third analytical step, two "double-barreled" questions were disaggregated. The two items, L3 (relatives) and L4 (friends), ask, "How many (relatives) (friends) do you feel close to? That is, how many of them do you feel at ease with, can talk to about private matters, or call on for help?" In this step, L3 and L4 were each recast as two distinct questions. One asks, "How many (family members) (friends) do you feel at ease with such that you can talk to them about private matters?" whereas the other asks, "How many (family members) (friends) do you feel close to such that you can call on them for help?" The first of these substitute questions examines somewhat intangible or expressed support, whereas the other taps into more tangible support, such as help with running an errand. Both types of support have been suggested as important aspects of social support networks (Litwak, 1985; Sauer & Coward, 1985).

In the fourth step, items that identify both the targets and sources of respondents' confidant relationships were constructed and tested. Here the two confidant relationship items in the original LSNS (L7 and L8) were recast to distinguish between confidant relationships with family members and those with friends. These changes recognize that confidant relationships with family members may serve different functions than confidant relationships with friends (Keith, Hill, Goudy, & Power, 1984). The final result of the four-step process is a revised version of the LSNS, the LSNS-R.

Figure 1
Plan of Analysis

Step 1: Analyze original LSNS
Original LSNS items:
  L1   Family: Number seen or heard from per month
  L2   Family: Frequency of contact with family member most in contact
  L3   Family: Number feel close to, talk about private matters, call on for help
  L4   Friends: Number feel close to, talk about private matters, call on for help
  L5   Friends: Number seen or heard from per month
  L6   Friends: Frequency of contact with friend most in contact
  L7   Confidant: Has someone to talk to when have important decision to make
  L8   Confidant: Others talk to respondent when they have important decision to make
  L9   Helps others
  L10  Living arrangements

Step 2: Eliminate items with limited variation
Items eliminated: L9 and L10

Step 3: Uncouple double-barreled questions
Items modified: L3 and L4 each split into two separate questions
  L3A  Family: Number feel at ease with whom you can talk about private matters
  L3B  Family: Number feel close to whom you can call on for help
  L4A  Friends: Number feel at ease with whom you can talk about private matters
  L4B  Friends: Number feel close to whom you can call on for help

Step 4: Distinguish between source and target confidant relationships with family and friends
Items modified: L7 and L8 each split into separate questions for family and friends
  L7A  Family: Respondent functions as confidant to other family members
  L7B  Friends: Respondent functions as confidant to friends
  L8A  Family: Respondent has family confidant
  L8B  Friends: Respondent has friend who is a confidant
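The bookkeeping behind the four steps can be traced with a short sketch. It tracks only the item labels used in the plan of analysis; the helper names `drop_items` and `split_items` are hypothetical, and the A/B suffixes simply mirror the labels listed above.

```python
ORIGINAL_ITEMS = ["L1", "L2", "L3", "L4", "L5", "L6", "L7", "L8", "L9", "L10"]


def drop_items(items, to_drop):
    """Return the item list without the dropped labels."""
    return [item for item in items if item not in to_drop]


def split_items(items, to_split):
    """Replace each listed item with an 'A' and a 'B' variant, keeping order."""
    result = []
    for item in items:
        if item in to_split:
            result.extend([item + "A", item + "B"])
        else:
            result.append(item)
    return result


step1 = list(ORIGINAL_ITEMS)                 # original LSNS
step2 = drop_items(step1, {"L9", "L10"})     # limited-variation items removed
step3 = split_items(step2, {"L3", "L4"})     # double-barreled items uncoupled
step4 = split_items(step3, {"L7", "L8"})     # confidant items split: LSNS-R

for number, items in enumerate([step1, step2, step3, step4], start=1):
    print(f"Step {number}: {len(items)} items -> {items}")
```

The resulting scale lengths are 10, 8, 10, and 12 items for Steps 1 through 4.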

Data Source

The data are from a survey of older white, non-Hispanic Americans in Los Angeles County, California done between June and November 1993. A self-weighting, multistage probability sample was selected from 861 census tracts in the area in which the white, non-Hispanic population exceeded any other single racial or ethnic group. This sampling strategy ensured a high level of homogeneity in the sample.

The first three sampling stages were: (a) random selection of tracts, (b) random selection of blocks, and (c) random selection of households within selected blocks. Households were then contacted by telephone to determine the age and ethnicity of household members. All white, non-Hispanic persons aged 65 or over in each household were thus identified and then potential participants were randomly selected from this pool. Of the 265 older persons thus selected, 76 percent agreed to be interviewed, resulting in a final sample of 201. The sample included 130 women (65%) and 71 men (35%) and had a mean age of 75.3. Additional details on the sample are reported elsewhere (Villa, Wallace, Moon, & Lubben, 1997; Moon, Lubben, & Villa, 1998; Pourat, Lubben, Wallace, & Moon, 1999; Pourat, Lubben, Yu, & Wallace, 2000).
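The reported percentages follow directly from the counts given in the text, as this small check shows. The variable names are illustrative only.

```python
selected = 265      # older persons selected
interviewed = 201   # final sample
women, men = 130, 71

response_rate = 100 * interviewed / selected
share_women = 100 * women / interviewed
share_men = 100 * men / interviewed

print(f"Response rate: {response_rate:.1f}%")  # 75.8%, reported as 76 percent
print(f"Women: {share_women:.1f}%")            # 64.7%, reported as 65%
print(f"Men: {share_men:.1f}%")                # 35.3%, reported as 35%
```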

Results

Cronbach Alpha Values for the Products of Each Step

Table 1 presents internal consistency values for the products of the four analytical steps. The Cronbach alpha value for the original 10-item LSNS scale administered to the present sample (α = 0.66) is slightly lower than those previously reported (Lubben, 1988; Lubben & Gironda, 1997) and below the desired standard for internal consistency. The Cronbach alpha values increased in each subsequent step in the analysis, indicating that each successive modification contributed to improving the final product's internal consistency. Although the variant produced in Step 2 has two fewer items than the original LSNS, the alpha value is slightly higher than that for the original LSNS, suggesting that dropping items L9 and L10 was appropriate.

Further, the greatly improved alpha value obtained for the variant produced in Step 3 indicates that disaggregation of "double-barreled" questions (items L3 and L4) was quite beneficial. In Step 4, the product resulting from including items that distinguish between source and target confidant relationships with family members and those with friends had further increased internal consistency.

Table 1
Cronbach's Alpha Value by Product of Each Step of Analysis

Step                                                       α
1. Original LSNS; 10-item scale                           .66
2. Items L9 and L10 dropped; 8-item scale results         .67
3. L3 & L4 split; 10-item scale results                   .73
4. L7 & L8 split; 12-item scale (LSNS-R) results          .78
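As a quick illustration, the alpha values in Table 1 can be checked against the 0.70 to 0.90 acceptance range stated in the Methodology section. The dictionary below copies the four reported values; the function name and the wording of the labels are assumptions made for this sketch.

```python
ALPHA_BY_STEP = {1: 0.66, 2: 0.67, 3: 0.73, 4: 0.78}  # values from Table 1
LOWER, UPPER = 0.70, 0.90                             # range used in the study


def classify_alpha(alpha, lower=LOWER, upper=UPPER):
    """Label an alpha value relative to the acceptance range."""
    if alpha < lower:
        return "below range (possibly unreliable)"
    if alpha > upper:
        return "above range (possibly redundant)"
    return "acceptable"


for step, alpha in ALPHA_BY_STEP.items():
    print(f"Step {step}: alpha = {alpha:.2f} -> {classify_alpha(alpha)}")
```

Under this rule the Step 1 and Step 2 variants fall just below the range, while the Step 3 and Step 4 variants fall within it.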

Factor Analysis

Principal component factor analyses were performed for each step of the revision to explore for latent factors and to determine whether the final modified version has latent structural components corresponding to both kinship and friendship networks. The number of factors found in each step of the analysis was determined by considering factors with eigenvalues over one (Kaiser, 1960) and by identifying the elbow in the scree plot tests (Cattell, 1966). The factor loadings were subjected to varimax rotation.
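The eigenvalue-over-one rule can be illustrated with a toy example. The eigenvalues below are entirely hypothetical (the article does not report any); they are chosen only so that they sum to 10, as the eigenvalues of a 10-item correlation matrix must. The function names are likewise assumptions.

```python
def kaiser_retained(eigenvalues):
    """Keep components whose eigenvalue exceeds one (Kaiser criterion)."""
    return [value for value in eigenvalues if value > 1.0]


def variance_explained(eigenvalues):
    """Share of total variance attributable to each component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


# Hypothetical eigenvalues for a 10-item scale, for illustration only.
hypothetical_eigenvalues = [3.1, 2.2, 0.9, 0.8, 0.7, 0.6, 0.5, 0.5, 0.4, 0.3]

retained = kaiser_retained(hypothetical_eigenvalues)
shares = variance_explained(hypothetical_eigenvalues)
print(f"Components retained: {len(retained)}")
print(f"Variance explained by retained components: {sum(shares[:len(retained)]):.0%}")
```

A scree plot of the same values would show the "elbow" after the second component, which is the second check mentioned in the text.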

Table 2 shows the rotated factor matrix for the original LSNS administered to the present sample. Although previous studies have reported three factors (Lubben, 1988; Lubben & Gironda, 1997), the rotated factor structure for the LSNS here showed a two-factor solution, with one factor consisting largely of family-related items and the other consisting primarily of items concerning friendships. However, the former, i.e., the family factor, also incorporates items concerning confidant relationships and the "helps others" item. Both confidant items clearly load onto the family factor in this step, but the "living arrangements" item cross-loads on both the family and friend factors.


Table 2
Step 1: Original 10-Item LSNS Factor Matrix

                                                                        Family    Friend
Item                                                                    Factor    Factor
L3   Family: discuss private matters/call on for help                   .7298     .1242
L8   Is confidant                                                       .6743     .0377
L1   Family: number in contact                                          .6633    -.0639
L2   Family: frequency of contact with family member most in contact    .6629    -.0761
L9   Helps others                                                       .5702    -.0269
L7   Has confidant                                                      .5391     .2047

L4   Friends: discuss private matters/call on for help                  .2606     .7807
L5   Friends: number in contact                                         .2072     .7520
L6   Friends: frequency of contact with friend most in contact         -.1082     .5225
L10  Living arrangements                                               -.3490     .4461

Table 3
Step 2: Factor Matrix after Eliminating Items L9 and L10

                                                                        Family    Friend
Item                                                                    Factor    Factor
L3   Family: discuss private matters/call on for help                   .7722     .1255
L2   Family: frequency of contact with family member most in contact    .7393    -.1103
L1   Family: number in contact                                          .6807    -.0218
L8   Is confidant                                                       .6455     .0850
L7   Has confidant                                                      .5622     .2040

L4   Friends: discuss private matters/call on for help                  .2105     .8162
L5   Friends: number in contact                                         .1514     .7810
L6   Friends: frequency of contact with friend most in contact         -.1230     .5558

Table 4
Step 3: Factor Matrix after Disaggregating Items L3 and L4

                                              Family    Friend    Confidant
Item                                          Factor    Factor    Factor
L1   Family: number in contact                .8133     .0029     .0604
L3A  Family: discuss private matters          .8020     .0959     .2442
L3B  Family: call on for help                 .7775     .1226     .2545

L4B  Friends: call on for help                .0522     .7984     .1707
L5   Friends: number in contact               .1873     .7653    -.0319
L4A  Friends: discuss private matters         .1632     .7504     .0523
L6   Friends: frequency of contact           -.1817     .4877     .0685

L7   Has confidant                            .0425     .1524     .8276
L8   Is confidant                             .2365     .1647     .6689
L2   Family: frequency of contact             .4253    -.1524     .6166

In the second step, items L9 and L10 were eliminated as planned due to their general poor performance in previously discussed studies. Similar problems are demonstrated in the current study by the heavy cross-loading found for L10 in Step 1. As in Step 1, a two-factor structure was found (Table 3) and the "confidant" items were found to clearly load on the family factor. No heavy cross-loading was found for the scale variant produced in this step.

Table 4 shows the factor structure found in Step 3. In this step, the double-barreled items L3 and L4 ("talk about private matters" and "call on for help") were disaggregated. Principal component factor analysis performed here indicated a three-factor solution, with items loading on a family factor, a friendship factor, and a confidant factor. Generally the factors are clean (i.e., for each item, there is predominant loading on one factor and little loading on the others). However, the family "frequency of contact" item (L2) cross-loads on both the family and confidant factors.

Table 5
Step 4: LSNS-R Factor Matrix

                                                                        Family    Friend      Friend
Item                                                                    Factor    Factor A    Factor B
L3B  Family: call on for help                                           .7600    -.0352       .2327
L8A  Family: has confidant                                              .7402     .0848      -.0024
L7A  Family: is confidant                                               .7358     .2490      -.0384
L3A  Family: discuss private matters                                    .7345    -.0292       .2320
L2   Family: frequency of contact with family member most in contact    .7134     .0514      -.1438
L1   Family: number in contact                                          .6712    -.1576       .1565

L8B  Friends: has confidant                                             .0487     .8800       .1279
L7B  Friends: is confidant                                              .1907     .8488       .1140
L6   Friends: frequency of contact with friend most in contact         -.1428     .5493       .1856

L5   Friends: number in contact                                         .0915     .0594       .8467
L4B  Friends: call on for help                                          .0839     .2711       .7663
L4A  Friends: discuss private matters                                   .0922     .4920       .6028

Step 4 involved distinguishing family and friends as both possible sources and possible targets of confidant relationships, and resulted in the LSNS-R. Principal component factor analysis in this step (Table 5) revealed a single, clean family factor and two friendship factors. The friendship confidant items (L7B, L8B) and the frequency of contact with a friend item (L6) constitute one of the friendship factors, while the remaining friendship items make up a second friendship factor. The item on being able to talk to a friend about private matters (L4A) loads on both friendship factors.
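One way to make "clean" versus "cross-loading" operational is to flag any item with more than one loading at or above some cutoff in absolute value. The sketch below applies that idea to the Table 5 loadings, with the three columns taken here as Family, Friend A, and Friend B. The article does not state the cutoff the authors used; the 0.30 threshold and the function name are assumptions made for illustration.

```python
# Loadings from Table 5, read here as (Family, Friend A, Friend B).
TABLE_5 = {
    "L3B": (0.7600, -0.0352, 0.2327),
    "L8A": (0.7402, 0.0848, -0.0024),
    "L7A": (0.7358, 0.2490, -0.0384),
    "L3A": (0.7345, -0.0292, 0.2320),
    "L2": (0.7134, 0.0514, -0.1438),
    "L1": (0.6712, -0.1576, 0.1565),
    "L8B": (0.0487, 0.8800, 0.1279),
    "L7B": (0.1907, 0.8488, 0.1140),
    "L6": (-0.1428, 0.5493, 0.1856),
    "L5": (0.0915, 0.0594, 0.8467),
    "L4B": (0.0839, 0.2711, 0.7663),
    "L4A": (0.0922, 0.4920, 0.6028),
}


def cross_loading_items(loadings, threshold=0.30):
    """Return items with two or more loadings at or above the threshold."""
    flagged = []
    for item, values in loadings.items():
        strong = [value for value in values if abs(value) >= threshold]
        if len(strong) > 1:
            flagged.append(item)
    return flagged


print(cross_loading_items(TABLE_5))  # ['L4A']
```

With this assumed cutoff, only L4A is flagged, which agrees with the observation that it loads on both friendship factors.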

Item-Total Scale Correlations

Item-total scale correlational analysis yielded coefficients ranging from 0.27 to 0.75, indicating that LSNS-R items are sufficiently homogeneous and without excessive redundancy. All internal reliability coefficients fell within the acceptable range suggested by Streiner and Norman (1995). The correlation coefficient between the original LSNS and LSNS-R was 0.68.

Conclusion

As gerontologists and geriatricians begin to identify means to increase active life expectancy rather than mere life expectancy, it is likely that older persons' social support networks will be shown to be necessary to healthy aging. This means that there is increasing need for a variety of reliable and valid social support scales for use in research and practice settings. The work discussed here should be viewed as part of an ongoing pursuit of such well-constructed social integration scales, for improved measures of social support networks are essential to understanding better the reported link between social integration and health. Such improved knowledge thus will enhance future gerontological research, geriatric care, and the quality of life of the elderly.

From applied research and clinical perspectives, there is growing pressure to develop short and efficient scales. Some elderly populations are unable to complete long questionnaires, and time constraints in most clinical practice settings necessitate use of efficient and effective screening tools. Shorter scales require less time and energy of both the administrator and respondent. Thus, parsimonious and effective screening tools are needed that are acceptable to elders, researchers, and health care providers alike.


For more basic researchers, somewhat longer research instruments are desirable, because having a larger number of items as well as better clarity of concepts generally contributes to a scale's reliability and facilitates the analysis and appraisal of subtle differences. But rather than attempting to design a single social support network scale for use with all elderly populations and for all research purposes, it seems far more practical to design measurement instruments for specific populations along with clear indications of how they should be used (Mitchell & Trickett, 1980). Use of such well-targeted social support network assessment instruments will yield more valid research results than use of less well-targeted measurement tools.

The original LSNS was designed specifically for an elderly population. Although it has proven adaptable to a variety of settings, some deficiencies have been noted over the past ten years of use. The LSNS-R addresses these problems, resulting in an improved measure of social support networks. The refinements are theory driven and involved reworking items in the original LSNS so that the revised scale can better measure the distinct aspects of kinship and friendship networks (Lubben & Gironda, in press). An abbreviated version of the LSNS-R has been developed and is reported elsewhere (Lubben & Gironda, 2000). This six-item scale (the LSNS-6) can be especially suitable in practice settings as a screening tool for social isolation or for more general use in those research settings where longer social support network scales cannot be accommodated. For those social and behavioral researchers desiring more extensive inquiry into the nature of social relationships of the elderly, an 18-item version of the LSNS has also been developed (Pourat et al., 1999; Pourat et al., 2000). The major advantage of the LSNS-18 over the LSNS-R is that the former distinguishes friendship ties with neighbors from those with friends who do not live in close proximity to the respondent. Such distinctions are desirable for exploring a growing number of social and behavioral research questions regarding the functioning of social support networks.


In summary, development and validation of social support network instruments are cumulative and ongoing processes. They require testing and retesting with diverse populations, in various research and practice settings, and using both psychometric and practical standards to assess their actual utility. In addition, future analyses of these scales should include assessment of their sensitivity to various differences within and between groups, for example, cultural and socio-demographic differences, or differences in levels of health and functional status that might affect response patterns.

The four-step process described above continues in the tradition of instrumentation refinement, tailoring new measurement tools to address special needs as well as offering a revised version of the LSNS incorporating modifications that greatly enhance its psychometric properties. This work also offers a paradigm that could be employed to evaluate and refine other social support network measurement tools.

References

Berkman, L.F., & Syme, S.L. (1979). Social networks, host resistance, and mortality: A nine-year follow-up study of Alameda County residents. American Journal of Epidemiology, 109, 186-204.

Burgess, R.L., & Huston, T.L. (Eds.) (1979). Social exchange in developing relationships. New York: Academic Press.

Cattell, R.B. (1966). The meaning and strategic use of factor analysis. In R.B. Cattell (Ed.), Handbook of multivariate experimental psychology. Chicago: Rand McNally.

Chou, K.L., & Chi, I. (1999). Determinants of life satisfaction in Hong Kong Chinese elderly: A longitudinal study. Aging and Mental Health, 3, 328-335.

Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.

DeVellis, R.F. (1991). Scale development: Theory and applications. Newbury Park, CA: Sage.

Dorfman, R., Lubben, J.E., Mayer-Oakes, A., Atchison, K.A., Schweitzer, S.O., Dejong, P., et al. (1995). Screening for depression among the well elderly. Social Work, 40, 295-304.


Gironda, M.W., Lubben, J.E., & Atchison, K.A. (1998). Social support networks of elders without children. Journal of Gerontological Social Work, 27, 63-84.

Glass, T.A., Mendes de Leon, C.F., Seeman, T.E., & Berkman, L.F. (1997). Beyond single indicators of social networks: A LISREL analysis of social ties among the elderly. Social Science and Medicine, 44, 1503-1507.

Guadagnoli, E., & Velicer, W. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103, 265-275.

House, J.S., Landis, K.R., & Umberson, D. (1988). Social relationships and health. Science, 241, 540-545.

Hurwicz, M.L., & Berkanovic, E. (1993). The stress process in rheumatoid arthritis. Journal of Rheumatology, 20, 11.

Jung, J. (1990). The role of reciprocity in social support. Basic and Applied Social Psychology, 11, 243-253.

Kaiser, H.F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141-151.

Keith, P.M., Hill, K., Goudy, W.J., & Power, E.A. (1984). Confidants and well-being: A note on male friendship in old age. Gerontologist, 24, 318-320.

Litwak, E. (1985). Helping the elderly: The complementary role of informal networks and formal systems. New York: Guilford.

Lubben, J.E. (1988). Assessing social networks among elderly populations. Family Community Health, 11(3), 42-52.

Lubben, J.E., & Gironda, M. (1997). Social support networks among older people in the United States. In H. Litwin (Ed.), The Social Networks of Older People. Westport, CT: Praeger.

Lubben, J.E., & Gironda, M.W. (2000). Social support networks. In D. Osterweil, J. Beck, & K. Brummel-Smith (Eds.), Comprehensive Geriatric Assessment: A Guide for Healthcare Providers. New York: McGraw-Hill.

Lubben, J.E., & Gironda, M.W. (in press). Centrality of social ties to the health and well-being of older adults. In B. Berkman & L. Harootyan (Eds.), Gerontological social work in the emerging health care world. New York: Springer.

Lubben, J.E., Weiler, P.G., & Chi, I. (1989). Gender and ethnic differences in the health practices of the elderly poor. Journal of Clinical Epidemiology, 42, 725-733.


Luggen, A.S., & Rini, A.G. (1995). Assessment of social networks and isolation in community-based elderly men and women. Geriatric Nursing, 16, 179-181.

Martire, L.M., Schulz, R., Mittelmark, M.B., & Newsom, J.T. (1999). Stability and change in older adults' social contact and social support: The Cardiovascular Health Study. Journals of Gerontology: Series B: Psychological and Social Sciences, 54B, S302-S311.

McDowell, I., & Newell, C. (1987). Measuring health: A guide to rating scales and questionnaires. New York: Oxford University Press.

Mistry, R., Rosansky, J., McGuire, J., McDermott, C., Jarvik, L., and the UPBEAT Collaborative Group (2001). Social isolation predicts re-hospitalization in a group of older American veterans enrolled in the UPBEAT Program. International Journal of Geriatric Psychiatry, 16, 950-959.

Mitchell, R., & Trickett, E. (1980). Social networks as mediators of social support: An analysis of the effects and determinants of social networks. Community Mental Health Journal, 16, 27-44.

Moon, A., Lubben, J.E., & Villa, V.M. (1998). Awareness and utilization of community long term care services by elderly Korean and non-Hispanic White Americans. Gerontologist, 38, 309-316.

Mor-Barak, M.E. (1997). Major determinants of social networks in frail elderly community residents. Home Health Care Services Quarterly, 16, 121-137.

Mor-Barak, M.E., Miller, L.S., & Syme, L.S. (1991). Social networks, life events, and health of the poor frail elderly: A longitudinal study of the buffering versus the direct effect. Family Community Health, 14(2), 1-13.

Mor-Barak, M.E., & Miller, L.S. (1991). A longitudinal study of the causal relationship between social networks and health of the poor frail elderly. Journal of Applied Gerontology, 10, 293-310.

Nunnally, J.C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

Okwumabua, J.O., Baker, P.M., Wong, S.P., & Pilgrim, B.O. (1997). Characteristics of depressive symptoms in elderly urban and rural African Americans. Journals of Gerontology: Biological Sciences and Medical Sciences, 52A, M241-M246.

O'Reilly, P. (1988). Methodological issues in social support and social network research. Social Science and Medicine, 26, 863-873.

Potts, M.K., Hurwicz, M.L., Goldstein, M.S., & Berkanovic, E. (1992). Social support, health-promotive beliefs, and preventive health behaviors among the elderly. Journal of Applied Gerontology, 11, 425-440.


Pourat, N., Lubben, J., Yu, H., & Wallace, S. (2000). Perceptions of health and use of ambulatory care: Differences between Korean and White elderly. Journal of Aging and Health, 12(1), 112-134.

Pourat, N., Lubben, J., Wallace, S., & Moon (1999). Predictors of use of traditional Korean healers among elderly Koreans in Los Angeles. Gerontologist, 39, 711-719.

Rubenstein, L.Z., Aronow, H.U., Schloe, M., Steiner, A., Alessi, C.A., Yuhas, K.E., Gold, M., Kemp, K., Nisenbaum, R., Stuck, A., & Beck, J.C. (1994). A home-based geriatric assessment: Follow-up and health promotion program: Design, methods, and baseline findings from a 3-year randomized clinical trial. Aging, 6, 105-120.

Rubenstein, R.L., Lubben, J.E., & Mintzer, J.E. (1994). Social isolation and social support: An applied perspective. Journal of Applied Gerontology, 13, 58-72.

Sauer, W.J., & Coward, R.T. (1985). Social support networks and the care of the elderly: Theory, research and practice. New York: Springer.

Siegel, J.M. (1990). Stressful life events and use of physician services among the elderly: The moderating role of pet ownership. Journal of Personality and Social Psychology, 58, 1081-1086.

Stevens, J. (1992). Applied multivariate statistics for the social sciences (2nd ed.). Hillsdale, NJ: Erlbaum.

Steiner, A., Raube, K., Stuck, A.E., Aronow, H.U., Draper, D., Rubenstein, L.Z., & Beck, J. (1996). Measuring psychosocial aspects of well-being in older community residents: Performance of four short scales. Gerontologist, 36, 54-62.

Streiner, D.L., & Norman, G.R. (1995). Health measurement scales: A practical guide to their development and use (2nd ed.). New York: Oxford University Press.

Villa, V.M., Wallace, S.P., Moon, A., & Lubben, J.E. (1997). A comparative analysis of chronic disease prevalence among older Koreans and non-Hispanic Whites. Journal of Family and Community Health, 20, 1-12.

Winemiller, D.R., Mitchell, M.E., Sutliff, J., & Cline, D.J. (1993). Measurement strategies in social support: A descriptive review of the literature. Journal of Clinical Psychology, 49, 638-648.


James Lubben, DSW, MPH is Professor of Social Welfare and Urban Planning at the University of California, Los Angeles (UCLA). Both his DSW and MPH are from the University of California, Berkeley. He is Principal Investigator for the Hartford Doctoral Fellows Program in Geriatric Social Work, a program administered by the Gerontological Society of America, and a consultant to the World Health Organization-Kobe Centre on health and welfare systems development for aging societies. Dr. Lubben's research examines social behavioral determinants of vitality in old age, with a particular focus on the roles of social support networks.

Melanie Gironda, PhD, MSW is a Lecturer in the Department of Social Welfare at UCLA where she teaches courses on social gerontology and research on aging. She received both her PhD and MSW from UCLA. Dr. Gironda is Deputy Program Director of the Hartford Doctoral Fellows Program in Geriatric Social Work. Her research examines loneliness in various populations of the elderly, with a special focus on the nature of social support networks of older adults without children.

Alex E. Y. Lee, PhD, MSW is Assistant Professor in the Department of Social Work and Psychology at the National University of Singapore where he teaches courses on social work and gerontology. He received both his PhD and MSW from UCLA. His current research concerns social service delivery systems for the elderly and family-based gerontological counseling for Asian elders.

In small proportions, we just beauties see,
And in short measures life may perfect be.

Ben Jonson


Self-Estimated Intelligence

Adrian Furnham

Do beliefs about one's ability affect scores on an ability test? Is it important to believe that you are bright to do well on an ability/intelligence test? What do people think about their own intelligence? How do they estimate their own intelligence and that of their relatives? What determines their self-estimates?

Studies of self-estimated intelligence date back nearly a quarter of a century (Hogan, 1978). More recently Beloff (1992) provoked a great deal of interest in the effects of gender differences on self-estimated intelligence. Among her Scottish undergraduates, she found a six-point difference, with males estimating their score significantly higher than females. Additional work led her to conclude that "The younger women see themselves as intellectually inferior compared to young men .... Women see equality with their mothers, men with their fathers and men superior to their mothers. Mothers therefore come out as inferior to fathers. This pattern has been consistent each year." Beloff argued that the modesty training girls receive in socialization accounts for these data.

Beloff's research stimulated others to replicate these gender-related differences in self-estimated intelligence (Bennett, 1996, 1997, 2000; Byrd & Stacey, 1993; Furnham & Rawles, 1995, 1999). These studies were in four related research areas: (a) studies to replicate and explain these differences in various countries and relate national/cultural variables to them; (b) research to see if these gender-related differences can be replicated not only for overall (general) intelligence, but also for more specific and multiple intelligences; (c) work to examine whether these differences occur in estimations of the intelligence of others, notably male and female members of one's immediate family (grandparents, parents, siblings, children); (d) research to examine the relationship between estimated intelligence and "actual" intelligence scores. Furnham (2001) reported on various such studies, all but one of which showed significant gender differences in self-rated overall IQ ranging from 3.9 to 8.6 points.

Table 1. Results of Studies Where Participants Made an Overall IQ (g) Rating on Themselves and Others.

Study                                   Women       Men         Difference

Beloff (1992) - Scotland                (N = 502)   (N = 265)
  Self                                  120.5       126.9        6.4
  Mother                                119.9       118.7       -1.2
  Father                                127.7       125.2       -2.5

Byrd & Stacey (1993) - New Zealand      (N = 105)   (N = 112)
  Self                                  121.9       121.5       -0.4
  Mother                                114.5       106.5       -9.0
  Father                                127.9       122.3       -5.6
  Sister                                118.2       110.5       -7.7
  Brother                               114.1       116.0        1.9

Bennett (1996) - Scotland               (N = 96)    (N = 48)
  Self                                  109.4       117.1        7.7

Reilly & Mulhern (1995) - Ireland       (N = 80)    (N = 45)
  Self                                  105.3       113.9        8.6
  Measured                              106.9       106.1       -0.8

Furnham & Rawles (1995) - England       (N = 161)   (N = 84)
  Self                                  118.48      123.31       6.17
  Mother                                108.70      109.12       0.72
  Father                                114.18      116.09       1.91

Furnham & Rawles (1996) - England       (N = 140)   (N = 53)
  Self                                  116.64      120.50       3.9

Furnham & Gasson (1998) - England       (N = 112)   (N = 72)
  Self                                  103.84      107.99       4.15
  Male child (1st child)                107.69      109.70       2.01
  Female child (1st child)              102.57      102.36      -0.21

Furnham & Reeves (1998) - England       (N = 84)    (N = 72)
  Self                                  104.84      110.15       5.31
  Male child (1st son)                  116.09      114.32      -1.77
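The Difference column in Table 1 appears to be the men's mean minus the women's mean. The sketch below recomputes that column from the printed means and lists the rows where the printed difference does not match. The 0.05 tolerance for rounding is an assumption, and the table alone cannot show whether a mismatch lies in the means or in the printed difference.

```python
# (study, target, women's mean, men's mean, difference as printed in Table 1)
ROWS = [
    ("Beloff (1992)", "Self", 120.5, 126.9, 6.4),
    ("Beloff (1992)", "Mother", 119.9, 118.7, -1.2),
    ("Beloff (1992)", "Father", 127.7, 125.2, -2.5),
    ("Byrd & Stacey (1993)", "Self", 121.9, 121.5, -0.4),
    ("Byrd & Stacey (1993)", "Mother", 114.5, 106.5, -9.0),
    ("Byrd & Stacey (1993)", "Father", 127.9, 122.3, -5.6),
    ("Byrd & Stacey (1993)", "Sister", 118.2, 110.5, -7.7),
    ("Byrd & Stacey (1993)", "Brother", 114.1, 116.0, 1.9),
    ("Bennett (1996)", "Self", 109.4, 117.1, 7.7),
    ("Reilly & Mulhern (1995)", "Self", 105.3, 113.9, 8.6),
    ("Reilly & Mulhern (1995)", "Measured", 106.9, 106.1, -0.8),
    ("Furnham & Rawles (1995)", "Self", 118.48, 123.31, 6.17),
    ("Furnham & Rawles (1995)", "Mother", 108.70, 109.12, 0.72),
    ("Furnham & Rawles (1995)", "Father", 114.18, 116.09, 1.91),
    ("Furnham & Rawles (1996)", "Self", 116.64, 120.50, 3.9),
    ("Furnham & Gasson (1998)", "Self", 103.84, 107.99, 4.15),
    ("Furnham & Gasson (1998)", "Male child", 107.69, 109.70, 2.01),
    ("Furnham & Gasson (1998)", "Female child", 102.57, 102.36, -0.21),
    ("Furnham & Reeves (1998)", "Self", 104.84, 110.15, 5.31),
    ("Furnham & Reeves (1998)", "Male child", 116.09, 114.32, -1.77),
]

TOLERANCE = 0.05  # assumed allowance for rounding in the printed column


def mismatched_rows(rows, tolerance=TOLERANCE):
    """Return rows whose printed difference is not men's mean minus women's mean."""
    flagged = []
    for study, target, women, men, printed in rows:
        computed = men - women
        if abs(computed - printed) > tolerance:
            flagged.append((study, target, round(computed, 2), printed))
    return flagged


if __name__ == "__main__":
    for study, target, computed, printed in mismatched_rows(ROWS):
        print(f"{study}, {target}: computed {computed}, printed {printed}")
```

Rows that pass confirm the sign convention: a positive value means men gave the higher estimate.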

Beginning with Beloff (1992), a number of these studies looked at estimates of relations' IQs (Furnham, Fong, & Martin, 1999). These studies showed that people believe there are clear generational effects in IQ - they believe that they are a little brighter than their mothers and certainly much brighter than their grandparents, while parents tend to believe their children are brighter than they are. In general, a half standard deviation (6-8 points) difference was found in estimated IQ between generations. Gender differences were also found for estimated IQ of relatives (Flynn, 1987). Thus people believe their grandfathers are brighter than their grandmothers, their fathers brighter than their mothers, their brothers brighter than their sisters, and their sons brighter than their daughters. Interestingly, for parents estimating their children, the results were stronger for first-borns compared to those born later, indicating the possible working of the principle of primogeniture.

These results have been shown to be cross-culturally robust as the gender difference effect has been demonstrated in Asia (Japan, Hong Kong), Africa (Uganda, South Africa), Europe (Belgium, Britain, but not Slovakia) and America (Furnham, et al., 2002) as well as New Zealand (Furnham & Ward, 2001) and Iran (Furnham, Shahidi, & Baluch, 2002). Thus there seems to be a robust and cross-culturally valid finding that there is a clear, consistent gender difference in self-rating of overall intelligence, with males rating themselves and their male relations higher than females rate themselves and their female relations.

It is also interesting to note that it is very rare for people to rate their scores, or indeed those of relations, as below average (<100 IQ points). For example, it was found that students tended to believe their IQ to be about one-and-a-half standard deviations above the mean (Furnham, Clark, & Bailey, 1999) - around 120, whereas nonstudent British adults believed their IQ to be around a half standard deviation above the norm (Furnham, 2000).
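The standard-deviation figures quoted above can be converted to IQ points with a line of arithmetic. The sketch assumes the common deviation-IQ convention of a mean of 100 and a standard deviation of 15; the article does not state which convention the cited studies used, and some tests use 16.

```python
MEAN_IQ = 100
SD_IQ = 15  # assumed convention; not stated in the article


def iq_from_sd_units(z, mean=MEAN_IQ, sd=SD_IQ):
    """Convert a distance from the mean, in standard deviations, to an IQ score."""
    return mean + z * sd


def sd_units_from_iq(iq, mean=MEAN_IQ, sd=SD_IQ):
    """Convert an IQ score to its distance from the mean in standard deviations."""
    return (iq - mean) / sd


if __name__ == "__main__":
    print("1.5 SD above the mean:", iq_from_sd_units(1.5))
    print("0.5 SD above the mean:", iq_from_sd_units(0.5))
    print("An estimate of 120 in SD units:", round(sd_units_from_iq(120), 2))
```

Under this assumption a half standard deviation is 7.5 points, which sits inside the 6-8 point range given for generational differences, and one-and-a-half standard deviations is 122.5, consistent with "around 120."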

Multiple Intelligence

Over the past decade there have been many attempts to redefine intelligence and define different types of intelligence. Thus we have emotional intelligence as well as practical intelligence, for example, but perhaps the idea that has appealed most to lay people (not academics, however) is Gardner's (1999) multiple intelligence theory. Gardner (1983) initially argued that there were seven types of intelligence:

• Verbal or linguistic intelligence (the ability to use words),
• Logical or mathematical intelligence (the ability to reason logically, to solve number problems),
• Spatial intelligence (the ability to find one's way around one's environment and form mental images),
• Musical intelligence (the ability to perceive and produce pitch and rhythm),
• Body-kinetic intelligence (the ability to carry out motor movements, for example, in performing surgery, ballet dancing, or playing sports),
• Interpersonal intelligence (the ability to understand people),
• Intrapersonal intelligence (the ability to understand oneself and develop a sense of one's identity).

He suggested that the first two are those types valued at school, the next three are valuable in the arts, and the latter two constitute a sort of "emotional intelligence."

At least eight studies have examined gender differences in estimates of Gardner's (1983, 1999) seven multiple intelligences. The results have been consistent and give an important clue into gender differences in overall performance. Most people rate their own interpersonal and intrapersonal intelligence as very high, about 1 SD (standard deviation) above the mean, and their musical and body-kinetic intelligence as strictly average, around 100. Of the remaining three types of intelligence in the Gardner model (verbal, mathematical, and spatial), gender differences were found only in mathematical and spatial intelligence, particularly in spatial intelligence, where females typically rate themselves 6 to 10 points lower than do males. No gender differences were found in self-estimations of verbal intelligence (Furnham, 2001).

Some studies have asked participants to estimate their overall (g, general intelligence) score first and then their scores on the seven specific multiple intelligences. This has made it possible to regress the overall intelligence estimate simultaneously on the seven multiple intelligence estimates. These regression studies show that people believe that mathematical (logical), then spatial, and then verbal intelligence are the primary contributors to overall IQ. This, in turn, allowed for testing the hypothesis that lay conceptions of intelligence are male normative in the sense that those abilities that men tend to be better at are those that most people consider to be the essence of intelligence. These studies showed that lay people tend to confuse mathematical/spatial and overall intelligence, thereby explaining the consistent gender differences in estimates of overall score. This means that quite possibly the often-observed and debated gender difference in spatial IQ accounts for the difference in overall IQ.
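The regression described above treats the overall estimate as the outcome and the specific estimates as simultaneous predictors. A minimal sketch follows, using ordinary least squares via the normal equations. The data and the weights (0.5, 0.3, 0.1) are invented so that the recovered coefficients can be checked; they are not findings from the studies, and only three of the seven intelligences are included to keep the example short.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, outcome):
    """Return [intercept, b1, b2, ...] minimizing squared error."""
    x = [[1.0] + list(row) for row in predictors]  # prepend intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(x, outcome)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical self-estimates per participant: (mathematical, spatial, verbal).
ESTIMATES = [
    (110, 105, 115),
    (120, 100, 110),
    (100, 115, 105),
    (125, 120, 100),
    (105, 95, 120),
    (115, 110, 95),
]

# Invented weights used to construct the overall estimate for this demo.
TRUE_WEIGHTS = (10.0, 0.5, 0.3, 0.1)  # intercept, mathematical, spatial, verbal

OVERALL = [
    TRUE_WEIGHTS[0] + TRUE_WEIGHTS[1] * m + TRUE_WEIGHTS[2] * s + TRUE_WEIGHTS[3] * v
    for m, s, v in ESTIMATES
]

if __name__ == "__main__":
    names = ["intercept", "mathematical", "spatial", "verbal"]
    for name, coefficient in zip(names, ols(ESTIMATES, OVERALL)):
        print(f"{name}: {coefficient:.3f}")
```

In real data the coefficients would come with standard errors, and the ordering of their sizes is what supports a statement such as "mathematical, then spatial, then verbal."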

Correlations Between Self-Estimated and Test-Measured IQ

Are self-estimates accurate? In other words, is the correlation between (valid) IQ test scores and self-estimated scores high, low, or "on the mark"? This is an important question because some psychologists have suggested that if these scores correlate highly, self-estimates may serve as useful measures at a fraction of the time, effort, and cost of administering and scoring IQ tests.

Various studies have found that correlations of actual and self-estimated IQ are around r = 0.30 and that therefore self-estimates cannot serve as proxy measures of IQ (Paulus, Lysy, & Yik, 1998). One study, however, examined the effect of outliers and concluded that, if these outliers are removed, self-estimated and actual IQ correlate highly (r > 0.90).
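How strongly a few outliers can depress a correlation is easy to show with toy numbers. In the sketch below the scores are made up, and the rule for calling a pair an outlier (self-estimate more than 20 points from the measured score) is an assumption; the article does not say how the cited study defined outliers.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up (measured IQ, self-estimated IQ) pairs. The last two pairs are
# deliberately discrepant to play the role of outliers.
PAIRS = [
    (90, 92), (95, 96), (100, 101), (105, 104),
    (110, 111), (115, 116), (120, 119), (125, 126),
    (85, 135), (130, 95),
]

MAX_GAP = 20  # assumed outlier rule: |self-estimate - measured| > 20 points


def correlation(pairs):
    measured = [m for m, _ in pairs]
    estimated = [e for _, e in pairs]
    return pearson(measured, estimated)


def without_outliers(pairs, max_gap=MAX_GAP):
    """Keep only pairs whose self-estimate is within max_gap of the measured score."""
    return [(m, e) for m, e in pairs if abs(e - m) <= max_gap]


if __name__ == "__main__":
    print("all pairs:        r =", round(correlation(PAIRS), 2))
    print("outliers removed: r =", round(correlation(without_outliers(PAIRS)), 2))
```

The same property cuts both ways: removing cases after seeing the data can inflate a correlation, so any exclusion rule would need to be set in advance.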


Reilly and Mulhern (1995) note that IQ-estimates research should not be based on the "assumption that gender differences at group level represent a generalized tendency on the part of either gender to either overconfidence or lack of confidence with regard to their own intelligence" (Reilly & Mulhern). Studies do show, however, a tendency for males to overestimate and females to underestimate their score [?reference(s)] but this is related, in part, to the IQ test used.

Some researchers have tried to understand and then increase the correlation between self-estimates and test scores by using more tests on bigger populations, yet the correlations remained the same, as noted above, around r = 0.30 (Borkenau & Liebler, 1993). Perhaps the explanation should be sought in motivational factors that may be involved in the self-estimation of intelligence and that may lead to serious distortions in estimated scores. Thus a close examination of the conditions and instructions under which participants self-estimate their intelligence may provide a clue as to how they make their self-estimates. For example, if social norms and conventions in part dictate how people respond, then anonymity in responding may reduce the distortions in self-estimates.

From Whence the Differences?

Gender differences in self-estimated IQ need to be explained because the issue is so frequently discussed, there has been an academic consensus for many years to the effect that gender differences are minimal, and since before the Second World War, test constructors have been careful to produce IQ tests that minimize gender differences.

Essentially there are three positions on the gender differences of estimates issue (Furnham, 2001). The "feminist" position was clearly articulated by Beloff (1992) in her first study, where she suggested that these differences are erroneous and simply due to attribution errors. One explanation she offers is a gender difference in modesty and humility. "Modesty-training is given to girls. Modesty and humility are likely to be connected to the over estimates of women and for women" (Beloff). Another explanation she proposes is that because IQ is correlated with occupational grade and men tend to occupy certain higher status occupations more than women (for political rather than ability reasons in her view), females tend to believe that they are less intelligent than males. There seem to be no direct data to test Beloff's assumptions, and they look a little outdated, particularly in the case of America where students still show the same gender difference pattern despite their more equal socialization and occupational choice. However, unusual data from Slovakia may be evidence for this position. Slovakian females awarded themselves higher overall (g) and verbal scores than did equivalent Belgian and British student groups (Furnham, Rakow, Samany-Schuller, & De Fruyt, 1999). The authors offered the following possible explanation for this unique group of confident females: It may well be that under the pressure of socialist governments of Eastern Europe, the role of females in society was somewhat different from that of women in capitalist Western Europe - the former took a more active role in the economy and were socialized differently in school. In fact, Slovakian society attached high prestige to education and the government made a consistent effort to improve the position of women in society by encouraging them to obtain educational qualifications, facilitating their employment in traditionally male-dominated occupations, and even mandating a given percentage of women in its parliament. Another explanation they present is that among the various nationalities studied, the Slovakian women had the most experience in taking intelligence tests and were therefore presumably more likely to recognize that gender differences in intelligence are actually very small (Furnham, et al.).

Consequences of Gender Differences in Ability Estimates

In a series of studies, Beyer (1990) demonstrated gender differences in self-expectations, self-evaluations, and performance on ability-related tasks. Her results support the "male hubris, female humility" thesis of Beloff. Further, she argued that gender differences in self-evaluations affect expectancies of success and failure, and, ultimately, performance on ability-related tasks. Thus the importance of studies of self-estimated intelligence may lie not only in exploring lay theories of intelligence, but also in understanding the self-fulfilling nature of self-evaluation of ability. "Because of the serious implications of under-estimations for self-confidence and psychological health more attention should be devoted to the investigation of gender differences in the accuracy of self-evaluations. Such research will not only elucidate the underlying processes of self-evaluation biases and therefore be of theoretical interest but will also be of practical value by suggesting ways of eliminating women's under-estimation of their performance" (Beyer).

Beyer (1998) reviewed studies concluding that individuals make poor self-evaluators. In one such study, the correlation between medical students' self-rated knowledge and exam grades was found to be almost exactly zero (r = -0.01). In another study Beyer reviewed, the correlation between self-perceptions of one's own physical attractiveness and judges' ratings of their attractiveness was 0.22. Another study she considered found a correlation of 0.32 between how one judged one's own performance on a test of managerial skill and how independent experts rated it. "Interestingly, outside evaluators seem to be better assessors of a target's performance than the target her/himself" (Beyer). Thus it seems that while self-estimates of intelligence may not be useful as proxy IQ tests, various studies (Paulus et al., 1998) have shown that they could be very useful in explaining some of the variability in actual test results through the self-expectation effect that Beyer (1999) discussed.
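One way to read correlations of this size is to square them, which gives the proportion of variance the two measures share. The sketch below applies that standard conversion to the coefficients quoted in this article; the dictionary labels are shorthand introduced here for illustration.

```python
# Correlations quoted in the text; the labels are informal shorthand.
REPORTED = {
    "self-rated knowledge vs. exam grades": -0.01,
    "self-rated vs. judged attractiveness": 0.22,
    "self-rated vs. expert-rated managerial skill": 0.32,
    "self-estimated vs. measured IQ": 0.30,
}


def shared_variance(r):
    """Proportion of variance shared by two variables correlated at r."""
    return r * r


if __name__ == "__main__":
    for label, r in REPORTED.items():
        print(f"{label}: r = {r:+.2f}, shared variance = {shared_variance(r):.1%}")
```

On this reading, a correlation near 0.30 leaves roughly nine-tenths of the variance in measured scores unaccounted for by self-estimates, which fits the conclusion that they are weak proxies.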

The work of Dweck and colleagues is particularly salient here (Dweck & Bempechat, 1983; Mueller & Dweck, 1998). Dweck argued that lay persons tend either to believe that intelligence is malleable (incremental lay theorists) or else that it is fixed (entity lay theorists). These beliefs logically relate to goal setting, motivation, and attribution for success and failure. Further, they relate to one's perceptions of others. While these beliefs are not necessarily related to actual ability, they can have powerful, even paradoxical, behavioral impacts. Thus Mueller and Dweck pointed out that if a young person believes him/herself to be intelligent, entity theory predicts that they are likely to be less motivated to work hard to achieve their goals.

The interaction between beliefs about one's level of intelligence on the one hand and the changeability of intelligence on the other is potentially very important. Thus entity lay theorists who believe themselves to be below average on overall or specific intelligences may shun intellectual tasks, set themselves low goals, and become depressed and ineffectual. In contrast, entity lay theorists who rate themselves way above average may be complacent and lazy, believing their "natural wit" sufficient to let them succeed at most academic and work assignments. However, lay incrementalists who are low self-raters may, if so moved, be prepared to undertake tasks or training that they believe will increase their intelligence. High self-rating incrementalists, in contrast, may also believe that they need to work hard to maintain their levels of intelligence.

Conclusion

The measurement of intelligence has always been a controversial topic. So, too, have the topics of lay people's beliefs about their own intelligence and the extent to which it can be changed. Research relating to these topics has yielded evidence of a "self-Pygmalion effect" in that beliefs about the nature of intelligence and one's intelligence level can profoundly influence not only how one performs on a test, but one's personal motivations in academic and work settings throughout life (Spitz, 1999).

References

Beloff, H. (1992). Mother, father and me: Our IQ. The Psychologist, 5, 309-311.

Bennett, M. (1996). Men's and women's self-estimates of intelligence. Journal of Social Psychology, 136, 411-412.

Bennett, M. (1997). Self-estimates of ability in men and women. Journal of Social Psychology, 137, 540-541.

Bennett, M. (2000). Gender differences in the self-estimation of ability. Australian Journal of Psychology, 52, 23-28.


Beyer, S. (1990). Gender differences in the accuracy of self-evaluation of performance. Journal of Personality and Social Psychology, 59, 960-970.

Beyer, S. (1998). Gender differences in self-perception and negative recall bias. Sex Roles, 38, 103-133.

Beyer, S. (1999). Gender differences in the accuracy of grade expectations and evaluations. Sex Roles, 41, 279-296.

Borkenau, P., & Liebler, A. (1993). Convergence of stranger ratings of personality and intelligence with self-ratings, partner-ratings, and measured intelligence. Journal of Personality and Social Psychology, 65, 546-553.

Byrd, M., & Stacey, B. (1993). Bias in IQ perception. The Psychologist, 6, 16.

Dweck, C., & Bempechat, J. (1983). Children's theories of intelligence: Consequences for learning. In S. Paris, G. Olson, and H. Stevenson (Eds.), Learning and motivation in the classroom. Hillsdale, NJ: Erlbaum.

Flynn, J. (1986). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171-191.

Furnham, A. (2001). Self-estimates of intelligence: Culture and gender differences in self and other estimates of general (g) and multiple intelligences. Personality and Individual Differences, 31, 1381-1405.

Furnham, A., Clark, K., & Bailey, K. (1999). Sex differences in estimates of multiple intelligences. European Journal of Personality, 13, 247-259.

Furnham, A., Fong, G., & Martin, N. (1999). Sex and cross-cultural differences in the estimated multi-faceted intelligence quotient score for self, parents, and siblings. Personality and Individual Differences, 26, 1025-1034.

Furnham, A., Rakow, T., Samany-Schuller, I., & De Fruyt, F. (1999). European differences in self-perceived multiple intelligences. European Psychologist, 4, 131-138.

Furnham, A., & Rawles, R. (1995). Sex differences in the estimation of intelligence. Journal of Behavior and Personality, 10, 741-748.

Furnham, A., & Rawles, R. (1999). Correlations between self-estimated and psychometrically measured IQ. Journal of Social Psychology, 139, 405-410.

Furnham, A., Shahidi, S., & Baluch, B. (2002 in press). Sex and culture differences in self-perceived and family estimated multiple intelligence: A British-Iranian comparison. Journal of Cross-Cultural Psychology.

Furnham, A., & Ward, C. (2001 in press). Sex differences, test experience and the self-estimation of multiple intelligence. New Zealand Journal of Psychology.

Gardner, H. (1983). Frames of mind: A theory of multiple intelligences. New York: Basic Books.

Gardner, H. (1999). Intelligence reframed. New York: Basic Books.

Hogan, H. (1978). IQ self-estimates of males and females. Journal of Social Psychology, 106, 137-138.

Mueller, C., & Dweck, C. (1998). Praise for intelligence can undermine children's motivation and performance. Journal of Personality and Social Psychology, 75, 33-52.

Paulus, D., Lysy, D., & Yik, M. (1998). Self-report measures of intelligence: Are they useful as proxy IQ tests? Journal of Personality, 66, 523-555.

Reilly, J., & Mulhern, G. (1995). Gender difference in self-estimated IQ: The need for care in interpreting group data. Personality and Individual Differences, 18, 189-192.

Spitz, H. (1999). Beleaguered Pygmalion: A history of the controversy over claims that teacher expectancy raises intelligence. Intelligence, 27, 199-234.

Dr. Adrian Furnham is Professor of Psychology at London University (UCL) where he has taught for 20 years. He holds doctoral degrees from both Oxford University and UCL. He is the author of 36 books and 450 peer-reviewed papers. Dr. Furnham is currently working on a book about managerial incompetence.

Every tool carries with it the spirit by which it has been created.

Werner Karl Heisenberg

HaPI Advisory Board

Aaron T. Beck, MD
University of Pennsylvania School of Medicine

Timothy C. Brock, PhD
Ohio State University, Psychology

William C. Byham, PhD
Development Dimensions International

Donald Egolf, PhD
University of Pittsburgh, Communication

Sandra J. Frawley, PhD
Yale University School of Medicine, Medical Informatics

David F. Gillespie, PhD
Washington University, George Warren Brown School of Social Work

Robert C. Like, MD, MS
University of Medicine and Dentistry of New Jersey, Robert Wood Johnson Medical School

Joseph D. Matarazzo, PhD
Oregon Health Sciences University

Vickie M. Mays, PhD
University of California at Los Angeles, Psychology

Michael S. Pallak, PhD
Behavioral Health Foundation

Kay Pool, President
Pool, Heller & Milne, Inc.

Ora Lea Strickland, PhD
Emory University Woodruff School of Nursing

Gerald Zaltman, PhD
Harvard University Graduate School of Business Administration

Stephen J. Zyzanski, PhD
Case Western Reserve University School of Medicine

Ten Commandments for Selecting Self-Report Instruments

Fred B. Bryant

Selecting appropriate measurement instruments is among the tasks researchers most frequently face. Yet, surprisingly little has been written about how best to go about the process of instrument selection. Given the prevalence of self-report methods of measurement in the social sciences, the task of selecting an instrument most often involves choosing from among a set of seemingly relevant questionnaires, surveys, inventories, checklists, and scales.

For example, a researcher who wishes to measure depression in college students might locate dozens of potentially useful self-report instruments designed to assess this construct. Indeed, the September 2001 release of the Health and Psychosocial Instruments (HaPI) database contains 105 primary records of self-report instruments with the term "depression" in the title, excluding measures developed for use with children or the elderly, those written in foreign languages, and those assessing attitudes toward, knowledge of, or reactions to depression rather than depression per se. The seemingly appropriate instruments thus identified include the Beck Depression Inventory (BDI; Beck, 1987), Hamilton Rating Scale for Depression (HAM-D; Hamilton, 1960), Center for Epidemiologic Studies Depression Scale (CES-D; National Institute of Mental Health, 1977), and Self-Rating Depression Scale (SDS; Zung, 1965), to name just a few. How should the researcher decide which to use?

One strategy for selecting instruments is to employ only those most commonly used in published studies. Not only is this strategy simple and straightforward, but some researchers follow it in the hope of increasing the likelihood that their research will be published. However, it limits conceptual definitions to those created within the theoretical frameworks of commonly used instruments. Over time, this effectively constricts the generalizability of research on these constructs. Further, all measurement instruments tap irrelevancies that have nothing to do with the constructs they are intended to assess. Using only a single measure of a particular research construct makes it impossible to know the degree to which the irrelevancies in the measure affect the obtained results. Moreover, diversity in operationalization helps us better understand not only what we're measuring, but also what we're not measuring. Thus, in the long run, employing only the most commonly used instruments limits and weakens the body of scientific knowledge.

Although it is difficult to devise a universally applicable set of rules for selecting measurement instruments, it is possible to suggest some general guidelines that researchers can use in choosing appropriate self-report measures. What follows, then, is a set of precepts and principles for selecting instruments for research purposes, along with concrete examples illustrating each. Note that the order of presentation is not intended to reflect the guidelines' relative importance in the process of instrument selection. In fact, each principle is essential in selecting the right measurement tool for the job.

1. Before choosing an instrument, define the construct you want to measure as precisely as feasible. Ideally, researchers should not rely merely on a label or descriptive term to represent the construct they wish to assess, but should instead define the construct of interest clearly and precisely in theory-relevant terms at the outset. Being unable to specify beforehand what it is you want to measure makes it hard to know whether or not a particular instrument is an appropriate measure of the target construct (Bryant, 2000). Potentially useful strategies for defining research constructs are to draw on published literature reviews that synthesize available theories concerning a particular construct, or to review the published literature on one's own in search of alternative theoretical definitions. If you can find alternative conceptual definitions of the target construct, then you can choose from among them a particular conception that resonates with your own thinking. The process of explicitly conceptualizing the construct that you wish to measure is known as "preoperational explication" (Cook & Campbell, 1979).

Imagine, for example, that a clinical researcher wants to use a self-report measure of shyness. A wide variety of potentially relevant measures can be found in the Health and Psychosocial Instruments database, depending on how the researcher conceptually defines shyness. Is shyness (for which there are 30 primary records in the database) synonymous with introversion (30 primary records), timidity (17 primary records), emotional insecurity (2 primary records), social anxiety (63 primary records), social fear (1 primary record), social phobia (29 primary records), social avoidance (4 primary records), or social isolation (76 primary records)? The clearer and more precise the initial conceptual definition, the easier it will be to find appropriate measurement tools. An added benefit of precisely specifying target constructs at the outset is that it helps to focus the research.

Although precise preoperational explication is the ideal when selecting measures, in actual practice it is often difficult to specify a clear conceptual definition of the target construct beforehand. Many times the published literature does not provide explicit alternatives, and this forces researchers to explicate constructs on their own -- the equivalent of trying to define an unknown word without having a dictionary. Researchers therefore often begin by selecting a particular instrument that appears useful, thus adopting by default the conceptual definition of the target construct that underlies the chosen instrument. In effect, an available instrument often dictates one's conceptual definition post hoc.

2. Choose the instrument that is designed to tap an underlying construct whose conceptual definition most closely matches your own. Carefully consider the theoretical framework on which the originators based their instrument. Select an instrument that stems from a theory that defines the construct the same way you do, or at least in a way that does not contradict your conceptual orientation, for the theoretical underpinnings of the instrument should be compatible with your own conceptual framework.

Consider a sociologist and a psychologist, each of whom wants to measure guilt. The most appropriate self-report instrument in each case will be the one whose underlying conceptual definition most closely corresponds to that of the researcher. The sociologist, on the one hand, might be studying people's reactions to homeless adults from a macro-level, sociocultural perspective. If so, then she might begin by defining guilt to be a prosocial emotion experienced when one perceives oneself as being better off than another person who is disadvantaged. Consistent with this conceptualization is Montada and Schneider's (1989) three-item measure of "existential guilt," conceived as prosocial feelings in reaction to the socially disadvantaged. The psychologist, on the other hand, might be studying personality from a micro-level, individual perspective. If so, then she might begin by defining guilt to be a dispositional feeling of regret or culpability in reaction to perceived personal or moral transgressions. Consistent with this conceptual definition is Kugler and Jones' (1992) 98-item Guilt Inventory, which includes a separate subscale specifically tapping personal guilt as an unpleasant emotional trait. Clearly, instruments must have underlying conceptual definitions that match your own conceptual framework (Brockway & Bryant, 1998).

3. Never decide to use a particular instrument based solely on its title. Just because the name of an instrument includes the target construct does not guarantee that it either defines this construct the same way you do or even measures the construct at all. Don't let the title lead you to select an inappropriate instrument.

As a case in point, consider a developmental psychologist who wants to measure parents' psychological involvement in their families. Based on its promising title, the researcher decides to use the Parent/Family Involvement Index (PFII; Cone, DeLawyer, & Wolfe, 1985). After obtaining the instrument and inspecting its constituent items, the researcher realizes to his chagrin that the PFII requires a knowledgeable informant (e.g., a teacher or teacher's aide) to indicate whether or not the particular parent of a school-aged child has engaged in each of 63 specific educational activities. Based on an underlying conceptualization of family involvement in psychological terms, a more appropriate instrument would be a measure developed by Yogev and Brett (1985) that assesses parental involvement in terms of the degree to which parents identify psychologically with and are committed to family roles. This example shows clearly that you can't judge an instrument by its title any more than you can judge a book by its cover (Brockway & Bryant, 1997).

4. Choose an instrument for which there is evidence of reliability and validity. Reliability is measurement that is accurate and consistent. Good reliability in measurement strengthens observed statistical relationships -- the more reliable the instrument, the smaller will be the error in measurement, and the closer observed results will be to actual results. For example, imagine a medical researcher who wants to determine whether an experimental antipyretic agent reduces fever more rapidly than available antipyretics, but who is using an unreliable thermometer that gives different readings over time, even when body temperature is stable. These inconsistencies in measurement make it nearly impossible to assess temperature accurately, and greatly decrease the likelihood of finding any experimental effects.

Data supporting the validity of an instrument increase one's confidence that it really measures what it is designed to measure. For example, the medical researcher referred to above can be more confident that the thermometer actually measures temperature if its readings correlate highly with those of infrared-telemetry or body-scanning devices. Although instrument developers sometimes report reliability and validity data, such empirical evidence is often available only in published studies that have used the given measure. As a rule, avoid judging the validity of an instrument by the content of its constituent items. What an instrument appears to measure "on its face" (i.e., face validity) is not necessarily what it actually measures. As in the case of an instrument's title, what you see is not necessarily what you get.

Judging the quality of research evidence concerning measurement reliability and validity can be difficult and confusing. There are various types of reliability (e.g., internal consistency, split-half, parallel-forms, interrater, test-retest) and of reliability coefficients (e.g., Cronbach's alpha, coefficient kappa, intraclass correlation, KR-20). Similarly, there are numerous types of validity (e.g., construct, concurrent, criterion-referenced, convergent, discriminant). Thus a host of specialized statistical tools has been developed to quantify both reliability (Strube, 2000) and validity (Bryant, 2000). Given the numerous types of reliability and reliability coefficients, validity, and tools used to assess reliability and validity, instrument selection requires at least a basic understanding of psychometrics and of principles of scale validation in order to make informed judgments of instrument quality. When there is no evidence concerning an instrument's reliability or validity, measurement becomes a "shot in the dark."
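To make one of the reliability coefficients named above more concrete, the following is a minimal sketch of how Cronbach's alpha, an index of internal consistency, might be computed from raw item responses. The function name, the data layout (one row per respondent), and the sample responses are all illustrative assumptions rather than part of any published instrument; a researcher would normally rely on an established statistics package.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Return Cronbach's alpha for item-level responses.

    `scores` is a list of rows, one per respondent; each row holds that
    respondent's score on every item (hypothetical layout for this sketch).
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses from five people to a hypothetical four-item scale.
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(responses), 2))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as evidence that the items tap a common construct. It says nothing by itself about validity.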

5. Given a choice among alternatives, select an instrument whose breadth of coverage matches your own conceptual scope. If you define your target construct as having a wide range of requisite constituent components, then choose an instrument whose items tap a broad spectrum of content relating to those components. On the other hand, if you define your target construct in a way that specifies a narrower set of conceptual components, then choose an instrument that has a more restrictive and specific content.

Breadth of coverage varies widely across available instruments. For example, to measure coronary-prone Type A behavior, alternatives include: the Student Jenkins Activity Survey (Glass, 1977) that taps the Type A components of hard-driving competitiveness and speed-impatience; the Type A Self-Rating Inventory (Blumenthal, Herman, O'Toole, Haney, Williams, & Barefoot, 1985) that taps hard-drivingness and extraversion; the Type A/Type B Questionnaire (Eysenck & Fulker, 1983) that taps tension, ambition, and activity; the Time Urgency and Perpetual Activation Scale (Wright, 1988) that taps activity and time urgency; and the Self-Assessment Questionnaire (Thoresen, Friedman, Powell, Gill, & Ulmer, 1985) that taps hostility, time urgency, and competitiveness. Clearly, your choice of instrument depends on the specific components of Type A behavior that you want to investigate.

The choice between general versus specific measures of a given construct may also depend on your particular research question. Consider the process of coping, for example. Numerous instruments exist for measuring people's general style of coping in response to stress. However, if you want to study coping in relation to a specific problem or stressor, a host of other measures have been developed to assess individual differences in coping with such specific concerns as arthritis, asthma, cancer, chronic pain, diabetes, heart disease, hypertension, stroke, multiple sclerosis, spinal cord injury, HIV/AIDS, rape, sexual abuse, sexual harassment, pregnancy, miscarriage, childbirth, economic pressure, job stress, unemployment, depression, bereavement, natural disasters, prison confinement, test anxiety, and war trauma, to name just a few. Compared to a broad-band measure of general coping, a narrow-band measure of coping that is specific to a particular stressor would be expected to show stronger relationships with reactions to that specific stressor.

6. Select an instrument that provides a level of measurement appropriate to your research goals. Some instruments are based on a theoretical model in which the underlying construct is assumed to be "unitary." Such instruments provide only a general, global "total score" that summarizes the overall level of responses. Other instruments are based on a theoretical framework in which the latent construct is considered to be "multidimensional." Such instruments provide multiple "subscale scores," each of which taps a separate dimension of the underlying construct. Thus, if you want to gather global summary information about a target construct, then use a unitary instrument. If you want to gather information about multiple facets of a target construct, then use a multidimensional instrument.

Imagine two nursing researchers, each of whom wishes to measure patients' life satisfaction. One seeks a global summary of patients' overall life satisfaction, whereas the other wants to compare levels of satisfaction across important aspects of patients' lives. The first researcher could use the five-item Satisfaction with Life Scale (Diener, Emmons, Larsen, & Griffin, 1985) to obtain a global total score. The second researcher could use the 66-item Quality of Life Index (Ferrans & Powell, 1985) to obtain individual scores for four separate satisfaction subscales: Health/Functioning, Socioeconomic, Psychological/Spiritual, and Family.

7. Choose an instrument with a time frame and response format that meet your needs. Don't use a "trait" measure (i.e., an instrument that defines the underlying construct as a stable disposition) to assess a transient, situational "state" that you expect to change over time. And don't modify the labels on a response scale (e.g., from "rarely" to "never") or the time frame of measurement (e.g., from "in general" or "on average" to "during the past day" or "in the past hour") unless you have no other recourse. Substantial changes in an instrument's response format or time frame can compromise its construct validity and therefore require revalidation.

In measuring hostility, for example, the choice of an appropriate instrument would depend on whether you conceive of hostility as a transitory, variable state or a stable, dispositional trait. An appropriate tool for measuring "state" hostility might be the 35-item Current Mood Scale (Anderson, Deuser, & DeNeve, 1995), which is designed to assess situational hostility, whereas an appropriate tool for measuring "trait" hostility might be the 50-item Cook-Medley Hostility Scale (Cook & Medley, 1954), which is based on the Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1989) and conceptualizes hostility as a personality trait. Clearly, when selecting instruments you need to distinguish carefully between states and traits.

8. Match the reading level required to understand the items in the instruments you select to the age of the intended respondents. In studying children or adolescents, for example, avoid using an instrument designed for use with adults. When in doubt, use word-processing or linguistic software to determine the reading ability level required to understand an instrument's constituent items.

Imagine a researcher who wants to measure depression in children. Depending on the average age of the subjects, the researcher could choose from a variety of different self-report instruments specifically designed to assess depression in children of various ages, including those with a first-grade reading level (Children's Depression Inventory; Kovacs, 1985), children age 7-13 (Negative Affect Self-Statement Questionnaire; Ronan, Kendall, & Rowe, 1994), children age 8-12 (Childhood Depression Assessment Tool; Brady, Nelms, Albright, & Murphy, 1984), or children age 8-13 (Depression Self-Rating Scale; Asarnow, Carlson, & Guthrie, 1987). In any case, the researcher studying childhood depression should avoid adopting an instrument designed to tap depression in adults.
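As an illustration of the kind of check that linguistic software performs, the sketch below estimates the Flesch-Kincaid grade level of an item's wording. The readability formula is a standard published one, but the syllable counter here is a crude vowel-group heuristic assumed only for this example, and the two sample items are invented; dedicated readability tools will give more trustworthy figures.

```python
import re


def count_syllables(word):
    """Roughly estimate syllables by counting groups of vowels (heuristic)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" as non-syllabic, except in "-le" endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """Return the approximate U.S. school grade needed to read `text`."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Two invented items that aim at the same feeling with different wording.
child_item = "I feel sad. I cry a lot."
adult_item = "I frequently experience overwhelming feelings of hopelessness."
print(round(flesch_kincaid_grade(child_item), 1))
print(round(flesch_kincaid_grade(adult_item), 1))
```

Short sentences built from short words yield a low (even negative) grade estimate, whereas long, polysyllabic wording pushes the estimate well beyond what young respondents can be expected to read.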

9. Never use an instrument unless you know how you'll score it and how you'll analyze it. This rule may seem self-evident, but it is sometimes violated unintentionally. No matter how interesting or important an instrument seems, it is useless unless you can convert responses to it into meaningful data. Sometimes the scoring rules are difficult to obtain or hard to follow, particularly when an instrument consists of multiple composite subscales, reverse-scored items, or item-specific scoring weights. This suggests that researchers should first make sure they know how to score an instrument before they administer it.

Consider the SF-12 (Ware, Kosinski, & Keller, 1996), a 12-item self-report instrument designed to measure functional health status. At first glance, it might appear simple enough to score this instrument by simply summing or averaging responses to its constituent items. But the test manual for the SF-12 (Ware, 1993) specifies a complex set of mathematical computations designed to weight and combine the 12 item scores to produce composite scores reflecting mental, physical, and total functioning. Users cannot score the SF-12 correctly unless they have access to the detailed scoring instructions contained in the test manual. Clearly, then, administering an instrument is one thing, but scoring it can be an entirely different matter.
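The sketch below shows, for a purely hypothetical four-item questionnaire, why scoring rules matter even in simple cases: two items are reverse-scored and the items feed two separate subscales. The item names, the 1-to-5 response range, and the subscale layout are assumptions invented for illustration; this is not the SF-12 algorithm, which requires the weights given in its test manual.

```python
def score_instrument(responses, reverse_items, subscales,
                     scale_min=1, scale_max=5):
    """Return a dict of subscale sums for one respondent.

    responses     -- dict mapping item name to the raw response
    reverse_items -- set of item names that must be reverse-scored
    subscales     -- dict mapping subscale name to a list of item names
    """
    scored = {}
    for item, value in responses.items():
        if not scale_min <= value <= scale_max:
            raise ValueError(f"response to {item} is out of range: {value}")
        # Reverse-scoring flips the scale: on a 1-5 scale, 1 becomes 5.
        if item in reverse_items:
            value = scale_min + scale_max - value
        scored[item] = value
    return {name: sum(scored[item] for item in items)
            for name, items in subscales.items()}


# Hypothetical instrument: q2 and q4 are negatively worded items.
raw = {"q1": 5, "q2": 1, "q3": 4, "q4": 2}
result = score_instrument(
    raw,
    reverse_items={"q2", "q4"},
    subscales={"subscale_a": ["q1", "q2"], "subscale_b": ["q3", "q4"]},
)
print(result)
```

Summing the raw responses without reversing the negatively worded items would understate both subscale scores here, which is the kind of silent error the ninth guideline is meant to prevent.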

10. Rather than choosing only one measure, when feasible use multiple measures of the construct you wish to assess. A central tenet of classical measurement theory is that any single way of measuring a construct has unavoidable idiosyncrasies that are unique to the measure and have nothing to do with the underlying conceptual variable. By studying what multiple measures of the same construct have in common, researchers can converge or triangulate on the referent construct. Using multiple measures also allows an assessment of the generalizability of results across alternative operational or conceptual definitions, to probe the generality versus specificity of effects. And in the long run, using multiple instruments will advance our understanding of the targeted construct much further than simply using a single instrument.

Even when following the ten guidelines for instrument selection discussed above, you will still sometimes face difficult, highly subjective decisions in choosing appropriate measures. For example, which mood measure is more appropriate: one that uses a seven-point response scale, or one that uses a four-point response scale; one that includes a specific label for each individual point on its response scale, or one that has labels only at its endpoints; one that assesses the frequency with which respondents experience certain feelings, or one that taps the percentage of time respondents experience certain feelings? Given the subjectivity of such decisions, it makes all the more sense to use multiple measures whenever possible so as to evaluate the generalizability of results across alternative operational definitions of the same underlying construct.

To recap, I have suggested "Ten Commandments" for selecting self-report instruments:

1. Before choosing an instrument, define the construct you want to measure as precisely as feasible.

2. Choose the instrument that is designed to tap an underlying construct whose conceptual definition most closely matches your own.

3. Never decide to use a particular instrument based solely on its title.

4. Choose an instrument for which there is evidence of reliability and validity.

5. Given a choice among alternatives, select an instrument whose breadth of coverage matches your own conceptual scope.

6. Select an instrument that provides a level of measurement appropriate to your research goals.

7. Choose an instrument with a time frame and response format that meet your needs.

8. Match the reading level required to understand the items in the instruments you select to the age of the intended respondents.

9. Never use an instrument unless you know how you'll score it and how you'll analyze it.

10. Rather than choosing only one measure, when feasible use multiple measures of the construct you wish to assess.

Following these guidelines will help you to select instruments wisely.

Addendum: Obtaining Instruments Once Identified

Adhering to these ten guiding principles can help you identify appropriate measurement instruments, but you then need to obtain both the instruments and permission to use them. Indeed, physical availability may ultimately dictate instrument choice. An instrument may be unavailable for a variety of reasons, including copyright restrictions, being out of print, or death of the primary author. Given such obstacles, rather than trying to contact the original developer to obtain copies of an instrument, it may be best to contact BMDS, Behavioral Measurement Database Services, creator of the HaPI database, to secure permission to use an instrument, and to obtain a hardcopy of it along with any scoring instructions. For a reasonable fee, BMDS will perform these services.

References

Anderson, C.A., Deuser, W.E., & DeNeve, K.M. (1995). Hot temperatures, hostile affect, hostile cognition, and arousal: Tests of a general model of affective aggression. Personality and Social Psychology Bulletin, 21, 434-448.

Asarnow, J.R., Carlson, G.A., & Guthrie, D. (1987). Coping strategies, self-perceptions, hopelessness, and perceived family environments in depressed and suicidal children. Journal of Consulting and Clinical Psychology, 55, 361-366.

Beck, A.T. (1987). Beck Depression Inventory (BDI). San Antonio, TX: Psychological Corporation.

Blumenthal, J.A., Herman, S., O'Toole, L.C., Haney, T.L., Williams, R.B., Jr., & Barefoot, J.C. (1985). Development of a brief self-report measure of the Type A (coronary prone) behavior pattern. Journal of Psychosomatic Research, 29, 265-274.

Brady, M.A., Nelms, B.C., Albright, A.V., & Murphy, C.M. (1984). Childhood depression: Development of a screening tool. Pediatric Nursing, 10, 222-225, 227.

Brockway, J.H., & Bryant, F.B. (1997). Teaching the process of instrument selection in family research. Family Science Review, 10, 182-194.

Brockway, J.H., & Bryant, F.B. (1998). You can't judge a measure by its label: Teaching students how to locate, evaluate, and select appropriate instruments. Teaching of Psychology, 25, 121-123.

Bryant, F.B. (2000). Assessing the validity of measurement. In L.G. Grimm & P.R. Yarnold (Eds.), Reading and understanding more multivariate statistics (pp. 99-146). Washington, DC: American Psychological Association.

Cook, T.D., & Campbell, D.T. (1979). Quasi-experimentation: Design & analysis issues for field settings. Chicago: Rand McNally.

Cook, W.W., & Medley, D.M. (1954). Proposed hostility and pharisaic-virtue scales for the MMPI. Journal of Applied Psychology, 38, 414-419.

Cone, J.D., DeLawyer, D.D., & Wolfe, V.V. (1985). Assessing parent participation: The Parent/Family Involvement Index. Exceptional Children, 51, 417-424.

Diener, E., Emmons, R.A., Larsen, R.J., & Griffin, S. (1985). The Satisfaction with Life Scale. Journal of Personality Assessment, 49, 71-75.

Eysenck, H., & Fulker, D. (1983). The components of Type A behavior and its genetic determinants. Personality and Individual Differences, 4, 499-505.

Ferrans, C.E., & Powell, M.J. (1985). Quality of Life Index: Development and psychometric properties. Advances in Nursing Sciences, 8, 15-24.

Glass, D.C. (1977). Behavior patterns, stress, and coronary disease. Hillsdale, NJ: Lawrence Erlbaum.

Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56-62.

Hathaway, S.R., & McKinley, J.C. (1989). Minnesota Multiphasic Personality Inventory (MMPI). Minneapolis, MN: NCS Assessments.

Kovacs, M. (1985). The Children's Depression Inventory (CDI). Psychopharmacology Bulletin, 21, 995-998.

Kugler, K., & Jones, W.H. (1992). On conceptualizing and assessing guilt. Journal of Personality and Social Psychology, 62, 318-327.

Montada, L., & Schneider, A. (1989). Justice and emotional reactions to the disadvantaged. Social Justice Research, 3, 313-344.

National Institute of Mental Health (1977). Center for Epidemiologic Studies Depression Scale (CES-D). Rockville, MD: Author.

Ronan, K.R., Kendall, P.C., & Rowe, M. (1994). Negative affectivity in children: Development and validation of a self-statement questionnaire. Cognitive Therapy and Research, 18, 509-528.

Strube, M.I. (2001). Reliability and generalizabilitytheory. In L.G. Grimm & P.R. Yamold (Eds.), Reading andunderstanding more multivariate statistics (pp. 23-66).Washington, DC: American Psychological Association.

Thoresen, C.E., Friedman, M., Powell, L.H., Gill, J.J., &Ulmer, D. (1985). Altering the Type A behavior pattern inpostinfarction patients. Journal of CardiopulmonaryRehabilitation, 5, 258-266.

Ware, J.E., Jr. (1993). SF-12 Health Survey (SF-12).Boston, MA: Medical Outcomes Trust.


Ware, J., Jr., Kosinski, M., & Keller, S.D. (1996). A 12-Item Short-Form Health Survey: Construction of scales and preliminary tests of reliability and validity. Medical Care, 34, 220-233.

Wright, L. (1988). The Type A behavior pattern and coronary artery disease: Quest for the active ingredients and the elusive mechanism. American Psychologist, 43, 2-14.

Yogev, S., & Brett, J. (1985). Patterns of work and family involvement among single- and dual-earner couples. Journal of Applied Psychology, 70, 754-768.

Zung, W.W.K. (1965). A Self-Rating Depression Scale. Archives of General Psychiatry, 12, 371-379.

Fred Bryant is Professor of Psychology at Loyola University, Chicago. He has roughly 90 professional publications in the areas of social psychology, personality psychology, measurement, and behavioral medicine. In addition, he has coedited 6 books, including Methodological Issues in Applied Social Psychology (1993; New York: Plenum). Dr. Bryant has extensive consulting experience in a wide variety of applied settings, including work as a research consultant for numerous marketing firms, medical schools, and public school systems; a methodological expert for the U.S. Government Accounting Office; and an expert witness in several federal court cases involving social science research evidence. He is currently on the Editorial Board of the journal Basic and Applied Social Psychology. His current research interests include happiness, the measurement of cognition and emotion, and structural equation modeling.

Remember me, I'm HaPI at BMDS!


Health and Psychosocial Instruments


HaPI Thoughts

Search for measurement instruments in the Health and Psychosocial Instruments (HaPI) database with over 105,000 records of measurement instruments online or on CD-ROM!

Produced by BMDS • Behavioral Measurement Database Services

SUBSCRIBE TO HaPI! Call Ovid Technologies at 800-950-2035 ext 6472 for pricing and to order today!

Obtain copies of measurement instruments from BMDS Instrument Delivery!

Call BMDS at 412-687-6850, fax 412-687-5213, or e-mail [email protected] ($20/$30 processing/handling fee)


Good News!

Ovid Technologies now comprises both the Ovid and SilverPlatter platforms. Health and Psychosocial Instruments (HaPI) continues on the Ovid platform and can be found in the categories of Medicine & Allied Health and Social Sciences.

Measurement Assistance?

Yes indeed! It's just a phone call away, and with a smile you get searching tips and suggestions for quick retrieval of the instruments you need. Call Evelyn Perloff, PhD, for Measurement Assistance from BMDS at 412-687-6850.



In This Issue:

• Introduction to This Issue, by Al K. DeRoy (p. 1)

• Refinements to the Lubben Social Network Scale: The LSNS-R, by James Lubben, Melanie Gironda, and Alex Lee (p. 2)

• Self-Estimated Intelligence, by Adrian Furnham (p. 12)

• Ten Commandments for Selecting Self-Report Instruments, by Fred B. Bryant (p. 18)

• HaPI Thoughts (p. 25)
