
British Journal of Educational Psychology (2001), 71, 93–113. Printed in Great Britain. © 2001 The British Psychological Society.

The predictive and discriminant validity of the zone of proximal development

Joost Meijer*
SCO-Kohnstamm Institution for Educational Research, University of Amsterdam,

The Netherlands

Jan J. Elshout
Department of Psychology, University of Amsterdam, The Netherlands

Background. Dynamic measurement procedures are supposed to uncover the zone of proximal development and to increase predictive validity in comparison to conventional, static measurement procedures.

Aims. Two alternative explanations for the discrepancies between static and dynamic measurements were investigated. The first focuses on Vygotsky’s learning potential theory; the second considers the role of anxiety tendency during test taking. If test anxious tendencies are mitigated by dynamic testing procedures, in particular by the availability of assistance, the concept of the zone of proximal development may be superfluous in explaining the differences between the outcomes of static and dynamic measurement.

Sample. Participants were students in secondary education in the Netherlands. They were tested repeatedly in grade three as well as in grade four. Participants were between 14 and 17 years old; their average age was 15.4 years, with a standard deviation of .52.

Method. Two types of mathematics tests were used in a longitudinal experiment. The first type of test consisted of open-ended items, which participants had to solve completely on their own. With the second type of test, assistance was available to participants during the test. The latter, so-called learning test was conceived of as a dynamic testing procedure. Furthermore, a test anxiety questionnaire was administered repeatedly. Structural equation modelling was used to analyse the data.

Results. Apart from emotionality and worry, lack of self-confidence appears to be an important constituent of test anxiety. The learning test appears to contribute to the predictive validity of conventional tests, and thus part of Vygotsky’s claims were substantiated. Moreover, the mere inclusion of a test anxiety factor in an explanatory model for the gathered data is not sufficient. Apart from test anxiety and mathematical ability, it is necessary to assume a factor which may be construed as mathematics learning potential.

Conclusion. The results indicate that the observed differences between a conventional, static testing procedure and an experimental, dynamic testing

* Requests for reprints should be addressed to Dr. J. Meijer, SCO Kohnstamm Institute for Educational Research, P.O. Box 94208, 1090 GE, Amsterdam, The Netherlands (e-mail: [email protected]).


procedure for mathematics cannot be explained sufficiently by a differential bias towards test anxiety. The dynamic testing approach renders scores which add to the predictive validity of conventional testing procedures. Since this gain in predictive validity is not a result of the removal of bias towards test anxiety, this result should be understood as supportive of the validity of the concept of the zone of proximal development.

This article addresses the predictive and discriminant validity of the ‘zone of proximal development’, a concept that was put forward by Vygotsky (Vygotsky, Cole, John-Steiner, Scribner, & Souberman, 1978): ‘The zone of proximal development defines those functions that have not matured yet but are in the process of maturation, functions that will mature tomorrow but are currently in an embryonic state. These functions could be termed the ‘‘buds’’ or ‘‘flowers’’ of development rather than the ‘‘fruits’’ of development’ (Vygotsky et al., 1978, pp. 86–87). Vygotsky assumed that under most circumstances all flowers will produce fruit: ‘. . . what a child can do with assistance today it will be able to do by itself tomorrow’. Thus, measures of the breadth of the zone of proximal development should render better prospective indications of subsequent courses of learning than conventional test results. Consequently, the predictive validity of tests that measure it should be greater than the predictive validity of tests that aim solely at the measurement of independent achievement. Campione (1989) provides a taxonomy concerning recent developments in what he refers to as ‘assisted assessment’. This taxonomy is based on two dimensions, i.e., standardised versus clinical procedures and assessment of general versus domain-specific skills, thus rendering four types of assisted assessment. The research reported here falls within one of these types, namely standardised assisted assessment within a particular domain of skills. Assistance was standardised because it was offered by means of a computer program. It was domain-specific in that the tests were confined to mathematics subject matter.

Tests for measuring the breadth of the zone of proximal development, or ‘learning tests’ (Guthke, 1992), can be constructed according to two paradigms. One approach is to train participants on relevant tasks between a pretest and a posttest (Campione, 1989; Resing, 1990). Another approach is to offer assistance to participants during test taking (Ferrara, 1987; De Leeuw, Meijer, Perrenet, & Groen, 1988; De Leeuw & Meijer, 1989; Meijer, 1996). Guthke (1992) calls the former approach long-term learning tests and the latter approach short-term learning tests. The short-term learning test paradigm has two advantages: it is not equivalent to evaluating the effect of an intervention between a pretest and a posttest, and it is less time-consuming. Because traditional tests emphasise independent achievement, whereas the ‘learning potential’ approach also takes achievements after direct teaching intervention into account, the former are often called ‘static’ procedures, whereas the second approach has been named ‘dynamic assessment’ (Schneider-Lidz, 1987). De Leeuw and Meijer (1989) confirmed the above claim concerning the superior predictive power of learning potential tests in comparison to conventional tests. They found that adding


learning potential scores to independent achievement scores in a regression equation predicting future performance resulted in a somewhat higher amount of explained variance of a criterion test. The criterion test was administered approximately nine months after the learning potential test and the conventional test. The learning potential scores were obtained by weighting correct answers according to the amount of help needed. Although the contribution to predictive validity was small, it was statistically significant.

The incremental predictive validity of dynamic assessment procedures could be ascribed to the removal of bias, which may be inherent in the scores on traditional, static tests. The question arises as to the exact nature of the bias. Budoff (1987) believes that fearfulness and low expectations of success, possibly paired with low ability ascriptions, are causes of under-achievement. Guthke (1992) also contends that dynamic testing procedures render results that are less contaminated with personality characteristics than static testing procedures: ‘Learning tests correlate at higher levels with creativity tests, and at lower levels with vulnerability to stress, neuroticism and irritability, than static tests’ (p. 223). Moreover, Guthke emphasises the absence of feedback and help, which are so typical in everyday learning, in traditional, static tests. Campione (1989) asserts that the major drawback of standardised assisted assessment is that it does not provide guidelines for improving instruction. This is beyond doubt. However, in this study, the improvement of instructional procedures is not our direct concern. For experimental purposes, feedback and assistance were incorporated into testing procedures in order to increase the resemblance between learning and testing situations.

Meijer (1993) showed that the gains in predictive validity which De Leeuw and Meijer (1989) had found earlier as a result of using a dynamic assessment procedure for high school mathematics capability were not significant for students low in mathematics anxiety, whereas they were highly significant for students high in mathematics anxiety. This suggests that the superior predictive validity of learning tests in comparison to conventional, static tests might be explained by the fact that learning tests are less biased against highly anxious students than static tests. Therefore, the following questions may be considered. If dynamic testing procedures only remove the bias resulting from anxious tendencies, why should we need the concept of learning potential to explain the discrepancies between the results of static and dynamic testing procedures? Measures resulting from both procedures could then be explained by only two factors, namely true ability and anxiety. On the other hand, if tests of independent achievement do not reflect any differences in the breadth of the zone of proximal development, whereas learning potential tests do, the differences between both types of measurement will certainly not be explained sufficiently by the incorporation of anxiety assessments into an explanatory model. In the study described here, we attempted to establish whether anxious tendencies could explain the differences between the results of assessments of independent achievement and assessments of learning potential in the domain of school mathematics.

In particular, the following hypothesis was investigated: achievements on tests that respectively measure independent mathematics achievement and mathematics learning potential, and estimates of anxiety levels, can be explained by two factors or latent variables, namely ‘true’ mathematical ability and anxiety.


Vygotsky’s theory implies that independent achievement level and the breadth of the zone of proximal development are major determinants of future achievement. Of course, other factors such as subsequent experience and exposure to teaching will also play a role. Since Vygotsky’s theory also assumes that independent achievement and the zone of proximal development are not perfectly correlated, they should be construed as distinct factors or latent variables. The constructs of fear of failure and anxiety play no role in Vygotsky’s theory. His theory mainly concerns cognitive development as mediated by semiotic tools (Karpov & Haywood, 1998) and pays no specific attention to affective factors such as anxiety which may influence test performance. The implication of his theory is that at least three explanatory factors are needed when scores on independent achievement tests, learning potential tests and anxiety questionnaires are involved. Apart from ‘true’ mathematical ability and anxiety, the zone of proximal development should serve as the third explanatory factor. Nevertheless, the model proposed by the first author should be preferred if it can be confirmed, since it is more parsimonious than the model following from Vygotsky’s theory (De Groot, 1981).

Test anxiety: worry and emotionality

Since the main focus of the present study concerns anxiety in educational settings, it was decided to concentrate on test anxiety. Liebert and Morris (1967) introduced a distinction between worry and emotionality as components of test anxiety. Worry is conceived of as the cognitive representation of anxiety, whereas emotionality is the affective representation, among other things accompanied by physiological symptoms. Presumably, the worry component of test anxiety has a much stronger negative effect on test performance than the emotionality component, because it consumes part of the attention resources that are necessary to execute a complex task. Morris, Davis, and Hutchings (1981) reviewed some of the evidence that was available at the time and concluded that: ‘Worry is the anxiety component most strongly related (inversely) to academic performance, whether it be examination scores or course grades’ (p. 543). Spielberger (1966, 1975) had already noted that highly anxious students were more predisposed to academic failure than low anxious students, although he had not yet considered the distinction between emotionality and worry. Sarason (1980) remarks that: ‘Proneness to self-preoccupation and, most specifically, to worry over evaluation is a powerful component of what is referred to as test anxiety’ (p. 5). It is suggested that the worry component of test anxiety focuses on ruminations about the test result, the person’s general abilities, the possibly adverse consequences of failing the test, and so on. It has been demonstrated in some cases (Blankstein, Toner, & Flett, 1989) that highly test anxious participants report more interfering, irrelevant thoughts during test taking than participants low in test anxiety.

Method

A test of the above hypothesis requires the availability of two measures of mathematical ability: 1) scores derived from regular, conventional mathematics tests and 2) scores derived from mathematics learning tests, wherein help can be obtained by testees if necessary. Conventional tests should indicate participants’ independent


mastery of mathematical knowledge and skills, whereas experimental mathematics learning tests should reflect participants’ mathematics learning potential. Furthermore, measures of test anxiety tendency are required.

Instruments

The mathematics tests

The conventional mathematics tests were modelled after an examination with open-ended questions. Participants were required to write down their answers on the test forms, and their answers were scored by two judges according to guidelines which were devised during pilot experiments (see Meijer, 1996). The interrater reliability of sumscores derived with this method turned out to be quite sufficient (r = .98). Participants had no access to any assistance while they took the test. Figure 1 shows an example of an item in one of the conventional mathematics tests.
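As an aside, the interrater reliability reported above is an ordinary Pearson correlation between the two judges' sumscores. A minimal sketch of that computation follows; the judge data in any usage would be the actual sumscores, and the function name is ours:

```python
# Pearson correlation between two judges' sumscores, as used for the
# interrater reliability reported above. Illustrative sketch only.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```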

In contrast, the learning potential tests were not pure paper-and-pencil tests. Although the mathematics problems to be solved were also presented on paper, solutions could be chosen from six alternatives, which were displayed on the screen of a computer. One of the alternatives was ‘I do not know the answer’, which was added to discourage guessing by participants. Participants were encouraged to work out a solution by themselves before requesting to see the alternatives from which they could choose. Help could be obtained in two different ways. On the one hand, a subject could ask for a hint explicitly. On the other hand, choosing a wrong alternative as a solution

Figure 1. An item of the conventional mathematics pretest


to the problem or admitting that one did not know which alternative was correct also resulted in the presentation of a hint.

Hints were multiple-choice questions as well, with only four alternatives, also including the answer ‘I don’t know’. If the subject answered the hint question correctly, positive feedback was provided, accompanied by an incitement to try to solve the mathematics problem again, using the extra information that was obtained by consultation of the hint. If the subject chose an incorrect alternative in response to the hint question or selected the alternative ‘I don’t know’, extra explanation was given. The extra explanation given to participants who chose an incorrect answer to the hint question was based on the assumption that the content of the hint would not be ‘assimilated’ without it, whereas participants who answered the hint question correctly had no need of further explanation. This issue concerns the concurrence of hint content and cognitive process (De Leeuw & Meijer, 1989). Dynamic testing procedures should reflect learning processes as closely as possible (Guthke, 1992). Although it is imperative that hint content and the ongoing cognitive process more or less converge, this can never be ensured in a standardised procedure (Campione, 1989). It is quite impossible to adjust to every individual learning mode while using identical hints for every learner. Nevertheless, standardisation was deemed more important in the present study, in order to preserve the comparability of the effects of the hints on each participant. A more elaborate description of the development of the form of these hints is given by De Leeuw et al. (1988), De Leeuw and Meijer (1989) and Perrenet and Groen (1987). The control structure of the computer program that regulated the presentation of hints and alternatives for the final answer is displayed in Figure 2. Before students began with a mathematics learning test, its features were explained to them quite extensively, using an example item.

Since there were two hints available with each item, participants were allowed to respond three times. If the first response was incorrect, the first hint was presented automatically. A second incorrect response led to automatic presentation of the second hint. If participants had already requested the first hint without having first made a mistake, they were required to study and reply to the second hint after their first incorrect response. If both hints had already been presented, the choice of a false alternative led to compulsory reconsultation of at least one of the hints. If the answer was still incorrect at the third attempt, the correct answer was indicated and participants were forced to go on to the next item.
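The three-attempt rule described above can be sketched as a small control routine. This is an illustrative reconstruction, not the authors' original program: the callback names (`get_response`, `present_hint`, `show_answer`) are invented for the example, and the voluntary-hint-request path is omitted for brevity.

```python
# Illustrative sketch of the three-attempt hint procedure described above.
# get_response, present_hint and show_answer are hypothetical callbacks
# supplied by the testing environment; only the automatic-hint path is shown.

def administer_item(correct_answer, get_response, present_hint, show_answer):
    """Run one learning-test item: at most two hints, at most three attempts.

    Returns (hints_used, solved)."""
    hints_used = 0
    for attempt in range(3):
        if get_response() == correct_answer:
            return hints_used, True
        if attempt < 2:                    # more attempts remain
            if hints_used < 2:             # wrong answer: next unseen hint
                present_hint(hints_used)
                hints_used += 1
            else:                          # both hints seen: compulsory
                present_hint(1)            # reconsultation of a hint
    show_answer(correct_answer)            # third failure: reveal the answer
    return hints_used, False
```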

In this sense, the learning potential test procedures that were applied in this study can be conceived of as a combination of short-term learning tests, as proposed by Guthke (1992), and corrective testing procedures, as put forward by Arkin and Walts (1983) and Arkin and Schumann (1984). Figure 3 contains an example of an item in one of the mathematics learning tests.

A score of three points was assigned if a participant gave a correct answer without requiring hints. Two points were assigned if a participant needed one hint to find the correct solution. One point was assigned if two hints were required, but only if the correct answer was given immediately after consultation of the second hint. This was done in order to minimise the possibility that not the consultation of the second hint, but rather the elimination of previously incorrect answers, might explain the last response of the subject. Although this scoring method appears to be the contrary of


counting the number of hints needed to arrive at a solution, it will merely result in an inverse relation with criterion measures. Actually, Ferrara (1987) used the former method and arrived at negative correlations between her dynamic scores and gains between a pretest and a posttest, whereas the latter method will result in positive correlations with criterion measures (see the Results section).
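The scoring rule above can be summarised in a few lines. A minimal sketch, assuming the rule takes exactly the form described; the flag name for the second-hint condition is ours:

```python
# Sketch of the scoring rule described above: 3 points without hints,
# 2 points after one hint, 1 point after two hints but only when the
# correct answer directly followed the second hint; 0 points otherwise.

def item_score(solved, hints_used, correct_directly_after_second_hint=False):
    if not solved:
        return 0
    if hints_used == 0:
        return 3
    if hints_used == 1:
        return 2
    if hints_used == 2 and correct_directly_after_second_hint:
        return 1
    return 0
```

Summing these item scores yields a learning-test score that, unlike a count of hints needed, relates positively rather than inversely to criterion measures.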

The differences between the two types of mathematics tests concern the response format, i.e., multiple-choice versus essay-like items, and the availability of extra information and feedback on the correctness of the answer given. Obviously, these differences threaten the comparability of the tests. This would be a particularly severe problem if the type of mathematics test administered were a between-subjects variable. However, since all participants were required to take both tests, type of test constitutes a within-subjects variable.

The content of the items in the learning tests and the conventional tests was based on a study of the mathematics curriculum in both phases of the experiment, in order to make sure that the participants were in principle able to solve the mathematics problems in the tests. Although some of the items in the tests were quite difficult due to their unfamiliar context, they could be solved by applying mathematical knowledge and procedures that had been taught. The two test types covered the subject matter that was

Figure 2. Control structure of the computer-administered mathematics learning tests


taught in both grades, e.g., linear and quadratic functions in grade three and asymptotic and exponential functions in grade four. The tests did not contain any extra-curricular material. One might argue that the items in the tests were of a conventional nature and thus not well suited to the dynamic test approach. However, since the main difference between the two test types should concern the availability versus the absence of assistance, novelty of test item content appeared not to be warranted. In this respect,

Figure 3. An example of an item and the first hint in the mathematics learning pretest


this study focuses mainly on the incremental validity of the dynamic testing approach (Meijer, 1996). That is to say, it attempts to demonstrate the added value of administering traditional test content dynamically, rather than changing test content. Thus, test procedure rather than test content is the central issue (Grigorenko & Sternberg, 1998, pp. 92–93).

The test anxiety questionnaire

The distinction between emotionality and worry in test anxiety is considered to be an important development in test anxiety research. In view of this, it was decided to select a questionnaire that pays explicit attention to these factors. Morris et al. (1981) conducted a factor analysis on 47 provisional test anxiety items. The first author translated all 47 items into Dutch and added a few items. For the preliminary version of the test anxiety questionnaire, 33 items were subsequently selected. Every item was designated a priori as an item that measured either worry or emotionality. On the basis of the results of pilot experiments the test anxiety questionnaire was refined further (see Results section).

Design

In view of the claims concerning the prognostic value of learning potential measures, a longitudinal experimental design was deemed desirable. Table 1 displays the design.

In the first phase of the experiment participants were instructed concerning the testing procedures at the beginning of both sessions. Then they filled out the test anxiety questionnaire and proceeded to take a mathematics test, either the conventional pretest or the learning pretest. In order to control for possible effects of the order in which the types of tests were taken, participants who took the conventional pretest first took the learning pretest approximately two to five weeks later, and vice versa. This procedure was repeated in the second phase of the longitudinal experiment, which was conducted approximately six to nine months later. During the first phase of the experiment participants were in their third grade of secondary education. The second phase of the experiment took place in the fourth grade. Mathematics posttests were used that were tuned to the mathematics subject matter which had been treated during the intervening six to nine months. Thus, for both types of mathematics tests, a

Table 1. Design of the longitudinal experiment

First session | Intervening period | Second session | Intervening period | Third session | Intervening period | Fourth session
--------------|--------------------|----------------|--------------------|---------------|--------------------|---------------
TAQ, CMT      | 2 to 5 weeks       | TAQ, MLT       | 6 to 9 months      | TAQ, CMT      | 3 to 6 weeks       | TAQ, MLT
TAQ, MLT      | 2 to 5 weeks       | TAQ, CMT       | 6 to 9 months      | TAQ, MLT      | 3 to 6 weeks       | TAQ, CMT

Note: TAQ = Test Anxiety Questionnaire; CMT = Conventional Mathematics Test; MLT = Mathematics Learning Test. The number of participants with complete data was 302 in the first session, 279 in the second, 199 in the third and 183 in the fourth.


pretest as well as a posttest was administered. The test anxiety questionnaire was administered four times, before each mathematics test.

Participants

The experiment was conducted in seven schools, mostly during lesson hours which were otherwise allotted to regular mathematics education. Participants in the first phase of the longitudinal experiment were in the third grade of secondary education. They were between 14 and 17 years old; their average age was 15.4 years, with a standard deviation of .52. Fifty-one percent of the participants were female, 49% were male. No data were available concerning participants’ levels of intelligence. However, it may be assumed that the sample was relatively homogeneous in this respect, since they were all selected from the type of secondary education in the Netherlands which prepares either for higher vocational training or for university education. A number of these participants were retested about nine months later, after they had transferred to the fourth grade in the same school. Since many participants did not transfer to the next grade within the same school, the initial number of 315 participants was severely reduced. Although complete mathematics test data were gathered for 186 participants in the second phase of the longitudinal experiment, data concerning quite a few participants had to be discarded because they did not fill out the test anxiety questionnaires on every test occasion or were absent during the administration of one or both mathematics pretests. Owing to this cumulative loss of participants during the longitudinal experiment, the number of participants who could be included in the statistical analyses varied between 305 and 158, depending on the data that were analysed. Sample attrition is obvious and may endanger the generalisability of the results; however, it is inherent to longitudinal research, and tracing lost participants is very difficult. Moreover, it would have been senseless to confront participants who did not transfer to the next grade with tests covering subject matter that had not been taught to them.

Results

Pilot experiments (see Meijer, 1996) had shown that test anxiety should not be decomposed only into emotionality and worry, but rather contains a third factor, which can be construed as lack of self-confidence. The eight-item scale measuring this factor contains questions such as: ‘I hardly feel confident about my performance on the mathematics test’. The results of the pilots had also shown that lack of self-confidence was more strongly negatively related to mathematics test performance than worry and emotionality.

Confirmatory factor analysis was applied to assess the tenability of the three-factor model, containing emotionality, worry and lack of self-confidence, in comparison to a two-factor model wherein the self-confidence factor was omitted. The three-factor model was contrasted with the two-factor model using the data derived from all four testing occasions. Two observed measures for each factor were calculated by randomly assigning half of its item scores to either measure. In Table 2, the goodness-of-fit indices of the three-factor model are compared with the indices for the two-factor model across the data derived from the four administrations of the test anxiety questionnaire.
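The parcelling step described above can be sketched as follows. This is a minimal illustration under an assumption the text implies but does not spell out, namely that the random split of items is fixed once and applied identically to all participants; all names are ours:

```python
# Sketch of the split-half parcelling described above: the items of one
# test-anxiety factor are randomly divided into two fixed halves, and each
# participant's two half-sums serve as the observed indicators of that
# factor in the confirmatory factor model.
import random

def make_parcels(n_items, seed=0):
    """Randomly split item indices 0..n_items-1 into two fixed halves."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    half = n_items // 2
    return idx[:half], idx[half:]

def parcel_scores(item_scores, halves):
    """Return the two half-sums for one participant's item scores."""
    first, second = halves
    return (sum(item_scores[i] for i in first),
            sum(item_scores[i] for i in second))
```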


The figures in Table 2 show that the three-factor model explains the observed covariance matrices much better than the two-factor model. The χ²-values for the three-factor model are all significantly lower than the χ²-values for the two-factor model. The three-factor model should also be preferred on the basis of the values of the goodness-of-fit indices. Lack of self-confidence appears prominently as the third constituting factor of test anxiety. As was explained in the introduction, measures of the breadth of the zone of proximal development should add to the predictive validity of measures of independent achievement. De Leeuw and Meijer (1989) confirmed this. However, in the experiment that was conducted by these investigators, independent achievement scores and measures of the breadth of the zone of proximal development were derived from one and the same test. The data from the present experiment offer the opportunity to test the same hypothesis, based on independently gathered indicators of conventional test performance and the breadth of the zone of proximal development. Table 3 gives an overview of the correlations between independent mathematics achievement, as measured by the conventional mathematics tests, and the breadth of the zone of proximal development, as measured by the mathematics learning tests. The elapsed time between the pre- and posttests of both types of mathematical competence was approximately six to nine months.

It can be seen that the two ways of assessing mathematical competence correlate more strongly within method than between methods: the correlation between the conventional pre- and posttest and the correlation between the learning pre- and posttest are higher than any of the other correlations. It can be deduced immediately from Table 3 that the conventional mathematics pretest score is the best predictor of conventional mathematics posttest performance, since all the other test scores correlate lower with the latter criterion. However, this reveals no information concerning the possible additional or incremental predictive validity of the other mathematics test scores. The main interest concerns the prognostic value of the mathematics learning pretest. The contribution of this test and the other mathematics tests to the accuracy of the prediction of conventional mathematics posttest performance can be examined by applying multiple regression analysis. Table 4 summarises the results of this analysis.

All other mathematics test scores contribute significantly to the explanation of the variance of conventional mathematics posttest scores. The first predictor, conventional mathematics pretest performance, explains 44% of the criterion test variance. Inclusion

Table 2. EQS estimates concerning the factor structure of test anxiety

TO     Three-factor model                          Two-factor model
       χ²      d.f.   GOF    AGFI   p      NFI    χ²      d.f.   GOF    AGFI   p      NFI

I.     7.44    6      .987   .956   .282   .974   52.23   8      .911   .767   .000   .821
II.    12.67   6      .972   .902   .049   .934   55.79   8      .877   .677   .000   .710
III.   20.33   6      .952   .832   .002   .901   63.41   8      .851   .608   .000   .690
IV.    11.98   6      .966   .882   .062   .929   57.74   8      .837   .573   .000   .657

Note: TO = test occasion; I and II concern the mathematics pretests; III and IV concern the posttests; GOF = goodness of fit index; AGFI = goodness of fit index, adjusted for degrees of freedom; NFI = Bentler-Bonett normed fit index. The number of participants with complete data was 308 in the first session, 287 in the second, 205 in the third and 191 in the fourth.



of the second predictor, mathematics learning pretest performance, leads to an additional 7% of explained variance of the criterion variable (R² = .51). Although the contribution of mathematics learning posttest scores is also statistically significant, it merely concerns an additional 1% (R² = .52).
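These increments can be reproduced from the correlations in Table 3 with the standard formula for the squared multiple correlation of two standardised predictors; the 44%, 51% and 7% below are recomputed, not copied:

```python
# Correlations from Table 3 (criterion: conventional mathematics posttest)
r1 = .664   # conventional pretest with the criterion
r2 = .558   # learning pretest with the criterion
r12 = .506  # correlation between the two predictors

r2_pretest_only = r1 ** 2
# Squared multiple correlation for two standardised predictors:
# R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2)
r2_both = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)

print(round(r2_pretest_only, 2))            # -> 0.44
print(round(r2_both, 2))                    # -> 0.51
print(round(r2_both - r2_pretest_only, 2))  # -> 0.07
```

That the hand calculation agrees with Table 4 to two decimals also shows the reported regression was estimated on the same 160 complete cases as the correlation matrix.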

The results of this analysis corroborate the findings of De Leeuw et al. (1988). Aside from conventional mathematics pretest performance, mathematics learning pretest performance contributes significantly to the prediction of conventional mathematics posttest performance. This confirms that measures of the breadth of the zone of proximal development have additional prognostic value on top of measures of independent achievement. Whereas measures of the breadth of the zone of proximal development in the experiment conducted by De Leeuw et al. only added approximately 2% to the explained variance of the criterion variable, this is a rather more substantial 7% in the present experiment. This is a small contribution in absolute terms. However, it is comparable to the approximate average gain in prediction of achievement levels in school effectiveness research (Roeleveld, 1994). This gain implies that the accuracy of the prediction of future performance may be improved significantly for substantial numbers of students by applying dynamic assessment procedures.

Although these results confirm that the mathematics learning pretest scores contribute to the prediction of conventional mathematics posttest scores, the possibility that the conventional mathematics pretest scores have similar predictive power for mathematics learning posttest performance cannot yet be excluded. If this were indeed the case, the meaning of the incremental prognostic value of learning potential

Table 3. Correlations between scores on the mathematics tests

                                    Conventional   Mathematics   Conventional   Mathematics
                                    mathematics    learning      mathematics    learning
                                    pretest        pretest       posttest       posttest

Conventional mathematics pretest    1.000
Mathematics learning pretest         .506          1.000
Conventional mathematics posttest    .664           .558         1.000
Mathematics learning posttest        .489           .644          .536          1.000

Note: Correlations based on complete data from 160 participants.

Table 4. Prediction of conventional posttest performance

Predictor              B      β      T       p

Conventional pretest   .668   .476   7.163   < .001
Learning pretest       .516   .212   2.818   .006
Learning posttest      .368   .162   2.161   .032

Note: B = unstandardised regression coefficient; β = standardised regression coefficient; T = Student t-value associated with the size of the regression coefficients; p = significance of T. N = 160.



measures, operationalised as the total score on the mathematics learning pretest, should be questioned. After all, Vygotsky’s theory implies that tests of independent achievement should have little prognostic value towards future learning potential. Therefore, another stepwise multiple regression analysis was performed, wherein mathematics learning posttest scores were the dependent variable (see Table 5).

This analysis reveals that conventional pretest performance is not a significant predictor of learning posttest performance. Learning pretest performance is the main predictor, explaining 41% of the variance in learning posttest achievement. Addition of conventional posttest scores to the regression equation renders an extra 5% of explained variance in the criterion variable (R² = .46). Adding conventional pretest scores to the equation does not lead to a significantly higher multiple regression coefficient, indicating no substantial gain in explained variance of the criterion variable. Thus, on the one hand, we may conclude that mathematics learning test performance in grade three adds significantly to the prediction of conventional mathematics test performance in grade four, on top of the predictive validity of conventional mathematics test performance in grade three. On the other hand, mathematics learning test performance in grade four is chiefly predicted by mathematics learning test performance in grade three and subsidiarily by conventional mathematics test performance in grade four, but not by conventional test performance in grade three. In other words, mathematics learning pretest performance has incremental predictive validity vis-à-vis conventional mathematics posttest performance, but conventional mathematics pretest performance has no incremental predictive validity vis-à-vis mathematics learning posttest performance. It might therefore be ventured that learning potential measures are prognostic whereas measures of independent achievement are mainly retrospective. This is in accordance with Vygotsky’s claim that testing procedures designed to measure independent achievement mainly assess fully matured cognitive functions and structures, whereas testing procedures designed to measure learning potential also assess those cognitive functions and structures that are still developing.
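This second analysis can likewise be reproduced from the Table 3 correlations with the two-predictor formula for the squared multiple correlation, now with the mathematics learning posttest as criterion; a sketch:

```python
# Correlations from Table 3 (criterion: mathematics learning posttest)
r1 = .644   # learning pretest with the criterion
r2 = .536   # conventional posttest with the criterion
r12 = .558  # correlation between the two predictors

r2_learning_pretest_only = r1 ** 2
# R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2)
r2_both = (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)

print(round(r2_learning_pretest_only, 2))  # -> 0.41
print(round(r2_both, 2))                   # -> 0.46
```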

Although the incremental predictive validity of the learning potential construct has been confirmed by the regression analyses described above, these analyses do not provide sufficient evidence for the rejection of the hypothesis under consideration, since the anxiety bias factor was not yet taken into account. Structural equation modelling will shed more light on its tenability. The hypothesis claims that two factors, namely mathematical ability and test anxiety, should suffice to explain the discrepancies

Table 5. Prediction of mathematics learning posttest performance

Predictor               B      β      T       p

Learning pretest        .535   .501   7.083   < .001
Conventional posttest   .113   .256   3.627   < .001

Note: B = unstandardised regression coefficient; β = standardised regression coefficient; T = Student t-value associated with the size of the regression coefficients; p = significance of T. N = 160.



between the performance on mathematics learning tests and conventional mathematics tests.

The model that is in accordance with Vygotsky’s theory is shown in Appendix 1. The tenability of this model for explaining the data will be compared to the tenability of a more restricted model. The restricted model does not contain a factor that represents the construct of the zone of proximal development, whereas the Vygotskyan model does. Table 6 summarises goodness of fit indices for the restricted model and the Vygotskyan or learning potential model. Tests of the restricted model using the maximum likelihood estimation method reveal that it is highly improbable that it applies to the population (χ²(147) = 481.9, p < .001).

The distributions of test anxiety scores appeared to deviate from normality. Because maximum likelihood estimation may not be adequate for the analysis of data that are not multivariately normally distributed, the so-called iteratively reweighted generalised least squares (ERLS) estimation method was also applied. The latter estimation method is based on elliptical distribution theory (Bentler, 1993).

In the model that is implied by Vygotsky’s theory, a latent variable plays a role which is not present in the restricted model, namely mathematics learning potential in grade three. Only the split-part scores of the mathematics learning pretest are assumed to load on this factor (parameters λx94 and λx10–4 in the model depicted in Appendix 1). Furthermore, mathematics learning potential in grade three is assumed to predict mathematical ability in grade four aside from mathematical ability in grade three (parameter γ34, see Appendix 1). In comparison to the restricted model, three new parameters are introduced and consequently three degrees of freedom are lost. Goodness of fit indices in Table 6 are systematically in favour of the learning potential or Vygotskyan model. The question as to which model should be preferred can be resolved by comparing the associated χ²-values. If the difference between the χ²-values associated with both models is statistically significant when tested against the difference in degrees of freedom of both models, then the less restricted model, which in this case is the Vygotskyan model, should be accepted at the expense of the parsimonious model (Jöreskog & Sörbom, 1988). The 5% critical value of χ² with three degrees of freedom is 7.82, which is smaller than the difference between the χ²-values associated with both respective models. The differences between the χ²-values of the parsimonious model and the learning potential model in Table 6 based on ML and ERLS estimates thus show that the learning potential or Vygotskyan model should be accepted at the

Table 6. Goodness of �t for the base model and the Vygotskyan model

Model              Method   χ²       d.f.   NFI    NNFI   CFI

Restricted model   ML       481.94   147    .843   .849   .883
                   ERLS     438.05   147    .924   .933   .948
Vygotskyan model   ML       447.47   144    .854   .861   .894
                   ERLS     394.20   144    .932   .941   .955

Note: ML = maximum likelihood; ERLS = iteratively reweighted least squares; NFI = Bentler-Bonett normed fit index; NNFI = Bentler-Bonett non-normed fit index; CFI = Bollen comparative fit index.



expense of the restricted model, which was proposed by the hypothesis presented in the introduction. The restricted model is more parsimonious than the learning potential model in the sense that it excludes the role of mathematics learning potential. However, the learning potential model explains the covariance structure of the data significantly better than the restricted model, thus supporting the validity of the learning potential construct. Of particular interest is the significance of the regression parameter that represents the causal effect of mathematics learning potential in grade three on mathematical ability in grade four (γ34, see Appendix 1). This parameter reflects an essential corollary of Vygotsky’s theory. EQS estimates show that this parameter is statistically significant (unstandardised value = 1.586, standard error = .383, z = 4.146, standardised value = .457).

The learning potential or Vygotskyan model shows better fit than the restricted model under both ML and ERLS estimation. It may therefore be concluded that the model wherein the learning potential construct is an important constituent should be preferred over the model wherein this construct is omitted.
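The nested-model comparison can be verified directly from Table 6: the restricted model is nested in the Vygotskyan model (three parameters fewer), so the χ² difference is tested against 3 degrees of freedom. A minimal sketch:

```python
CRITICAL_5PCT_DF3 = 7.82  # 5% critical chi-square value at 3 degrees of freedom

# (restricted chi2, Vygotskyan chi2) per estimation method, from Table 6
fits = {"ML": (481.94, 447.47), "ERLS": (438.05, 394.20)}
delta_df = 147 - 144  # the Vygotskyan model adds three free parameters

for method, (restricted, vygotskyan) in fits.items():
    delta = restricted - vygotskyan
    significant = delta > CRITICAL_5PCT_DF3
    print(f"{method}: delta chi2({delta_df}) = {delta:.2f}, significant: {significant}")
```

Both differences (34.47 under ML, 43.85 under ERLS) far exceed 7.82, so the learning potential model is preferred under either estimation method.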

Conclusion

The hypothesis which was investigated claims that two factors, namely mathematical ability and latent anxiety, are sufficient to explain conventionally measured mathematics achievement, measures of mathematics learning test achievement and measures of test anxiety. Vygotsky’s theory, on the other hand, implies a competing hypothesis asserting that a third factor is necessary to explain the discrepancies between conventional measurements and measures derived from learning test procedures. This factor should represent the so-called ‘zone of proximal development’, a hallmark of intellectual development that cannot be assessed by using conventional testing procedures, since these are assumed to be of a retrospective rather than a prognostic nature.

Multiple regression analysis was used to inspect the prognostic value of the mathematics learning pretest. The accuracy of the prediction of conventional posttest performance was increased by using the mathematics learning pretest scores as an extra predictor aside from conventional pretest performance. Addition of mathematics learning pretest performance to the regression equation led to an extra 7% of explained variance of the criterion. This result underlines the prognostic value of the scores derived from the mathematics learning pretest. Mathematics learning posttest performance was predicted significantly only by mathematics learning pretest scores and conventional posttest scores. Addition of conventional pretest scores to the predictor set did not significantly improve the accuracy of the prediction of mathematics learning posttest performance. Although conventional pretest performance has prognostic value vis-à-vis conventional posttest performance, it appears that this is not so for the prediction of mathematics learning posttest performance. Mathematics learning pretest performance, on the other hand, appears to have incremental predictive validity towards conventional mathematics posttest performance. These are indications that conventional tests are of a retrospective rather than a prognostic nature towards learning test performance, whereas learning tests add to the accuracy of the prediction of conventional test performance.

The validity of the concept of the zone of proximal development was further



supported by the comparison of the Vygotskyan structural model with a more restrictive structural model. Incorporation of a latent variable representing the breadth of the zone of proximal development at the pretests into this base model significantly improved the fit to the observed covariance structure. Furthermore, the parameter in the Vygotskyan model that represents the additional predictive validity of the breadth of the zone of proximal development at the pretests towards mathematical ability at the posttests was statistically significant. We cannot be entirely confident that the mathematics learning tests used in this experiment actually measure the concept of learning potential as postulated by Vygotsky’s theory. However, it is obvious that the learning pretest measures some aspect of mathematical ability that is not measured by the conventional pretest. Furthermore, the addition of measures of this aspect to the predictor set improves the accuracy of the prediction of conventional posttest performance.

Discussion

It must be emphasised that the purpose of the present study was not merely to defy Vygotsky’s views on learning potential and the zone of proximal development. Rather, the experiment was designed as a crucial test: depending on the results, either the validity of the concept of the zone of proximal development, or the more parsimonious explanation of the source of the discrepancies between conventional and learning tests, as proposed by the first author, should be accepted. Thus, either result would constitute a contribution to psychological theory. It cannot be maintained, however, that the results of this study produced unequivocal evidence for the construct validity of the zone of proximal development. This study has only concentrated on the discriminant validity of procedures for measuring the zone of proximal development by contrasting conventional test procedures with a particular learning test procedure, namely offering assistance to participants during testing. In order to establish the status of the concept of the zone of proximal development more firmly, its concurrent validity should also be assessed. This means that the results of alternative procedures to measure the zone of proximal development should be investigated together with the approach used in this study. For this purpose, the traditional pretest-training-posttest design (Budoff, 1987; Resing, 1990) could be considered.

Notwithstanding these limitations, the results of this experiment corroborated earlier findings of De Leeuw et al. (1988) and have shown again that learning test procedures render reliable and valid results. Some researchers claim that measurements of the breadth of the zone of proximal development require radical changes in the conception of assessment (Wertsch, 1984; Smagorinsky, 1995). The results of this study show that relatively minor changes in assessment procedures, such as offering optional help during testing, can also result in valid estimates of the breadth of the zone of proximal development.

The possibility that measures of mathematics learning potential add to the predictive validity of conventional mathematics test scores because they are more strongly determined by intelligence than the latter cannot be excluded. Since help is available, learners need not depend almost entirely on previously acquired mathematical knowledge. If it is assumed that the role of intelligence will increase in proportion



to the diminution of the role of highly specific knowledge, performance on the learning tests will depend more strongly on intelligence than performance on the conventional tests. Since general intelligence is most probably quite important for the development of mathematical ability, aside from the mere accumulation of mathematical knowledge, it cannot be entirely excluded that the incremental predictive validity of learning tests is based on the measurement of intelligence rather than the measurement of the breadth of the zone of proximal development. However, one could also argue that the concept of the zone of proximal development constitutes an essential addition to the construct of intelligence. Elshout (1983) argued that the construct of intelligence should be elaborated so as to comprise ‘the level of our actually useful abilities, basic or complex, general or specific, depending on the demands that characterise the situation. One may think of these most relevant, directly useful capabilities as being those that define what Vygotsky (1963) has called the ‘‘zone of proximal development’’ ’ (p. 52). Meijer (1999) found that measures of achievements that were obtained after assistance had been offered contributed significantly to the prediction of future performance, even after level of intelligence was accounted for as well. The aspects of intelligence that were measured were figural reasoning and verbal analogies. Thus, the contribution of learning test performance to the prediction of future achievement appears to be partially independent of certain aspects of intelligence.

Another question that may come to mind is the generalisability of the learning test approach to educational domains of knowledge and skills other than mathematics. Up until now, the learning test approach to assessment has mainly been used in rather formal, abstract domains, such as intelligence testing (Badad & Budoff, 1974; Feuerstein, 1979; Resing, 1990) and artificial learning environments (Guthke, 1992). Although the present application of the learning test approach in the domain of school mathematics (see also De Leeuw & Meijer, 1989) is more strongly related to educational practice, one might still argue that mathematics is a rather formal, well-structured domain of knowledge. Following this line of argument, it may be questionable whether learning tests could be designed for less well-structured knowledge domains, such as history, geography or language teaching. The main constraint on the standardised learning test design in this respect is that multiple choice items were used. This was necessary in order to offer standardised assistance and feedback to participants when this was deemed appropriate. It has frequently been argued that multiple choice tests lack authenticity (Bennet & Ward, 1993) and should be replaced by, or at least supplemented with, other test types, such as essay assignments or learning projects. However, one could conceive of learning tests as exactly serving this supplementation, since they are specifically meant to add to information gathered on the progress of knowledge and skill acquisition by means of more conventional testing methods. Thus, there appears to be no reason to restrict the learning test approach to assessment in particular educational domains.

References

Arkin, R.M., & Schumann, D.W. (1984). Effects of corrective testing: An extension. Journal of Educational Psychology, 76(5), 835–843.

Arkin, R.M., & Walts, E.A. (1983). Performance implications of corrective testing. Journal of Educational Psychology, 75(4), 561–571.



Badad, E.Y., & Budoff, M. (1974). Sensitivity of learning potential measurement in three levels of ability. Journal of Educational Psychology, 66(3), 439–447.

Bennet, R.E., & Ward, W.C. (Eds.) (1993). Construction versus choice in cognitive measurement: Issues in constructed response, performance testing and portfolio assessment. Hillsdale, NJ: Lawrence Erlbaum.

Bentler, P.M. (1993). EQS structural equations program manual. Los Angeles: BMDP Statistical Software.

Blankstein, K.R., Toner, B.B., & Flett, G.L. (1989). Test anxiety and the contents of consciousness: Thought-listing and endorsement measures. Journal of Research in Personality, 23(3), 269–286.

Budoff, M. (1987). Measures for assessing learning potential. In C. Schneider Lidz (Ed.), Dynamic assessment (pp. 173–195). New York: Guilford Press.

Campione, J.C. (1989). Assisted assessment: A taxonomy of approaches and an outline of strengths and weaknesses. Journal of Learning Disabilities, 22(3), 151–165.

Elshout, J.J. (1983). Is measuring intelligence still useful? In S.B. Anderson & J.S. Helmick (Eds.), On educational testing (pp. 45–56). San Francisco: Jossey-Bass.

Ferrara, R.A. (1987). Learning mathematics in the zone of proximal development: The importance of flexible use of knowledge. Unpublished doctoral dissertation, University of Illinois, Champaign.

Feuerstein, R. (1979). The dynamic assessment of retarded performers: The learning potential assessment device, theory, instruments and techniques. Baltimore, MD: University Park Press.

Grigorenko, E.L., & Sternberg, R.J. (1998). Dynamic testing. Psychological Bulletin, 124(1), 75–111.

Groot, A.D. de (1981). Methodologie. ’s-Gravenhage: Mouton.

Guthke, J.J. (1992). Learning tests: The concept, main research findings, problems and trends. In J.S. Carlson (Ed.), Advances in cognition and educational practice (pp. 34–52). London: JAI Press.

Jöreskog, K.G., & Sörbom, D. (1988). LISREL 7: A guide to the program and applications. Chicago: SPSS.

Karpov, Y.V., & Haywood, H.C. (1998). Two ways to elaborate Vygotsky’s concept of mediation. American Psychologist, 53(1), 27–36.

Leeuw, L. de, & Meijer, J. (1989). Coaching students during computer guided transfer problem solving: The construction of a transfer test containing items with incremental help. In P. Span, E. De Corte, & B. Van Hout-Wolters (Eds.), Onderwijsleerprocessen; Strategieën voor de verwerking van informatie (pp. 37–45). Amsterdam: Swets & Zeitlinger.

Leeuw, L. de, Meijer, J., Perrenet, J.C., & Groen, W.E. (1988). De constructie en validering van een transfertest voor wiskunde-onderwijs met gebruikmaking van items met gefaseerde hulp. Amsterdam: Vrije Universiteit.

Liebert, R.M., & Morris, L.W. (1967). Cognitive and emotional components of test anxiety: A distinction and some initial data. Psychological Reports, 20, 975–978.

Meijer, J. (1993). Learning potential, personality characteristics and test performance. In J.H.M. Hamers, K. Sijtsma, & A.J.J.M. Ruijssenaars (Eds.), Learning potential assessment; Theoretical, methodological and practical issues (pp. 341–362). Amsterdam: Swets & Zeitlinger.

Meijer, J. (1996). Learning potential and fear of failure: A study into the predictive validity of learning potential and the role of anxious tendency. Doctoral dissertation, University of Amsterdam. Amsterdam: Guus Bauer.

Meijer, J. (1999). Leerpotentieel en intelligentie: voorzetting van een debat over een complexe problematiek. In R. Hamel, M. Elshout-Mohr, & M. Milikowski (Eds.), Meesterschap. Zestien stukken over intelligentie, leren, denken en probleemoplossen voor Jan J. Elshout (pp. 123–134). Amsterdam: Vossiuspers UAP.

Morris, L.W., Davis, M.A., & Hutchings, C.A. (1981). Cognitive and emotional components of anxiety: Literature review and a revised worry-emotionality scale. Journal of Educational Psychology, 73(4), 541–555.



Perrenet, J.C., & Groen, W. (1987). Transfertest halfweg. Euclides.

Resing, W. (1990). Intelligentie en leerpotentieel. Doctoral dissertation, Vrije Universiteit. Amsterdam: Swets & Zeitlinger.

Roeleveld, J. (1994). Verschillen tussen scholen. Kenmerken, effectiviteit en stabiliteit van onderwijsinstellingen in Nederland. Doctoral dissertation, Universiteit van Amsterdam.

Sarason, I.G. (1980). Introduction to the study of test anxiety. In I.G. Sarason (Ed.), Test anxiety: Theory, research, and applications (pp. 3–14). Hillsdale, NJ: Lawrence Erlbaum.

Schneider Lidz, C. (1987). Dynamic assessment: An interactional approach to evaluating learning potential. New York: Guilford Press.

Smagorinsky, P. (1995). The social construction of data: Methodological problems of investigating learning in the zone of proximal development. Review of Educational Research, 65(3), 191–212.

Spielberger, C.D. (1966). Anxiety and behavior. New York: Academic Press.

Spielberger, C.D. (1975). Anxiety: State-trait-process. In C.D. Spielberger & I.W. Sarason (Eds.), Stress and anxiety (pp. 115–144). Washington, DC: Hemisphere.

Vygotsky, L.S. (1963). Learning and mental development at school age. In B. Simon & J. Simon (Eds.), Educational psychology in the USSR (pp. 21–34). London: Routledge & Kegan Paul.

Vygotsky, L.S., Cole, M., John-Steiner, V., Scribner, S., & Souberman, E. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.

Wertsch, J.V. (1984). Culture, communication and cognition: Vygotskian perspective. New York: Cambridge University Press.

Received 12 August 1999; revised version received 27 May 2000

Appendix 1

Structural equation model for analysing the central hypothesis

In the Vygotskyan as well as in the restricted model, 20 observed variables are involved. Repeated measurements of emotionality, worry and lack of self-confidence are conceived of as variables that each indicate a distinct aspect of test anxiety. Mathematics performance level was assessed twice, first in grade three and approximately nine months later in grade four, both times using a conventional test as well as a learning test that contained optional help. Students thus took four mathematics tests in total, and correspondingly test anxiety was measured four times (see Table 1). Since every test anxiety measurement comprised the three aspects of emotionality, worry and lack of self-confidence on each occasion, 12 of the 20 observed variables in the models concern test anxiety.

Instead of using the four observed scores on the mathematics tests, sum scores were calculated for two randomly chosen subsets of items for each mathematics test (labelled A and B in Figure 4), resulting in eight observed variables representing mathematics achievement on four different occasions. The factor loadings in the measurement model are given as λijk, where i = x or y (x if the variable is an independent variable, y if a dependent variable is concerned), j = 1..10, depending on the number whereby the observed variable is designated, and k = 1..4, depending on the number of the latent variable or factor under consideration. The vectors θε (ε1 to ε10) and θδ (δ1 to δ10) respectively contain the error variances of the dependent observed and the independent observed variables. There are four latent variables or factors in the model for the first episode of the longitudinal experiment. Two of these factors concern test anxiety, respectively before taking the conventional mathematics pretest and before taking the mathematics learning pretest. The third factor represents mathematical ability and the fourth factor represents mathematics learning potential. All observed emotionality, worry and lack of self-confidence scores are assumed to load on the corresponding test anxiety factors. For example, emotionality, worry and lack of self-confidence ratings reported by participants before the mathematics learning posttest in grade four are assumed to load on the factor designated as test anxiety before learning (L) test 2 (parameters λy42, λy52 and λy62). All split-part measures of mathematics performance are also



regressed on their corresponding factors, representing respectively mathematical ability in grade three (mathematical ability 1) and grade four (mathematical ability 2). Apart from these factor loadings, it is also assumed in the model that both performance measures on the conventional mathematics pretest (C-test 1 performance A and B) load on the factor designated as test anxiety before conventional (C) test 1 (parameters λx71 and λx81). Likewise, it is assumed that both performance measures on the mathematics learning pretest (L-test 1 performance A and B) load on the test anxiety factor before learning (L) test 1 (parameters λx92 and λx10–2). A similar pattern of factor loadings is assumed to apply to the mathematics posttests (parameters λy71, λy81, λy92 and λy10–2). This pattern of factor loadings in the model is a representation of the idea that the mathematics tests measure mathematical ability as well as test anxiety.
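The split-part scoring described above — randomly assigning each test's items to subset A or B and summing each subset — can be sketched as follows (the item scores and the seed are hypothetical, for illustration only):

```python
import random

def split_part_scores(item_scores, seed=0):
    """Randomly assign items to halves A and B and return the two sum scores."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    indices = list(range(len(item_scores)))
    rng.shuffle(indices)
    half = len(indices) // 2
    score_a = sum(item_scores[i] for i in indices[:half])
    score_b = sum(item_scores[i] for i in indices[half:])
    return score_a, score_b

# Hypothetical dichotomous item scores for one participant on one mathematics test
items = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
a, b = split_part_scores(items)
assert a + b == sum(items)  # the two part scores always partition the total
print(a, b)
```

Each mathematics test thus yields two observed indicators (A and B) for its latent factor, which is what allows the loadings and error variances in the measurement model to be identified.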

The structural part of the model reflects the assumption that there is no correlation between test anxiety and mathematical ability on the latent level. That is to say, the correlation between test anxiety before conventional test 1 and mathematical ability 1 (φ31) as well as the correlation between test anxiety before learning test 1 and mathematical ability 1 (φ32) are constrained to zero. Parameter φ21 represents the disattenuated correlation between test anxiety before taking the first conventional test and test anxiety before taking the first learning test. Test anxiety factor scores associated with the mathematics posttests are regressed on test anxiety factor scores associated with the mathematics pretests. In the model, these regression coefficients are represented by the parameters γ11, γ21, γ12 and γ22. Parameters ζ1 and ζ2 represent the unexplained variances of the dependent test anxiety factors, i.e., the proportions of variance of posttest test anxiety factor scores that cannot be predicted by the variance of the pretest test anxiety factor scores. Furthermore, the residuals of the predicted dependent test anxiety factor scores are assumed to correlate (parameter ζ21). Parameter ζ21 represents the covariance between both test anxiety factors on the posttest occasions that cannot be explained by both test anxiety factors on the pretest occasions. The paths leading from the lack of self-confidence variables before the conventional and mathematics learning posttests to the mathematical ability factor in grade four (parameters ν33 and ν36) represent a non-standard feature of the model. These parameters represent the negative influence of lack of self-confidence on mathematical ability in the second phase of the longitudinal experiment. Since the test anxiety questionnaires were administered

Figure 4. Structural model as implied by Vygotsky’s theory



before the mathematics tests, it is possible that lack of self-confidence directly worsens mathematics performance. Pilot experiments had in fact rendered results supporting this conjecture (Meijer, 1996).
