SCHOOL FACTORS EXPLAINING STUDENT ACHIEVEMENT: TESTING THE DYNAMIC MODEL OF EDUCATIONAL EFFECTIVENESS
B.P.M. CREEMERS¹ & L. KYRIAKIDES²
¹ Faculty of Behavioural and Social Sciences, University of Groningen, The Netherlands
² Department of Education, University of Cyprus, Cyprus
ABSTRACT
This paper presents results of a longitudinal study in which 50 schools, 108 classes and 2369 Cypriot
pupils participated. The study provides evidence about the validity of the dynamic model which: a) is
multilevel in nature, b) is based on the assumption that the relation of some factors with achievement
may be curvilinear, and c) defines relations among the effectiveness factors. Each factor is measured
by taking into account five dimensions: frequency, focus, stage, quality and differentiation. The paper
refers to the methods used to test the model at the school level by measuring school effectiveness in
mathematics, language, and religious education. The findings of the study are presented. Implications
for the development of the dynamic model are drawn.
INTRODUCTION
The most important criticism of Educational Effectiveness Research (EER) is that there is a shortage
of rational models from which researchers can build theory. The problem is aggravated by infrequent
use of whatever models exist (Bosker & Scheerens, 1994). In this context, a dynamic model of EER
has recently been developed (Creemers & Kyriakides, 2007). The essential characteristics of the
proposed dynamic model are as follows. First, the model belongs to the integrated approach to
educational effectiveness modeling (Scheerens & Bosker, 1997) since it refers to multiple factors of
effectiveness which operate at different levels. It is, therefore, multi-level in nature. Second, it is
expected that some factors at the same level are related to each other. It is, therefore, considered
important to specify groupings of factors. Third, although there are different factors and groupings of
factors, it is assumed that each factor can be defined and measured using five dimensions: frequency,
focus, stage, quality, and differentiation. This is a way to consider each factor as a multidimensional
construct and at the same time to be in line with the parsimonious nature of the model. Finally, the
model is designed in a way that takes into account the possibility that a non-linear relationship
between some factors and the outcomes may exist. This refers to the possibility of searching for
optimal values of the various dimensions of the factors and optimal combinations between factors.
A criticism that may arise from the theoretical background and the outline of the dynamic
model concerns the complexity of the model and the difficulties of testing the model empirically. For
example, it can be claimed that the model is not parsimonious since it contains more factors and more
dimensions than previous models, and it is therefore not possible to identify priorities for educational
improvement. Moreover, the inclusion of different dimensions for measuring each factor complicates
the data collection and the analysis. However, the results of the first phase of a longitudinal study
investigating the validity of the model at the classroom level reveal that the dynamic model is a
theoretical model that can be put to the test (Kyriakides & Creemers, 2006). Moreover, the results of
this study provided support for the construct validity of the five measurement dimensions of most
effectiveness factors at the classroom level. This might reveal a weakness of previous effectiveness
studies focused on the classroom level, which usually treated frequency as the only measurement
dimension of effectiveness factors. Furthermore, this study revealed the added value of using five
dimensions to measure the classroom level factors for explaining variation of student achievement
gains in different outcomes. Testing the validity of the model at the classroom level can be seen as the
starting point for the development and the testing of the dynamic model at the school and the system
levels. The second phase of this longitudinal study attempts to provide empirical evidence of the
model at the school level. Thus, this paper refers to the school level factors of the dynamic model and
presents the main results of the second phase of the longitudinal study investigating the validity of the
dynamic model.
THE DYNAMIC MODEL: FACTORS OPERATING AT THE SCHOOL LEVEL
The definition of the school level is based on the assumption that factors at the school level are
expected to have not only direct but also, and mainly, indirect effects on student achievement.
School factors are expected to influence classroom-level factors, especially the teaching practice. This
assumption is based on the fact that EER studies show that the classroom level is more significant
than the school and the system level (e.g., Kyriakides et al., 2000; Teddlie & Reynolds, 2000; Yair,
1997) and that defining factors at the classroom level is seen as a prerequisite for defining the school
and the system level (Creemers, 1994). Therefore, the dynamic model refers to factors at the school
level which are related to the same key concepts of quantity of teaching, provision of learning
opportunities, and quality of teaching which were used to define classroom-level factors (see
Creemers & Kyriakides, 2006). Specifically, emphasis is given to the following two main aspects of
the school policy which affect learning at both the level of teachers and students: a) school policy for
teaching and b) school policy for creating a learning environment at school. Guidelines are seen as
one of the main indications of school policy and this is reflected in the way each school level factor is
defined (see Creemers & Kyriakides, 2007). However, in using the term guidelines we refer to a range
of documents, such as staff meeting minutes, announcements, and action plans, which make the
policy of the school more concrete to the teachers and other stakeholders. It should also be
acknowledged that this factor does not imply that each school should simply develop formal
documents to establish the policy. The factors concerned with school policy mainly refer to the
actions taken by the school to help teachers and other stakeholders have a clear understanding of what
they are expected to do. Support offered to teachers and other stakeholders to implement the
school policy is also an aspect of these two overarching factors. The term policy is also used in a
similar way to describe a relevant overarching factor at the context level concerned with the
national/regional educational policy (see Creemers & Kyriakides, 2007).
Based on the assumption that the essence of a successful organization in the modern world is the
search for improvement (Barber, 1986; Kyriakides & Campbell, 2004), we also examine the
processes and the activities which take place in the school in order to improve the teaching practice
and its learning environment. For this reason, the processes which are used to evaluate the school
policy for teaching and the learning environment of the school are investigated. Thus, the following
four overarching factors at the school level are included in the model:
a. school policy for teaching and actions taken for improving teaching practice,
b. evaluation of school policy for teaching and of actions taken to improve teaching,
c. policy for creating a school learning environment and actions taken for improving the school
learning environment, and
d. evaluation of the school learning environment.
It is important to note that leadership is not considered as a school-level factor. This can be attributed
to the fact that three current meta-analyses of studies investigating the impact of the principal’s
leadership on student achievement confirm earlier research findings on the limitations of the direct
effects approach to linking leadership with student achievement (Creemers, Kyriakides, Antoniou, &
Demetriou, 2007; Scheerens, Seidel, Witziers, Hendriks, & Doornekamp, 2005; Witziers, Bosker, &
Kruger, 2003). Similar results are obtained from the few studies which were conducted in order to
measure indirect effects of leadership on student achievement (Leithwood & Jantzi, 2006). Therefore,
the model is not concerned with who is in charge of designing and/or implementing the school policy
but with the content of the school policy and the type of activities that take place in school. This
reveals one of the major assumptions of the model which is not focused on individuals as such but on
the effects of the actions which take place at classroom/school/context levels. This holds for the
students, teachers, principals and policy makers. Our decision is also consistent with the way
classroom level factors are measured: instead of measuring the teaching style of the teacher, we
focus on the actual behavior of the teacher in the classroom. Similarly, instead of measuring the
leadership style of a principal, we look at the impact of the end result of leadership (e.g., the
development of school policy on teaching or the evaluation of school policy). As far as the context
level factors are concerned, the dynamic model does not refer to the leadership style of policy makers
or to the use of specific approaches in administering the system but it refers to the content of the
national policy which reveals the end result of the activities that policy makers undertake. A brief
description of each overarching school factor of the dynamic model is provided below.
A) School policy for teaching and actions taken for improving teaching
Since the definition of the dynamic model at the classroom level refers to factors related to the key
concepts of quality, time on task, and opportunity to learn, the proposed model attempts to investigate
aspects of school policy for teaching associated with quantity of teaching, provision of learning
opportunities, and quality of teaching (Creemers & Kyriakides, 2007). Actions taken for improving
the above three aspects of teaching practice, such as the provision of support to teachers for
improving their generic teaching skills, are also taken into account. More specifically, the following
aspects of school policy on quantity of teaching are taken into account:
• school policy on the management of teaching time (e.g., lessons start on time and finish on
time; there are no interruptions of lessons for staff meetings and/or for preparation of school
festivals and other events),
• policy on student and teacher absenteeism,
• policy on homework, and
• policy on lesson schedule and timetable.
School policy on provision of learning opportunities is measured by looking at the extent to which the
school has a mission concerning the provision of learning opportunities which is reflected in its policy
on curriculum. We also examine school policy on long-term and short-term planning and school
policy on providing support to students with special needs. Furthermore, the extent to which the
school attempts to make good use of school trips and other extra-curricular activities for
teaching/learning purposes is investigated. Finally, school policy on the quality of teaching is seen as
closely related to the eight classroom-level factors of the dynamic model, which refer to the
instructional role of teachers.
Therefore, the way school policy for teaching is examined reveals that effective schools are
expected to make decisions on maximizing the use of teaching time and the learning opportunities
offered to their students. In addition, effective schools are expected to support their teachers in their
attempt to help students learn by using effective teaching practices, as these are defined by the
classroom-level factors of the model. In this context, the definition of the first overarching school-
level factor is such that we can identify the extent to which: a) the school makes sure that teaching
time is offered to students, b) learning opportunities beyond those offered by the official curricula are
offered to the students, and c) the school attempts to improve the quality of teaching practice.
Therefore, we measure the impact of the school on the three major constructs of effectiveness
research concerned with time on task, opportunity to learn, and quality of teaching.
B) Evaluation of school policy for teaching and of actions taken to improve teaching
Creemers (1994) claims that control is one of the major principles operating in generating educational
effectiveness. This implies that goal attainment and the school climate should be evaluated. Since
studies investigating the validity of the model provided empirical support for the importance of this
principle (e.g., de Jong et al., 2004; Kyriakides et al., 2000; Kyriakides, 2005), it was decided to treat
evaluation of policy for teaching and of other actions taken to improve teaching practice as an
overarching factor operating at school level. Thus, the measurement dimensions of this factor are
briefly described below.
Frequency: First, frequency is measured by investigating how many times during the school
year the school collects evaluation data concerning its own policy for teaching and the actions taken
for improving teaching. Emphasis is also given to the sources of evaluation data which are used. This
is attributed to the fact that studies on school evaluation reveal that evaluators should employ a
multidimensional approach in collecting data on school and teacher effectiveness (e.g., Danielson &
McGreal, 2000; Johnson, 1997; Kyriakides & Campbell, 2004; Nevo, 1995), as comparisons of
various sources might increase the internal validity of the evaluation system (Cronbach, 1990).
Moreover, the involvement of all constituencies in the evaluation process may foster participatory
policies that result in less stakeholder criticism of the evaluation system (Patton, 1991; van den Berg,
& Ros, 1999). This argument is also in line with the fact that EER revealed that multisource
assessments that tap the collective wisdom of supervisors, peers, students, parents, and others provide
the opportunity to more effectively improve teaching and document its quality (Wilkerson, Manatt,
Rogers, & Maughan, 2000). Thus, these two indicators of the frequency dimension help us identify
the extent to which a systematic evaluation of school policy for teaching and of actions taken to
improve teaching takes place.
Focus: The focus dimension refers to the aspects of the school policy for teaching which are
evaluated. Evaluation of school policy for teaching could refer to the properties of the school policy
(e.g., clear, concrete, in line with the literature), its relevance to the problems which teachers and
students have to face, and its impact on school practice and student outcomes (Kyriakides et al.,
2006). It is also examined whether each school evaluates not only the content of the policy for
teaching and of the actions taken to improve teaching practice but also the abilities of people who are
expected to implement the policy. Moreover, the specificity aspect of the focus dimension is
measured by looking at the extent to which information gathered from the evaluation is too specific
(e.g., teacher X cannot do this) or too general (e.g., teachers are not able to teach effectively). The relation
between student outcomes and specificity of evaluation is expected to be curvilinear. Research on
school self-evaluation reveals that data collected should not be too specific, nor apportion blame to
any individual for the fact that the school is not particularly effective; such an approach serves the
summative purpose of evaluation and does not help the schools to take decisions on how to improve
their policy (e.g., Fitz-Gibbon, 1996; Hopkins, 1989; Patton, 1991; Visscher & Coe, 2002). At the
same time, information gathered from evaluation should not be too general but should be focused on
how to influence decision-making, especially the process of allocating responsibilities to school
partners in order to introduce a plan for improving the effectiveness of their school (Kyriakides &
Campbell, 2004; Macbeath, 1999; Meuret & Morlaix, 2003). Finally, focus is examined by
investigating the purposes for which the evaluation data are collected, especially whether evaluation
is conducted for formative or summative reasons (Black & Wiliam, 1998). Studies on EER reveal that
effective schools are those which use evaluation data for formative reasons (e.g., Harlen & James,
1997; Kyriakides, 2005; Scheerens & Bosker, 1997; Teddlie & Reynolds, 2000; Worthen, Sanders, &
Fitzpatrick, 1997).
Stage: The stage dimension of this factor is examined by looking at the period in which
evaluation data are collected. Schools could either conduct evaluation at the end of certain periods
(e.g., end of semester) or establish evaluation mechanisms which operate on a continuous basis during
the whole school year. The dynamic model is based on the assumption that a continuous model of
school evaluation is needed in order to allow schools to adapt their policy decisions to the needs of
different groups of school stakeholders (Hopkins, 2001; Jordan, 1977; Kyriakides, 2004). This
assumption is also in line with the main principles upon which the comprehensive model of
educational effectiveness is based (Creemers, 1994). We also expect the schools to review their own
evaluation mechanisms and adapt them in order to collect appropriate and useful data (see also
Cousins & Earl, 1992; Torres & Preskill, 2001; Preskill et al., 2003; Thomas, 2001).
Quality: Quality is measured by looking at the psychometric properties of the instruments
(i.e., reliable, valid, useful) used to collect data on school policy for teaching and actions taken to
improve teaching (Cronbach, 1990; Kane, 2001). We emphasize here that validity is a critically
important issue, and for this reason, we discuss below how schools could deal with this important
element of their evaluation policy in order to increase their effectiveness. The term “validity” denotes
the scientific utility of a measuring instrument, broadly statable in terms of how well it measures what
it purports to measure (Nunnally & Bernstein, 1994). Therefore, the quality of the evaluation factor is
measured by specifying how well each evaluation instrument meets the standards by which it is
judged. However, contemporary discussion of validity emphasizes two important precepts that are
relatively recent in the evolution of validity theory. First, Madaus and Pullin (1991) argue that
evaluation instruments do not have universal validity; they are valid only for specific purposes.
Moreover, Sax (1997) claims that validity is defined as the extent to which measurements are useful
in making decisions and providing explanations relevant to a given purpose. To the extent that
measurements fail to improve effective decision-making by providing misleading or irrelevant
information, they are invalid. No matter how reliable they are, measurements lack utility if they are
not valid for some desired purpose. In this context, we argue that more emphasis should be given to
the interpretive validity of the instruments rather than to their traditional forms of validity, such as the
construct and content validity of the instruments. The interpretation should be validated and not the
test or the test score. Thus, the measurement of the quality of this factor is expected to include an
evaluation of the consequences of test uses, and proposed uses should be justified by illustrating that
the positive consequences outweigh the anticipated negative consequences (AERA, APA, & NCME,
1999, 1.19-1.25). This implies that the measure of the quality of the evaluation of school policy on
teaching is seen as an integrated evaluation of the interpretation of the school evaluation mechanisms
rather than as a collection of techniques.
Differentiation: Finally, the differentiation dimension is measured by looking at the extent to
which the school places more emphasis on evaluating those aspects of its policy for teaching that
refer to its major weaknesses. For example, if the policy on homework is considered problematic,
the school may decide to collect data on homework more often and in greater depth than data on
any other aspect of its policy for teaching.
C) School Policy for creating a School Learning Environment (SLE) and actions taken for
improving the SLE
School climate factors have been incorporated in effectiveness models in different ways. Stringfield
(1994) defines the school climate very broadly as the total environment of the school. This makes it
difficult to study specific factors of the school climate and examine their impact on student
achievement (Creemers & Reezigt, 1999b). On the other hand, Creemers (1994) defines climate
factors more narrowly and expects them to exert influence on student outcomes in the same way as
the effectiveness factors do. The proposed dynamic model refers to the extent to which a learning
environment has been created in the school. This element of school climate is seen as the most
important predictor of school effectiveness since learning is the key function of a school. Moreover,
EER has shown that effective schools are able to respond to the learning needs of both teachers and
students and to be involved in systematic changes of the school’s internal processes in order to
achieve educational goals more effectively in conditions of uncertainty (Harris, 2001). In this context,
the following five aspects which define the learning environment of the school are taken into account:
a) student behavior outside the classroom,
b) collaboration and interaction between teachers,
c) partnership policy (i.e., the relations of school with community, parents, and
advisors),
d) provision of sufficient learning resources to students and teachers, and
e) values in favor of learning.
The first three aspects refer to the rules which the school has developed for establishing a learning
environment inside and outside the classrooms. Here the term learning does not refer exclusively to
student learning. For example, collaboration and interaction between teachers may contribute to
their professional development (i.e., the learning of teachers) but may also have an effect on teaching
practice and thereby improve student learning. The fourth aspect refers to the policy on providing
resources for learning. The availability of learning resources in schools may affect not only
student learning but may also encourage the learning of teachers. For example, the availability of
computers and software for teaching Geometry may contribute to teacher professional development
since it encourages teachers to find ways to make good use of the software in their teaching practice
and thereby to become more effective. The last aspect of this overarching factor is concerned with the
strategies which the school has developed in order to encourage teachers and students to develop
positive attitudes towards learning. The fact that the importance of the school climate is seen only in
relation to the extent to which there is a learning environment within the school implies that values
not related to learning are not treated as effectiveness factors, although they may be related to the
outcomes of schooling.
Following an approach similar to the one used for school policy on teaching, the proposed
dynamic model attempts to measure the school policy for creating a School Learning Environment
(SLE). Actions taken for improving the SLE beyond the establishment of policy guidelines are also
taken into account. More specifically, actions taken for improving the SLE can either be directed at:
a) changing the rules in relation to the first three aspects of the SLE factor mentioned above, b)
providing educational resources (e.g., teaching aids, educational assistance, new posts), or c) helping
students/teachers to develop positive attitudes towards learning. For example, a school may have a
policy promoting teacher professional development, but this might not be enough, especially if some
teachers do not consider professional development as an important issue. In this case, actions should
be taken to help teachers develop positive attitudes towards learning, which may help them become
more effective.
D) Evaluation of the school learning environment
Since school climate is expected to be evaluated (Creemers, 1994), the dynamic model also refers to
the extent to which a school attempts to evaluate its learning environment. A similar approach to the
one used to measure the school-level factor concerning the evaluation of school policy for teaching is
used in order to measure the factor focused on the evaluation of the school learning environment (see
Creemers & Kyriakides, 2007).
Beyond presenting the proposed model at the school level, this paper also provides some
supportive material for the validity of the dynamic model. We do so because many theories die not
through any demonstrated lack of merit but because even their creators failed to provide evidence
supporting the ideas included in their theory. Thus,
the next section refers to the methods used to test the validity of the dynamic model whereas the
fourth part of the paper illustrates the results of the second phase of a study conducted in Cyprus in
order to test the validity of the model at the school level. Finally, implications of findings for the
development of educational effectiveness research are drawn in the last section of this paper.
METHODS
The studies which have been used in order to test the validity of Creemers’ model (de Jong et al.,
2004; Kyriakides, 2005; Kyriakides et al., 2000; Kyriakides & Tsangaridou, 2004) reveal the
importance of using multiple measures of effectiveness factors and of conducting longitudinal studies
rather than case studies in order to identify the relations which exist between the various measures of
each factor and student achievement gains. Thus, the longitudinal study undertaken in Cyprus does
not only investigate educational effectiveness in mathematics and language but also takes into
account measures concerning the main aims of religious education. As a consequence, the extent to
which the dynamic model can be considered a generic model can be
tested. Specifically, the second phase of the study attempts to identify:
a) the extent to which each of the school level factors can be defined by reference to the
five dimensions of the model, and
b) the type(s) of relations that each factor and its dimensions have with student learning
outcomes in mathematics, language and religious education.
A) Participants
Stratified sampling (Cohen, Manion, & Morrison, 2000) was used to select 52 out of 191 Cypriot
primary schools, but only 50 schools participated in the study. All the year 5 pupils (n=2503) from
each class (n=108) of the school sample were chosen. The chi-square test did not reveal any
statistically significant difference between the research sample and the population in terms of pupils’
sex. Moreover, the t-test did not reveal any statistically significant difference between the research
sample and the population in terms of class size and length of teaching experience of the
teacher sample. It may be claimed that a nationally representative sample of Cypriot year 5 pupils was
drawn.
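The representativeness check on pupils' sex can be sketched as follows; the counts and population proportions below are hypothetical illustrations rather than the study's actual figures:

```python
# Hypothetical boys/girls counts in the research sample and the
# (assumed) population proportions; not the study's actual data.
sample_counts = [1230, 1273]
population_props = [0.50, 0.50]

n = sum(sample_counts)
expected = [p * n for p in population_props]

# Chi-square goodness-of-fit statistic (1 degree of freedom here)
chi2 = sum((o - e) ** 2 / e for o, e in zip(sample_counts, expected))

# Critical value for alpha = 0.05 with df = 1
CRITICAL_95_DF1 = 3.841
no_significant_difference = chi2 < CRITICAL_95_DF1
```

With these illustrative counts the statistic is about 0.74, well below 3.841, so the test would give no evidence that the sample differs from the population in terms of pupils' sex.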
As far as the dependent variables of this study are concerned, data on pupils’ achievement in
mathematics, Greek language and religious education were collected by using external forms of
assessment. Written tests were administered to our student sample when they were at the beginning of
year 5 (i.e., October 2004), at the end of year 5 (i.e., May 2005), and at the end of year 6 (i.e., May
2006). Since this paper investigates the extent to which school level factors explain variation in
student achievement gains during the second year of this longitudinal study, information on the
relevant dependent and explanatory variables is provided below. It is, however, important to note
here that data on achievement both at the end of year 5 and at the end of year 6 were available for
2369 out of the 2503 students. This means that our missing cases were less than 7% of the whole
sample of students.
B) Dependent Variables: Student achievement in mathematics, Greek language and religious
education
As far as the dependent variables of this study are concerned, data on student achievement in
mathematics, Greek language and religious education were collected by using external forms of
assessment designed to assess knowledge and skills in mathematics, Greek language and religious
education which are identified in the Cyprus Curriculum for year 6 students (Ministry of Education,
1994). Student achievement in relation to the affective aims included in the Cyprus curriculum for
religious education was also measured. Criterion-referenced tests are more appropriate than norm-
referenced tests for relating achievement to what a pupil should know and for testing competence
rather than general ability. Thus, criterion-referenced tests were constructed and pupils were asked to
answer at least two different tasks related to each objective in the teaching program of mathematics,
Greek language, and religious education for year 6 pupils. Scoring rubrics, used to differentiate
among four levels of task proficiency (0-3) on each task, were also constructed. Thus, ordinal data
about the extent to which each child had acquired each skill included in the year 6 curriculum of
mathematics, Greek language, and religious education were collected. The three written tests in
mathematics, Greek language and religious education were administered to the students of our sample
at the end of school year 2005-2006. The construction of the tests was subject to controls for
reliability and validity. Specifically, the Extended Logistic Model of Rasch (Andrich, 1988) was used
to analyze the emerging data in each subject separately. Four scales, which refer to student knowledge
in mathematics, Greek language and religious education and to student attitudes towards religious
education were created and analyzed for reliability, fit to the model, meaning and validity. Analysis of
the data revealed that each scale had relatively satisfactory psychometric properties. Specifically, for
each scale the indices of cases (i.e., students) and item separation were higher than 0.84 indicating
that the separability of each scale was satisfactory (Wright, 1985). Moreover, the infit mean squares
and the outfit mean squares of each scale were near one and the values of the infit t-scores and the
outfit t-scores were approximately zero. Furthermore, each analysis revealed that all items had item
infit within the range 0.83 to 1.20. It can therefore be claimed that there was
a good fit to the model (Keeves & Alagumalai, 1999). Thus, for each student four different scores for
his/her achievement at the end of year 6 were generated, by calculating the relevant Rasch person
estimate in each scale.
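The item-fit screening described above can be sketched as follows; the infit values are hypothetical stand-ins for the output of Rasch-analysis software, not the study's actual estimates:

```python
# Hypothetical item infit mean squares for one scale (as reported by
# Rasch-analysis software); not the study's actual values.
item_infit = {
    "item_01": 0.91,
    "item_02": 1.05,
    "item_03": 0.83,
    "item_04": 1.20,
    "item_05": 0.97,
}

# Acceptance window reported in the study: infit mean squares within
# 0.83 to 1.20 (values near 1 indicate a good fit to the Rasch model)
LOW, HIGH = 0.83, 1.20
misfitting = [item for item, infit in item_infit.items()
              if not (LOW <= infit <= HIGH)]

good_fit_to_model = not misfitting  # True when every item is in range
```

Items falling outside the window would normally be inspected or removed before person estimates are computed.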
C) Explanatory variables at student level
Aptitude
Aptitude refers to the degree to which a student is able to perform the next learning task. For the
purpose of this study, it consists of prior knowledge of each subject (i.e., mathematics, Greek language
and religious education) and prior attitudes towards religious education, as these emerged from
student responses to the external forms of assessment administered at the end of year 5. As
mentioned above, external forms of assessment were used to measure the
achievement of our sample when they were at the end of year 5. The Extended Logistic Model of
Rasch was used to analyze the emerging data in each subject separately and four scales, which refer to
student knowledge in mathematics, Greek language and religious education and to student attitudes
towards religious education at the end of year 5 were created. The psychometric properties of these
scales were satisfactory (see Kyriakides & Creemers, 2006). Thus, for each student four different
scores for his/her achievement at the end of year 5 were generated, by calculating the relevant Rasch
person estimate in each scale and these were treated as measures of prior knowledge for each of our
dependent variable.
Student Background Factors
Information was collected on two student background factors: sex (0=boys, 1=girls), and socio-
economic status (SES). Five SES variables were available: father's and mother's education level (i.e.,
graduate of a primary school, graduate of a secondary school, or graduate of a college/university), the
social status of the father's job, the social status of the mother's job, and the economic situation of the
family. Following the classification of occupations used by the Ministry of Finance, it was possible to
classify parents' occupations into three groups of relatively similar size: occupations held by the
working class (33%), the middle class (37%) and the upper-middle class (30%). Representative
parental occupations for the working class are farmer, truck driver, and machine operator in a factory;
for the middle class, police officer, teacher, and bank officer; and for the upper-middle class, doctor,
lawyer, and business executive. Relevant information for each child was taken from the school
records. Standardized values of the above five variables were then calculated and combined into the
SES indicator.
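The construction of the SES indicator (standardize each of the five variables, then combine them) can be sketched as follows. The data are toy values, and averaging the standardized scores is an assumption for illustration, since the paper does not state the exact combination rule:

```python
import statistics

def zscores(values):
    """Standardize a variable: mean 0, (population) standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def ses_indicator(variables):
    """Combine several SES variables (one list per variable, one entry per
    student) into a single index: here, the mean of the standardized values."""
    standardized = [zscores(v) for v in variables]
    n = len(variables[0])
    return [statistics.mean(col[i] for col in standardized) for i in range(n)]

# Toy data: three students measured on two of the five SES variables.
ses = ses_indicator([[1, 2, 3], [2, 4, 6]])
```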
D) Explanatory variables at school level: the construct validity of the measurement framework
The explanatory variables of the second phase of this longitudinal study, which refer to the four
school level factors of the dynamic model, were measured by asking all the teachers of the school
sample to complete a questionnaire. The questionnaire was designed in such a way that information
about the five dimensions of the four school-level overarching factors of the dynamic model could be
collected. A Likert scale was used to collect data on teachers’ perceptions of the school level factors.
Of the 364 teachers approached, 313 responded, a response rate of 86%. A chi-square test did not
reveal any statistically significant difference between the distribution of the teacher sample across the
50 schools and the corresponding distribution of the whole population of teachers in these schools
(X2=57.12, d.f.=49, p=.38). It can therefore be claimed that our sample is representative of the whole
population in terms of how teachers are distributed across these 50 schools. Moreover, the proportion
of missing responses to each questionnaire item was very small (i.e., less than 5%).
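The representativeness check reported above is a chi-square goodness-of-fit test comparing the sample's distribution across schools with the population's. A minimal stdlib sketch with fabricated counts for six schools (the study's actual test gave X2=57.12 with 49 d.f. over 50 schools):

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square statistic for observed vs expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Toy data: responding teachers per school vs all teachers per school.
observed = [6, 7, 5, 8, 6, 10]
population = [7, 7, 6, 8, 6, 8]

total_obs = sum(observed)
expected = [p * total_obs / sum(population) for p in population]
stat = chi_square_stat(observed, expected)

# Critical chi-square for 5 d.f. at alpha = .05 is about 11.07; a statistic
# below it gives no evidence that the two distributions differ.
CRITICAL_05_DF5 = 11.07
representative = stat < CRITICAL_05_DF5
```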
Since it is expected that teachers within a school view the policy of their school and the
evaluation mechanisms of their school similarly but differently from teachers in other schools, a
Generalisability study was initially conducted. It was found that for 102 out of the 110 questionnaire
items the object of measurement was the school. It is important to note that 6 out of the 8 items for
which the generalisability of the data at the level of the school is questionable had very small variance
and refer to the school policy in relation to the development of positive values towards learning. Since
only 8 items were used to collect data on teacher views about this factor, it was decided to drop all the
items which refer to it. We also dropped the data emerging from the two items which were found not
to be generalisable at the level of the school and which were concerned with the focus dimension of
two other overarching factors (i.e., school policy for teaching, and evaluation of the learning
environment of the school).
Since the use of multilevel modeling techniques to investigate the relationship between
student achievement and each school level factor is largely dependent on the quality of the data
arising from the research instruments (Marcoulides & Schumacker, 1996), it was decided to examine
not only the generalisability but also the construct validity of the teacher questionnaire. Thus, using a
unified approach to test validation (AERA, APA and NCME, 1999; Messick, 1989), this study
provides construct related evidence of the measures of the teacher questionnaire concerned with the
four school level factors which are briefly presented below. Thus, answers to the first question of this
study are given in this section.
School policy for teaching
For each measurement dimension, exploratory factor analysis of the items of the questionnaire which
refer to the school policy of teaching was conducted. These items were expected to belong to three
different factors concerned with school policy on: a) quantity of teaching, b) provision of learning
opportunities and c) quality of teaching. However, the first eigenvalue of each analysis was at least
three times larger than the second one. Therefore, we decided to treat all items concerned with the
same dimension of school policy on teaching as items referring to a single scale (Kline, 1994). To test
this assumption, the Extended Logistic Model of Rasch was used. It was found that each scale had
relatively good psychometric properties. Therefore, for each school, five different scores for its policy
on teaching in relation to the five measurement dimensions were generated. The scores were based on
aggregating the Rasch person (i.e., teacher) estimates of each scale at the school level. The correlation
coefficients of these five scores were statistically significant at the .001 level, but their values were
lower than 0.40. This finding supports our decision to treat each measurement dimension of school
policy on teaching as a separate construct (see also Cronbach, 1990).
Evaluation of school policy on teaching
The first order factor structure of the 15 items concerned with the evaluation of the school policy for
teaching was investigated to determine whether the five proposed measurement dimensions of the
dynamic model explain the variability in the items that are logically tied to each other, or whether
there is a single latent factor that can explain better the variability in the 15 items. Specifically, the
model hypothesized that: (a) the 15 item scores could be explained by five factors; (b) each would
have a nonzero loading on the factor (i.e., measurement dimension) it was designed to measure, and
zero loadings on all other factors; (c) the five factors would be correlated, and (d) measurement errors
would be uncorrelated. The findings of the first order factor SEM analysis generally affirmed the
assumption of the dynamic model that this school level factor could be measured in relation to each of
the five measurement dimensions. Although the scaled chi-square for the five-factor structure
(X2=164.4, d.f.=80, p<.05) was, as expected, statistically significant, the RMSEA was 0.032 and the
CFI was 0.968, both of which met the criteria for an acceptable level of fit. Kline (1998, p. 212)
argues that “even when the theory is precise about the number of factors of a first-order model, the
researcher should determine whether the fit of a simpler, one-factor model is comparable". The fit
criteria for the one-factor model (X2=1405.4, d.f.=89, p<.001; RMSEA=0.152 and CFI=0.408) fell
outside generally accepted guidelines for model fit. Thus, the five-factor structure was considered
reasonable, the analysis proceeded, and the parameter estimates were calculated. Figure 1 depicts the
five-factor model and presents the factor parameter estimates. All parameter estimates were
statistically significant (p<.001). The following observations arise from Figure 1.
________________________
Insert Figure 1 about here
________________________
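The model-comparison logic above (fit-index cut-offs plus a chi-square difference test between the nested one-factor and five-factor solutions) can be sketched with the reported statistics. The cut-off values used here are common guidelines, not the authors' stated rule:

```python
def acceptable_fit(rmsea, cfi, rmsea_cut=0.05, cfi_cut=0.95):
    """Common (not universal) cut-offs for SEM fit indices."""
    return rmsea < rmsea_cut and cfi > cfi_cut

# Fit statistics reported in the text for the two competing models.
five_factor = {"chi2": 164.4, "df": 80, "rmsea": 0.032, "cfi": 0.968}
one_factor = {"chi2": 1405.4, "df": 89, "rmsea": 0.152, "cfi": 0.408}

# Chi-square difference test between the nested models.
delta_chi2 = one_factor["chi2"] - five_factor["chi2"]   # 1241.0
delta_df = one_factor["df"] - five_factor["df"]         # 9
# The critical chi-square for 9 d.f. at p = .001 is about 27.88, so the
# five-factor model fits significantly better than the one-factor model.
```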
First, the standardized factor loadings were all positive and moderately high. Their standardized
values ranged from 0.59 to 0.72, and six of them were higher than 0.70. Second, the correlations
among the five factors were positive but very low, since all of them were smaller than 0.20. The low
values of the factor intercorrelations supported the separation of the five factors in the part of the
teacher questionnaire concerned with the measurement of the evaluation of school policy for teaching.
Therefore, validation of the five-factor structure of this part of the questionnaire supported the use of
item scores for making inferences about five different measurement dimensions of this factor rather
than treating it as a unidimensional construct. Thus, for each school, five scores of its evaluation of
the school policy for teaching were generated by aggregating at the school level the factor scores that
emerged from teacher responses to the questionnaire.
School policy on the learning environment of the school
The same approach as the one used to examine teachers’ perceptions of school policy for teaching
was used to validate the factor structure of the questionnaire items concerned with the school policy
of the learning environment of the school. Specifically, for each measurement dimension, exploratory
factor analysis of the items concerned with this overarching factor was conducted. The results of these
analyses are as follows. First, in the case of the frequency dimension, a five-factor model (explaining
54% of the total variance) was derived. The five factors consisted of items which refer to the
following aspects of the learning environment of the school: 1) student behavior outside the
classroom, 2) collaboration and interaction between teachers, 3) relation of the school with parents
and the wider community, 4) relation of the school with the employers/ministry of education (e.g.,
inspectorate, pedagogical institute, advisory bodies) and 5) provision of learning resources. This
implies that the items of the factor of the dynamic model concerned with the partnership policy were
found to belong to two separate factors whereas the other three factors are identical to those included
in the dynamic model. Empirical support to the conceptualization of SLE by the dynamic model was
also provided by the analyses concerned with the quality and stage dimensions of this overarching
factor since four factors similar to those described in the dynamic model were identified (more than
52% of the total variance was explained by each of these two four-factor models). However, the
analysis of the items concerned with the focus dimension revealed that the first eigenvalue was almost
three times as large as the second. A similar result emerged from the factor analysis of the items of
the differentiation dimension. It was therefore decided to use the Extended Logistic Model of Rasch
to find out whether the scale of each of these two dimensions can be treated as unidimensional.
Analysis of the data revealed that each scale had relatively satisfactory psychometric
properties. However, in the case of the scale concerned with the focus dimension of this factor, we
decided to repeat the analysis without taking into account the responses of two teachers (from
different schools) who did not fit the model well, since their person fit indices were very high. This
decision is justified by the fact that the psychometric properties of the new scale which emerged were
significantly improved. Thus, for each school, two scores concerned with the focus and the
differentiation of its policy for creating a learning environment were generated by calculating the
relevant mean of the Rasch person (teacher) estimates in each scale.
Evaluation of the learning environment of the school
The same procedure as the one used to analyze the data that emerged from the teacher questionnaire
about the evaluation factor concerned with the school policy for teaching was also used to analyze the data
on the evaluation of the learning environment of the school. Specifically, the first order factor
structure of the 14 items concerned with the evaluation of the school learning environment was
investigated to determine whether the five proposed measurement dimensions of the dynamic model
explain the variability in the items that are logically tied to each other (i.e., refer to the same
measurement dimension), or whether there is a single latent factor that can explain better the
variability in the 14 items. Thus, this section presents results concerned with testing various types of
CFA models that can be used to analyze data that emerged from teacher responses to the 14 items
concerned with the evaluation of the learning environment of the school. Specifically, the null model
and the five nested models are presented in Table 1. The null model (Model 1) represents the most
restrictive model, with 14 uncorrelated variables measuring the perceptions of teachers about the
evaluation of the learning environment of their school. Models 2 through 4 are first-order models, and
comparisons between the chi-squares of these models helped us evaluate the construct validity of the
part of the teacher questionnaire concerned with this school-level factor. Models 5 and 6 were higher-
order models, tested and compared to account for the lower-order baseline model.
______________________________
Insert Table 1 about here
_____________________________
The following observations arise from Table 1. First, comparing the null model with model 2, we can
observe that although the overall fit of model 2 was not acceptable, it was a significant improvement
in chi-square compared to the null model. This result can be seen as an indication of the importance of
searching for the factor structure of the data that emerged from the teacher questionnaire. Second, model 2
can be compared with models 3 and 4 to determine the best trait structure of this overarching factor
which is able to explain better the variability in the 14 questionnaire items. Model 3 represents the
five factor model which investigates whether each of the 14 items has a nonzero loading on the factor
(i.e., measurement dimension) it was designed to measure, and zero loadings on all other factors. The
five factors are also correlated, but the measurement errors of these items are uncorrelated. The chi-
square difference between models 2 and 3 showed a significant decrease in chi-square and a
significant improvement over the one-factor model. Clearly, the use of different dimensions to
measure this factor is supported, since their treatment as separate factors helps us increase the amount
of covariation explained. On the other hand, model 4 was found to fit reasonably well and was a
significant improvement over both model 2 and model 3. This model hypothesized a structure of four
factors which refer to all but the focus dimension of the evaluation of SLE (see figure 2). Moreover,
the two items concerned with the measurement of the focus dimension were found to belong to two
other dimensions (i.e., one item is correlated with the factor representing the frequency dimension
whereas the other is associated with the quality dimension). Furthermore, one of the three items
expected to measure the stage dimension was found to be correlated not only with the stage
dimension but also with the factor measuring the quality dimension.
______________________________
Insert Figure 2 about Here
_____________________________
Third, models 5 and 6 were examined to determine if a second-order structure would explain the
lower-order trait factors, as described in model 4, more parsimoniously. Specifically, model 5
hypothesized that the scores that emerged from the 14 items could be explained by the four first-order
factors (as these appear in model 4) and one second-order factor (i.e., evaluation of SLE in general).
On the other hand, model 6 was a model with one second-order trait which refers to all dimensions
but the frequency. Moreover, the second order factor is allowed to be correlated with the frequency
factor. Figure 3 illustrates the structure of this model. We also tested three additional second order
models with varying factor structures, but none of them was significantly better than either model 5 or
model 6. In comparing first and second order models, a second-order model rarely fits better than a
lower order model. Because there are fewer parameters estimated in higher order models compared to
lower order models of the same measures, the degrees of freedom increase, as does the chi-square. In
this study, for each subject the fit indices of models 5 and 6 as well as a chi-square difference test
between the two models reveal that model 6 fits better than model 5. Moreover, the fit values of
model 5 do not meet the criteria for an acceptable level of fit. This finding supports the importance of
measuring each of the five dimensions of effectiveness factors separately rather than treating them as
unidimensional. Finally, the fit of model 6 to the data that emerged from measuring teachers'
perceptions of the evaluation of the SLE could be treated as adequate. However, although model 6
could be considered more parsimonious than model 4 in explaining the interrelations among the
factors, the latter model fits the data better.
______________________________
Insert Figure 3 about Here
_____________________________
Having established the reliability and the construct validity of the data, analysis of the data was
undertaken in order to provide answers to the second question of the study. Due to the hierarchical
structure of data (i.e., students within classes, within schools), separate multilevel analyses of data
were conducted in order to examine the extent to which the variables in the dynamic model show the
expected effects upon each dependent variable (i.e., student achievement in mathematics, language
and religious education). The results of these analyses are presented in the next section.
RESULTS
Having established the construct validity of the framework used to measure the dimensions of the four
overarching school-level factors of the dynamic model, it was decided to examine the extent to which
the relevant factor scores show the expected effects upon each of the four dependent variables, and
the analyses were therefore performed separately for each variable. Specifically, the dynamic model of
EER was tested using “MLwiN” (Goldstein et al., 1998) because the observations are interdependent
and because of multi-stage sampling since students are nested within classes and classes within
schools. The dependency has an important consequence. If students’ achievement within a class or a
school has a small range, institutional factors at class or school level may have contributed to it
(Snijders & Bosker, 1999). Thus, the first step in the analysis was to determine the variance at
individual, class and school level without explanatory variables (empty model). In subsequent steps
explanatory variables at different levels were added. Explanatory variables, except grouping
variables, were entered as Z-scores with a mean of 0 and a standard deviation of 1. This is a way of
centering around the grand mean (Bryk & Raudenbush, 1992) and yields effects that are comparable.
Thus, each effect expresses how much the dependent variable increases (or decreases in case of a
negative sign) by each additional deviation on the independent variable (Snijders & Bosker, 1999).
Grouping variables were entered as dummies with one of the groups as the baseline (e.g., boys=0). The
models presented in Tables 2 and 3 were estimated without the variables that did not have a
statistically significant effect at the .05 level.
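The variable preparation described in this paragraph (Z-scoring continuous predictors around the grand mean, dummy-coding grouping variables against a baseline) can be sketched as follows, with toy data; `to_z` and `to_dummy` are hypothetical helper names:

```python
import statistics

def to_z(values):
    """Grand-mean center a continuous variable and scale it to SD = 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def to_dummy(values, baseline):
    """Code a grouping variable as 0/1 with `baseline` as the reference group."""
    return [0 if v == baseline else 1 for v in values]

prior_knowledge_z = to_z([45, 52, 60, 48, 55])  # toy raw scores
sex_dummy = to_dummy(["boy", "girl", "girl", "boy", "girl"], baseline="boy")
```

After this transformation, a coefficient of, say, 0.39 on prior knowledge means the outcome rises by 0.39 units per standard deviation of prior knowledge, which is what makes the effects comparable.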
______________________________
Insert Tables 2 and 3 About Here
_____________________________
A comparison of the empty models of the four outcome measures reveals that the effect of the school
and the classroom was more pronounced on achievement in mathematics and Greek language than in
religious education. Moreover, the school and the teacher (classroom) effects were found to be
higher on achievement of cognitive than of affective aims of religious education. These findings
are in line with the results of the first phase of this longitudinal study concerned with teacher and
school effects on student achievement at the end of year 5 (see Kyriakides & Creemers, 2006). It is
finally important to note that in each analysis the variance at each level reaches statistical significance
(p<.05) and this implies that MLwiN can be used to identify the explanatory variables which are
associated with achievement in each outcome of schooling (Goldstein, 2003).
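The empty (variance components) model described above partitions achievement variance across the three levels. The implied intra-class correlations can be computed as below, using hypothetical variance estimates rather than the study's:

```python
def variance_partition(student_var, class_var, school_var):
    """Proportion of total variance at each level of the empty three-level model
    (the intra-class correlations)."""
    total = student_var + class_var + school_var
    return {"student": student_var / total,
            "class": class_var / total,
            "school": school_var / total}

# Hypothetical variance estimates from an empty three-level model.
shares = variance_partition(student_var=0.62, class_var=0.18, school_var=0.10)
# shares["school"] is the proportion of achievement variance lying between schools.
```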
In model 1 the context variables at student, classroom and school levels were added to the
empty model. The following observations arise from the figures of the four columns illustrating the
results of model 1 for each analysis. First, model 1 explains approximately 50% of the total variance
of student achievement in each outcome and most of the explained variance is at the student level.
However, more than 30% of the total variance remained unexplained at the student level. Second, the
likelihood statistic (X2) shows a significant change between the empty model and model 1 (p<.001),
which justifies the selection of model 1. Third, the effects of all contextual factors at student level
(i.e., SES, prior knowledge, sex) are significant but the SES was not found to be associated with
achievement of affective aims in religious education. Moreover, gender was not found to be
consistently associated with student achievement across outcomes: girls were found to have better
results than boys in each outcome except mathematics. The results concerning gender
differences in Greek language and mathematics are in line with findings of effectiveness studies
conducted in Cyprus (Kyriakides et al., 2000; Kyriakides, 2005) as well as with the results of the first
phase of this longitudinal study (see Kyriakides & Creemers, 2006). Fourth, prior knowledge (i.e.,
aptitude) has the strongest effect in predicting student achievement at the end of year 6. Moreover,
aptitude is the only contextual variable which had a consistent effect on student achievement when it
was aggregated either at the classroom or the school level. Finally, the standard errors show that the
effect sizes of the context variables are significant and stable.
At the next step of the analysis, for each dependent variable, five different versions of model
2 were established. In each version of model 2, the scores of the school level factors which refer to the
same measurement dimension and emerged through our attempt to test the construct validity of the
teacher questionnaire were added to model 1. The fitting of these five models was tested against
model 1. The likelihood statistic (X2) reveals a significant change (p<.001) between model 1 and
almost every version of model 2. Significant changes were not identified in only two models, which
were concerned with the effect of the focus dimension of school level factors upon achievement in
religious education. This implies that variables measuring four out of the five dimensions of the
school effectiveness factors have significant effects on student achievement in all four outcomes of
schooling taken into account by this study. This approach was deliberately chosen since the
dimensions of the same factor are interrelated. Therefore, adding all dimensions into a single model
makes it difficult to identify which variables have effects on student achievement. Specifically,
some variables may correlate with achievement when they are studied in isolation, but because of
multicollinearity their effects may disappear when they are studied together. It was, therefore,
considered appropriate to study the effect of each dimension of the school level factors in isolation.
The following observations arise from the figures of model 2a which refer to the impact of
the frequency dimension of the effectiveness factors on each of the four dependent variables. First, the
only factor which did not have any statistically significant effect is concerned with student behavior
outside the classroom. According to the dynamic model, student behavior outside the classroom is an
important aspect of the learning environment, but no empirical support for the impact of the frequency
dimension of this factor was found. On the other hand, the evaluation of school policy for
teaching and the school relations with parents were found to be associated with student achievement
in each of the four dependent variables. Second, although curvilinear relations were assumed to exist
between most of the frequency factors and student achievement, no such relation was identified. As far
as the figures of the models which refer to the impact of the stage dimension of the school level
factors are concerned, we can observe that the stage dimension of the two overarching factors
concerned with school evaluation is associated with each outcome measure, whereas the stage
dimension of only one factor (i.e., student behavior outside the classroom) does not have any
statistically significant effect on student achievement. Moreover, the effects of the stage dimension of
the two evaluation factors were found to be stronger than the effect of any other factor. The figures of
models 2c reveal that the focus dimension of the school level factors is very rarely associated with
achievement in any of the four dependent variables of this study. The only exception is the impact of
the focus dimension of policy for teaching upon mathematics achievement. Moreover, the focus
dimension of the two overarching evaluation factors was expected to have a curvilinear relation
with student achievement, but only the focus dimension of the evaluation of the school policy for
teaching was found to have a curvilinear relation with achievement in language. Furthermore, in the
case of religious education, the two models of 2c were not found to fit the data better than model 1.
The figures of model 2d refer to the impact of the quality dimension of each effectiveness factor upon
student achievement. We can observe that every quality measure of a school level factor has a
statistically significant effect upon at least one of our outcome measures. Moreover, for each outcome
measure, model 2d explains more variance than any other alternative model 2, and this reveals the
importance of using this dimension to measure the impact of effectiveness factors on student
achievement. Furthermore, almost all the effect sizes of the quality measures upon student
achievement are higher than .05. Finally, the figures of the four models of 2e
reveal that the differentiation dimension of the overarching factor concerned with the school policy
for creating a learning environment is not only consistently related to student achievement, but its
effect size is also stronger than that of any other differentiation dimension of the school level factors. On
the other hand, the differentiation dimension of the evaluation of school policy for teaching is not
associated with student achievement in any outcome measure.
At the next stage of the analysis, we attempted to identify the amount of variance which can
be explained when researchers take into account the effects of the frequency dimensions of the school
level factors and the effects of at least another dimension. For this reason, four alternative models
were created which took into account combination of frequency dimension with another dimension of
the school level factors. Each model was compared with model 2a which takes into account only the
frequency dimension. The likelihood statistics for each model justifies the inclusion of more than one
dimension of factors in the model. Table 4 illustrates the total explained variance of model 2a and of
five alternative models taking into account combinations of frequency with other dimensions of
measurement. We can observe that, for each outcome, each alternative model explains more than the
variance explained by considering only the frequency dimension. However, only two models,
concerned with the combination of the frequency and the focus dimensions, were found to explain more
variance than model 2a. Moreover, the model with a combination of the frequency with the quality
dimension of the school level factors explains more total variance than any other combination of
the frequency with each of the other three dimensions. Finally, model 3, combining all five dimensions,
explains most of the variance. This model was found to fit better than any other alternative model. It
is important to note that this model is able to explain more than 85% of the school-level variance of
student achievement in each outcome. This implies that all five dimensions should be taken
into account in order to explain as much variance as possible at the school level. However, none of
these models explains more than about 60% of the total variance. Nevertheless, this can be attributed
to the fact that only some contextual factors at the student and classroom level were taken into
account. It is therefore important to examine whether including the five dimensions of the classroom
level factors could help us explain most of the unexplained variance of model 3 for each outcome.
______________________________
Insert Table 4 About Here
_____________________________
DISCUSSION
Implications of findings for the development of the dynamic model are drawn. First,
References
Barber, M. (1996). The learning game: Arguments for an Education Revolution. London: Victor Gollancz.
Bosker, R.J. & Scheerens, J. (1994). Alternative models of school effectiveness put to test. International Journal of Educational Research, 21 (2), 159-180.
Campbell, R.J., Kyriakides, L., Muijs, R.D., & Robinson, W. (2003). Differential teacher effectiveness: towards a model for research and teacher appraisal. Oxford Review of Education, 29 (3), 347-362.
Creemers, B.P.M. (1994). The effective classroom. London: Cassell.
Creemers, B.P.M. & Kyriakides, L. (2006). Critical analysis of the current approaches to modelling educational effectiveness: The importance of establishing a dynamic model. School Effectiveness and School Improvement.
de Jong, R., Westerhof, K. J., & Kruiter, J.H. (2004). Empirical evidence of a comprehensive model of school effectiveness: a multilevel study in Mathematics in the first year of junior general education in the Netherlands. School effectiveness and school improvement, 15 (1), 3-31.
Heck, R.H. & Thomas, S. L. (2000). An introduction to multilevel modeling techniques. Mahwah, NJ: Lawrence Erlbaum Associates.
Kline, P. (1994). An Easy Guide to Factor Analysis. London: Routledge.
Kline, R.B. (1998). Principles and Practice of Structural Equation Modeling. New York: Guilford Press.
Kyriakides, L. (2005). Extending the Comprehensive Model of Educational Effectiveness by an Empirical Investigation. School Effectiveness and School Improvement, 16.
Kyriakides, L. & Campbell, R.J. (2004). School self-evaluation and school improvement: a critique of values and procedures. Studies in educational evaluation, 30 (1), 23-36.
Kyriakides, L., Campbell, R.J., & Gagatsis, A. (2000). The significance of the classroom effect in primary schools: An application of Creemers comprehensive model of educational effectiveness. School Effectiveness and School Improvement, 11 (4), 501-529.
Kyriakides, L. & Creemers, B.P.M. (2006). Testing the Dynamic Model of Educational Effectiveness: Teacher Effects on Cognitive and Affective Outcomes. Paper presented at the 87th Annual Meeting of the American Educational Research Association, San Francisco, USA.
Kyriakides, L. & Tsangaridou, N. (2004). School Effectiveness and Teacher Effectiveness in Physical Education. Paper presented at the 85th Annual Meeting of the American Educational Research Association. Chicago, USA.
Muthén, L. K., & Muthén, B. O. (1999). Mplus user's guide. Los Angeles, CA: Muthén & Muthén.
Scheerens, J. & Bosker, R. (1997). The Foundations of Educational Effectiveness. Oxford: Pergamon.
Snijders, T. & Bosker, R. (1999). Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. London: Sage.
Teddlie, C. & Reynolds, D. (2000). The International Handbook of School Effectiveness Research. London: Falmer Press.
Witziers, B., Bosker, R.J., & Kruger, M.L. (2003). Educational Leadership and Student Achievement: The Elusive Search for an Association. Educational Administration Quarterly, 39 (3), 398-425.
Yair, G. (1997). When classrooms matter: Implications of between-classroom variability for educational policy in Israel. Assessment in Education, 4 (2), 225-248.
Table 1: Goodness of fit indices for structural equation models used to test the validity of the proposed framework for measuring the evaluation of the school learning environment

SEM Models                                                      X2      d.f.  CFI   RMSEA  X2/d.f.
1) Null model                                                 2131.5    105   ----  -----   20.3
2) 1 first-order factor                                        298.7     76   .878   .13     3.93
3) 5 correlated factors                                        142.1     67   .901   .09     2.12
4) 4 correlated factors (see Figure 2)                         122.5     70   .947   .03     1.75
5) 1 second-order general factor, 4 correlated factors         286.1     71   .921   .08     4.03
6) 2 correlated second-order general factors,
   4 correlated factors (see Figure 3)                         164.9     72   .936   .05     2.29
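The X2/d.f. ratios in Table 1 are simple arithmetic on the two preceding columns. As a sanity check, the short Python sketch below recomputes them; the values are copied from the table, and the function name is my own:

```python
# Sketch: recomputing the chi-square / degrees-of-freedom ratios of Table 1.
# The (chi-square, df) pairs below are copied from the table itself.
models = {
    "1 first-order factor": (298.7, 76),
    "5 correlated factors": (142.1, 67),
    "4 correlated factors": (122.5, 70),
    "1 second-order general, 4 correlated factors": (286.1, 71),
    "2 correlated second-order general, 4 correlated factors": (164.9, 72),
}

def chi_df_ratio(chi2, df):
    """Chi-square divided by degrees of freedom; ratios below roughly 2
    are conventionally read as acceptable model fit."""
    return chi2 / df

for name, (chi2, df) in models.items():
    print(f"{name}: {chi_df_ratio(chi2, df):.2f}")
```

A ratio below 2, together with the highest CFI and lowest RMSEA in the table, is what singles out model 4 as the best-fitting solution.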
Table 2: Parameter Estimates and (Standard Errors) for the analyses of Greek language and mathematics achievement

A. Fixed part and context variables

Greek Language               Model 0     Model 1     Model 2a    Model 2b    Model 2c    Model 2d    Model 2e
Intercept                   -0.31(.08)  -0.22(.08)  -0.19(.08)  -0.20(.08)  -0.19(.08)  -0.22(.08)  -0.21(.08)
Student level
 Prior knowledge                         0.39(.05)   0.37(.05)   0.36(.05)   0.35(.05)   0.38(.05)   0.37(.05)
 Sex (boys=0, girls=1)                   0.19(.08)   0.18(.08)   0.20(.09)   0.22(.09)   0.19(.08)   0.20(.08)
 SES                                     0.30(.06)   0.28(.05)   0.27(.05)   0.23(.05)   0.29(.05)   0.27(.05)
Classroom level (context)
 Average prior knowledge                 0.12(.05)   0.10(.04)   0.09(.04)   0.11(.05)   0.09(.04)   0.10(.04)
 Average SES                             0.08(.03)   0.07(.03)   0.08(.03)   0.08(.04)   0.07(.03)   0.06(.03)
 Percentage of girls                     N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
School level (context)
 Average SES                             N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
 Average prior knowledge                 0.09(.04)   0.11(.05)   0.10(.05)   0.13(.06)   0.11(.05)   0.10(.05)
 Percentage of girls                     N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.

Mathematics                  Model 0     Model 1     Model 2a    Model 2b    Model 2c    Model 2d    Model 2e
Intercept                    0.35(.05)   0.28(.05)   0.23(.03)   0.24(.03)   0.26(.04)   0.20(.03)   0.24(.03)
Student level
 Prior knowledge                         0.45(.10)   0.40(.10)   0.42(.11)   0.42(.10)   0.40(.09)   0.38(.09)
 Sex (boys=0, girls=1)                  -0.14(.06)  -0.13(.05)  -0.12(.05)  -0.13(.06)  -0.12(.05)  -0.13(.06)
 SES                                     0.30(.12)   0.25(.09)   0.25(.09)   0.21(.08)   0.23(.09)   0.22(.10)
Classroom level (context)
 Average prior knowledge                 0.28(.10)   0.26(.09)   0.25(.10)   0.24(.10)   0.23(.09)   0.22(.09)
 Average SES                             0.12(.05)   0.13(.05)   0.10(.04)   0.09(.04)   0.11(.05)   0.10(.04)
 Percentage of girls                    -0.05(.02)  -0.05(.02)  -0.04(.02)  -0.04(.02)  -0.05(.02)  -0.05(.02)
School level (context)
 Average SES                             N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
 Average prior knowledge                 0.11(.05)   0.09(.04)   0.08(.04)   0.09(.04)   0.08(.05)   0.08(.04)
 Percentage of girls                     N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.

B. School-level factors (each dimension entered in its own alternative model)

                                                   Greek Language   Mathematics
Frequency (model 2a)
 Policy for teaching                                0.08(.03)        0.12(.03)
 Evaluation policy for teaching                     0.10(.04)        0.09(.02)
 Student behavior outside the classroom             N.S.S.           N.S.S.
 Collaboration/interaction among teachers           N.S.S.           0.04(.01)
 Relations with parents                             0.12(.05)        0.08(.03)
 Relations with the center                          N.S.S.           N.S.S.
 Provision of resources                             0.06(.02)        N.S.S.
 Evaluation of the SLE                              0.02(.01)        N.S.S.
Stage (model 2b)
 Policy for teaching                                0.03(.01)        0.04(.02)
 Evaluation policy for teaching                     0.10(.01)        0.11(.03)
 Student behavior outside the classroom             N.S.S.           N.S.S.
 Collaboration/interaction among teachers           0.04(.02)        0.06(.03)
 Partnership                                        N.S.S.           0.07(.03)
 Provision of resources                             N.S.S.           N.S.S.
 Evaluation of the SLE                              0.06(.02)        0.08(.02)
Focus (model 2c)
 Policy for teaching                                N.S.S.           0.04(.02)
 Evaluation policy for teaching                     0.06(.02)        N.S.S.
 (Evaluation policy for teaching)^2                -0.02(.01)        N.S.S.
 School learning environment                        N.S.S.           N.S.S.
Quality (model 2d)
 Policy for teaching                                0.07(.02)        0.06(.02)
 Evaluation policy for teaching                     N.S.S.           0.05(.02)
 Student behavior outside the classroom             N.S.S.           0.06(.02)
 Collaboration/interaction among teachers           N.S.S.           0.05(.02)
 Partnership                                        0.10(.03)        0.08(.02)
 Provision of learning resources                    N.S.S.           0.06(.02)
 Evaluation of the SLE                              0.06(.02)        N.S.S.
Differentiation (model 2e)
 Policy for teaching                                N.S.S.           N.S.S.
 Evaluation policy for teaching                     N.S.S.           N.S.S.
 School learning environment                        0.07(.02)        0.09(.03)
 Evaluation of the SLE                              N.S.S.           0.08(.02)

C. Variance components and significance tests

Greek Language               Model 0   Model 1   Model 2a  Model 2b  Model 2c  Model 2d  Model 2e
School                        9.0%      8.2%      4.5%      5.1%      6.7%      4.0%      4.6%
Class                        14.7%     10.3%      9.8%      9.2%     10.2%      9.7%      9.9%
Student                      76.3%     31.3%     29.3%     29.6%     30.8%     28.7%     29.5%
Explained                              50.2%     56.4%     56.1%     52.3%     57.6%     56.0%
X2                           815.6     507.2     299.3**   322.3     471.7     276.9     364.9
Reduction                              308.4     207.9     184.9      35.5     230.3     142.3
Degrees of freedom                      6         5         4         2         3         1
p-value                                .001      .001      .001      .001      .001      .001

Mathematics                  Model 0   Model 1   Model 2a  Model 2b  Model 2c  Model 2d  Model 2e
School                       11.2%      9.8%      4.3%      4.5%      5.9%      4.0%      4.4%
Class                        14.8%     10.0%      9.3%      9.6%      9.9%      9.0%      9.2%
Student                      74.0%     30.2%     29.7%     30.0%     30.0%     29.5%     30.0%
Explained                              50.0%     56.7%     55.9%     54.2%     57.5%     56.4%
X2                          1144.9     795.5     650.7     676.3     781.2     504.1     649.8
Reduction                              349.4     144.8     119.2      14.3     291.4     145.7
Degrees of freedom                      7         4         5         1         6         2
p-value                                .001      .001      .001      .001      .001      .001

* N.S.S. = no statistically significant effect at the .05 level.
** For each alternative model 2 (i.e., models 2a to 2e), the reduction is estimated in relation to the deviance of model 1.
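Footnote ** describes the arithmetic behind the "Reduction" row: each alternative model's deviance is subtracted from that of model 1 (and model 1's from model 0), giving a likelihood-ratio statistic that is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. A minimal Python sketch of that computation, using the Greek-language deviances copied from Table 2 (the function and dictionary names are my own):

```python
# Sketch: the "Reduction" row of Table 2 as likelihood-ratio (deviance)
# differences. Deviances below are the Greek-language X2 values.
deviances = {
    "model 0": 815.6,
    "model 1": 507.2,
    "model 2a": 299.3,
    "model 2b": 322.3,
    "model 2c": 471.7,
    "model 2d": 276.9,
    "model 2e": 364.9,
}

def reduction(model, baseline, table=deviances):
    """Drop in deviance when moving from `baseline` to `model`; under the
    null hypothesis this difference follows a chi-square distribution with
    df equal to the number of parameters added."""
    return round(table[baseline] - table[model], 1)

print(reduction("model 1", "model 0"))   # 308.4, as reported
print(reduction("model 2a", "model 1"))  # 207.9, as reported
```

Note that models 2a-2e are each compared against model 1, not against each other, which is why their reductions cannot be summed.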
Table 3: Parameter Estimates and (Standard Errors) for the analyses of achievement in Religious Education (cognitive and affective outcomes)

A. Fixed part and context variables

Religious Education (cognitive aims)   Model 0     Model 1     Model 2a    Model 2b    Model 2c    Model 2d    Model 2e
Intercept                             -0.59(.11)  -0.43(.09)  -0.41(.08)  -0.40(.08)  -0.43(.09)  -0.34(.08)  -0.40(.08)
Student level
 Prior knowledge                                   0.41(.05)   0.39(.05)   0.38(.05)   0.41(.05)   0.42(.05)   0.40(.05)
 Sex (boys=0, girls=1)                             0.13(.06)   0.12(.05)   0.10(.04)   0.13(.06)   0.11(.04)   0.10(.05)
 SES                                               0.12(.05)   0.10(.05)   0.09(.04)   0.12(.05)   0.10(.05)   0.08(.04)
Classroom level (context)
 Average prior knowledge                           0.15(.06)   0.14(.06)   0.13(.06)   0.15(.06)   0.12(.05)   0.13(.06)
 Average SES                                       0.09(.04)   0.08(.04)   0.09(.04)   0.09(.04)   0.07(.03)   0.06(.03)
 Percentage of girls                               N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
School level (context)
 Average SES                                       N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
 Average prior knowledge                           0.13(.05)   0.13(.05)   0.12(.05)   0.13(.05)   0.12(.05)   0.13(.05)
 Percentage of girls                               N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.

Religious Education (affective aims)   Model 0     Model 1     Model 2a    Model 2b    Model 2c    Model 2d    Model 2e
Intercept                              0.41(.08)   0.40(.07)   0.30(.07)   0.31(.07)   0.40(.07)   0.30(.07)   0.34(.07)
Student level
 Prior knowledge                                   0.36(.10)   0.35(.10)   0.34(.10)   0.36(.10)   0.35(.10)   0.38(.10)
 Sex (boys=0, girls=1)                             0.16(.06)   0.15(.06)   0.15(.06)   0.16(.06)   0.17(.06)   0.15(.06)
 SES                                               N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
Classroom level (context)
 Average prior knowledge                           0.19(.08)   0.17(.07)   0.16(.07)   0.19(.08)   0.18(.07)   0.19(.18)
 Average SES                                       N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
 Percentage of girls                               0.05(.02)   0.04(.02)   0.04(.02)   0.05(.02)   0.04(.02)   0.03(.01)
School level (context)
 Average SES                                       N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.
 Average prior knowledge                           0.07(.02)   0.06(.02)   0.06(.02)   0.07(.02)   0.07(.02)   0.06(.02)
 Percentage of girls                               N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.      N.S.S.

B. School-level factors (each dimension entered in its own alternative model)

                                                   Cognitive aims   Affective aims
Frequency (model 2a)
 Policy for teaching                                0.09(.04)        N.S.S.
 Evaluation policy for teaching                     0.09(.04)        0.11(.02)
 Student behavior outside the classroom             N.S.S.           N.S.S.
 Collaboration/interaction among teachers           N.S.S.           N.S.S.
 Relations with parents                             0.12(.05)        0.09(.03)
 Relations with the center                          N.S.S.           0.04(.02)
 Provision of resources                             0.05(.02)        N.S.S.
 Evaluation of the SLE                              0.05(.02)        N.S.S.
Stage (model 2b)
 Policy for teaching                                N.S.S.           N.S.S.
 Evaluation policy for teaching                     0.09(.02)        0.10(.03)
 Student behavior outside the classroom             N.S.S.           N.S.S.
 Collaboration/interaction among teachers           0.06(.02)        N.S.S.
 Partnership                                        N.S.S.           0.08(.03)
 Provision of resources                             0.05(.02)        N.S.S.
 Evaluation of the SLE                              0.08(.02)        0.09(.03)
Focus (model 2c)
 Policy for teaching                                N.S.S.           N.S.S.
 Evaluation policy for teaching                     N.S.S.           N.S.S.
 School learning environment                        N.S.S.           N.S.S.
Quality (model 2d)
 Policy for teaching                                0.07(.02)        0.06(.02)
 Evaluation policy for teaching                     N.S.S.           0.07(.02)
 Student behavior outside the classroom             N.S.S.           0.06(.02)
 Collaboration/interaction among teachers           0.08(.03)        N.S.S.
 Partnership                                        0.10(.03)        0.06(.02)
 Provision of learning resources                    N.S.S.           N.S.S.
 Evaluation of the SLE                              0.06(.02)        N.S.S.
Differentiation (model 2e)
 Policy for teaching                                0.03(.01)        N.S.S.
 Evaluation policy for teaching                     N.S.S.           N.S.S.
 School learning environment                        0.08(.02)        0.09(.02)
 Evaluation of the SLE                              N.S.S.           0.07(.02)

C. Variance components and significance tests

Cognitive aims               Model 0   Model 1   Model 2a  Model 2b  Model 2c  Model 2d  Model 2e
School                        8.0%      7.2%      5.1%      5.0%      7.2%      4.5%      4.6%
Class                        13.7%     12.9%     12.4%     12.2%     12.9%     12.3%     12.7%
Student                      78.3%     31.2%     30.3%     29.8%     31.2%     29.2%     30.3%
Explained                              48.7%     52.2%     53.0%     48.7%     54.0%     52.4%
X2                           985.6     676.7     495.8**   487.3     676.7***  457.4     491.5
Reduction                              308.9     180.9     189.4     -----     219.3     185.2
Degrees of freedom                      6         5         4        -----      4         2
p-value                                .001      .001      .001     -----      .001      .001

Affective aims               Model 0   Model 1   Model 2a  Model 2b  Model 2c  Model 2d  Model 2e
School                        7.0%      6.9%      4.7%      4.6%      7.0%      4.2%      4.6%
Class                        10.2%      9.4%      8.8%      8.9%      9.3%      8.8%      8.5%
Student                      82.7%     32.7%     31.9%     32.3%     32.7%     31.6%     32.0%
Explained                              51.0%     54.6%     54.2%     51.0%     55.4%     54.9%
X2                          1024.3     684.9     488.9     495.7     684.9***  451.4     481.4
Reduction                              339.4     196.0     189.2     -----     233.5     203.5
Degrees of freedom                      5         3         3        -----      4         2
p-value                                .001      .001      .001     -----      .001      .001

* N.S.S. = no statistically significant effect at the .05 level.
** For each alternative model 2 (i.e., models 2a to 2e), the reduction is estimated in relation to the deviance of model 1.
*** Since none of the explanatory variables entered into this model had a statistically significant effect, all new variables were excluded from the model. As a result, the focus dimension of the school-level factors did not produce any change to model 1.
Table 4: Percentage of explained variance of student achievement for each student outcome provided by each alternative model testing the effect of the frequency dimension of the school-level factors and the effect of combinations of the frequency dimension with each of the other dimensions

Alternative Models                                          Greek     Mathe-   Cognitive   Affective
                                                            Language  matics   Rel. Educ.  Rel. Educ.
Model 2a (frequency dimension of school-level factors)      56.4%     56.7%    52.2%       54.6%
Model 2f (frequency and stage dimensions)                   58.5%     57.9%    56.7%       57.2%
Model 2g (frequency and focus dimensions)                   57.1%     57.4%    52.2%       54.6%
Model 2h (frequency and quality dimensions)                 58.9%     59.3%    57.1%       58.1%
Model 2i (frequency and differentiation dimensions)         58.3%     58.4%    56.2%       57.4%
Model 3 (all five dimensions of school-level factors)       59.7%     60.8%    58.0%       58.8%
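The "Explained" percentages reported in Tables 2-4 are the complement of the residual variance remaining at the school, class, and student levels, with all components expressed relative to the total variance of the null model. A minimal Python sketch of that arithmetic, using the Greek-language model 2a components from Table 2 (whose explained variance of 56.4% reappears in the first row of Table 4); the function name is my own:

```python
# Sketch: explained variance as the complement of the residual
# variance components (all in % of total null-model variance).
# The arguments below are the Greek-language model 2a components
# from Table 2 (school 4.5%, class 9.8%, student 29.3%).
def explained(school, cls, student):
    """Percentage of achievement variance explained by a model,
    given its residual variance components in percent."""
    return round(100.0 - (school + cls + student), 1)

print(explained(4.5, 9.8, 29.3))  # 56.4, matching Table 2 and Table 4
```

Read this way, Table 4 shows each added dimension reducing the unexplained variance relative to the frequency-only model 2a, with model 3 (all five dimensions) leaving the least unexplained.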