Assessing Higher Education Teachers through Peer Assistance and Review


The International Journal of Educational and Psychological Assessment, January 2012, Vol. 9(2), pp. 104-120. © 2012 Time Taylor Academic Journals. ISSN 2094-0734

Carlo Magno

    De La Salle University, Manila

Abstract

The present study advances the practice of assessing teacher performance by constructing a rubric systematically anchored on an amalgamated professional practice and learner-centered framework (see Magno & Sembrano, 2009). The validity and reliability of the rubric were determined using both classical test theory and item response theory, with implications for a new way of looking at the function of teacher performance assessment results for higher education institutions. The rubric, used by fellow teachers, is called the Peer Assistance and Review Form (PARF). The items reflect learner-centered practices within four domains anchored on the principles of Danielson's Components of Professional Practice: planning and preparation, classroom environment, instruction, and professional responsibility. The rubric was pilot tested with 183 higher education faculty. The participants were observed in their classes by two raters. Concordance of the two raters was established across the four domains (r = .47, p < .05).


… 2004; Kerchner & Koppich, 1993; Bruce & Ross, 2008). Peer evaluations in teaching are described as "involving teachers in the summative [also formative] evaluation of other teachers" (Goldstein, 2004, p. 397). It was further described by Graves, Sulewski, Dye, Deveans, Agras, and Pearson (2009, p. 186) as follows: evaluating one's peers allows the assessment of one's teaching by another person who has similar experience and goals. A more explicit definition of peer evaluation was provided by Bruce and Ross (2008, p. 350), who described it as:

a structured approach for building a community in which pairs of teachers of similar experience and competence observe each other teach, establish improvement goals, develop strategies to implement goals, observe one another during the revised teaching, and provide specific feedback.

The purposes of rating teachers, such as hiring, clinical supervision, and modeling, are best facilitated using peer evaluations. Teachers' performance from peer reviews should be conceptualized with the aim of helping teachers improve their teaching rather than solely pointing out their mistakes (Oakland & Hambleton, 2006; Stiggins, 2008). Peer review is described as a constructive process in which the peer aims to assist a less experienced teacher in improving instruction, with a focus on student-teacher interaction. Blackmore (2005) reiterated this constructive idea of peer review, in which the aim of assessing teachers is to bring about change and improvement in the practice of teaching.

Goldstein (2003) indicates that there is a need for extensive research in the area of peer assessment of teacher performance, especially with regard to implementation issues. The present study constructed an instrument that serves the purpose of peer assistance and review for higher education faculty. The instrument is administered by faculty peers, who provide feedback to fellow faculty in higher education.

Teachers' View of Peer Review

Peer review of teachers' performance is defined and described with several intentions, but the teachers who are constantly observed form their own views. These views are described in studies as thoughts and perceptions created by teachers as part of the process. Views have also been quantitatively assessed using attitude scales reflecting components such as general attitudes and domain-specific attitudes (Wen, Tsai, & Chang, 2006).

Teachers' views about the assessment of fellow teachers were shown in the study of Kell and Annetts (2009), who invited teaching staff to verbalize their perceptions about the Peer Review of Teaching (PRT) and clarify issues. The teaching staff were asked to provide their personal reflections on the PRT. They found that giving the teaching staff ownership of the PRT made them autonomous and developed flexibility in the process. In terms of the rationale and purpose of the PRT, established staff saw it as formative and useful for personal and professional development, while newer staff viewed it as summative and audit-like. Comments on the ethics of the PRT cited lack of time and potential bias in the review as reasons staff did not want to participate. The affective issues were complaints about pulling of rank and undercurrents of power gains. On the other hand, the study by Atwood, Taylor, and Hutchings (2000) on a peer review of teaching program for science professors identified the barriers to peer review practice: (1) fear, (2) uncertainty about what should be reviewed, and (3) uncertainty about how the process is reviewed.

A more positive approach to studying peer review was taken by Carter (2008), who presented useful ways for peer reviewers to enrich the peer review process. The tips are meant to make the review as pleasant as possible: (1) understand alternative views of teaching and learning; (2) prepare seriously for the pre-visit interview; (3) listen carefully to what the students say in the pre-visit interview; and (4) watch the students, not just the teacher. Carter's (2008) views provide alternative ways of implementing the peer review process that focus more on its constructive aspect.

Milanowski (2006) explained that peer review can become more constructive when peers discuss performance problems and suggestions: without the responsibility for making an administrative evaluation, evaluators are able to provide more assistance toward improving performance. The review is constructive when its function is split into administrative and developmental roles. In the split-role arrangement, developmental evaluation and feedback are provided by a peer mentor, while administrative evaluation is handled by managers and peer evaluators; in a combined-role group, developmental evaluation, feedback, and administrative evaluation are all provided by a peer. The views of the teachers in the study showed that ratees in the split-role group were slightly less open to discussions of problems or weaknesses than those in the combined-role group. The interviews, however, showed that a larger proportion of those in the split-role group reported being comfortable discussing their problems or weaknesses than those in the combined group, although the difference was small.

The study by Keig (2000) determined faculty perceptions of several factors that might detract from or enhance their likelihood of taking part in formative peer review of teaching, as well as faculty perceptions of how peer assessment might benefit the faculty, colleagues, students, and the institution. He found that the more willing faculty are to participate in peer review, the less likely they are to detract from it. This indicates that faculty who engage in peer review have good intentions toward their fellow faculty.

Effects of Peer Reviews of Teaching

Different studies have shown that when peer reviews are intended as a positive and constructive approach, they can be beneficial for their intended outcomes (Bruce & Ross, 2009; Reid, 2008; Bernstein & Bass, 2000; Blackmore, 2005; Yon, Burnap, & Kohut, 2002; Kumrow & Dahlem, 2002). For example, an anonymous writer (2006) reported that when peer assistance and review was implemented statewide in Canada, it reinforced the value of teaching as a highly skilled vocation, helped teachers become more reflective about their teaching, and increased student learning as reflected in increased SAT scores. Bruce and Ross (2008) found that peer reviews increased teachers' efficacy. Moreover, Reid (2008) found that teachers and peers saw opportunities for developing relationships. The implementation by Kumrow and Dahlem (2002) reported that the number and quality of classroom observations increased markedly.

The Present Study

The empirical studies on peer assistance and review are still not as rich as those on teachers' performance based on students' perspectives. The majority of the literature about peer assistance consists of articles or reviews explaining the process and how it can be implemented. The few completed studies report improvements in practice (Bruce & Ross, 2008; Kumrow & Dahlem, 2002), highlight teaching practices (Bernstein & Bass, 2000), develop a framework for teaching (Blackmore, 2005), and support the autonomy of the teacher (Yon, Burnap, & Kohut, 2002). These benefits necessitate the proper implementation of peer assistance and review in a higher education setting.

The present study constructed a rubric called the Peer Assistance and Review Form (PARF) that is applicable in Philippine higher education institutions and purports to yield the same benefits mentioned in the reviews. The rubric's validity and reliability were established using concordance among raters, convergence, item fit through the Rasch partial credit model, and confirmatory factor analysis. The items in the rubric are anchored on the learner-centered principles and Danielson's Components of Professional Practice. The learner-centered principles are perspectives bearing on the teacher's ability to facilitate learners in their learning, the learning in the programs, and other processes that involve the learner (Magno & Sembrano, 2007; McCombs, 1997). Danielson's Components of Professional Practice, on the other hand, identify aspects of teachers' responsibilities, documented through empirical studies and theoretical research, that promote improved student learning (Danielson, 1996). The framework is divided into 22 components clustered into four domains of teaching (planning and preparation, classroom environment, instruction, and professional responsibility). The theoretical combination of the learner-centered principles and the components of professional practice in one framework was discussed by Magno and Sembrano (2009, p. 168). The amalgamation was further described as a combination of aspects of the teaching and learning process, and it is representative in the assessment of the teaching and learning process in higher education.

Method

Participants

The participants in the study were 183 randomly selected teachers in a higher education institution in Manila, Philippines. These teachers had finished their master's or doctorate degrees, and some were still in progress. They were teaching in five major areas: multidisciplinary studies; management and information technology; hotel, restaurant, and institutional management; design and arts; and deaf studies. A proportion of faculty was randomly selected from each school to serve as ratees.

Instrument

The criteria used in the Peer Assistance and Review Form (PARF) were based on the four domains and the underlying components of Danielson's Components of Professional Practice. The descriptions for each criterion and the four gradations of responses are also framed within the learner-centered principles. The gradations of responses for each criterion were established based on the descriptions of each domain and its components. The descriptions were confirmed and reviewed by higher education teachers and administrators through a focus group discussion (FGD). The faculty members invited as reviewers arrived at a consensus on the rating categories according to their suitability to ideal teaching and the facilitation of learning in higher education. The FGD was facilitated by having the participants determine whether the descriptors provided in the rubric are applicable to them, relevant to their teaching, phrased appropriately, produce consistent meanings for different users, and have a wide variety of uses.

The revised rubric was distributed to all teaching faculty, who judged whether the items are relevant to their teaching.

A copy of the revised version of the PARF was given to experts in the field of performance assessment, specifically teacher performance assessment. The reviewers were given one week to accomplish the task. The definitions of the components and the purpose of the PARF were also provided to guide the reviewers in judging whether the criteria are relevant. After receiving the forms with comments, the instrument was revised once again.

The pretested instrument was composed of 88 items across the four domains: planning and preparation (25 items), classroom environment (21 items), instruction (22 items), and professional responsibility (20 items). Each item is rated with an analytic rubric on a four-point scale (4 = exemplary, 3 = successful, 2 = limited, 1 = poor).
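This structure can be made concrete with a small sketch (the variable names below are hypothetical; the item counts and category labels come from the paper):

```python
# Domain layout of the pretested PARF: 88 items across four domains.
PARF_DOMAINS = {
    "planning_and_preparation": 25,
    "classroom_environment": 21,
    "instruction": 22,
    "professional_responsibility": 20,
}

# Four-point analytic rubric scale used for every item.
SCALE = {4: "exemplary", 3: "successful", 2: "limited", 1: "poor"}

assert sum(PARF_DOMAINS.values()) == 88  # matches the total reported above
```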

Procedure

Before the actual observations commenced, the selected faculty who served as raters were oriented on the process of conducting the peer assistance and review and on how to accomplish the forms. The orientation was meant to train the faculty in the purpose, importance, and specific processes involved in the peer assistance and review, and it was conducted before the start of the term. After the training, each ratee was informed of the schedule on which they would be observed and rated. Each ratee was provided with a copy of the PARF in advance to prepare them for the actual observation. The faculty members serving as ratees were informed that the purpose of the observation was simply to test the instrument; it would have no impact on administrative evaluation or salary. The observations took place within class periods throughout the term. The raters visited and communicated with the ratee several times to complete evidence for the scale. These visits and meetings were conducted outside the classroom. The ratee was requested to provide a syllabus and other pertinent documents during the observation period for the raters' reference. A detailed implementation guide for the observation was provided to the ratees and raters.

During the observation period, the ratee was requested to refrain from giving exams, writing activities, group work, reporting, and other activities that would consume the entire period. This ensured that there would be teaching samples to observe and rate.

In the observation period, there were two raters for each faculty member: a primary and a secondary rater. This procedure was done to establish the concordance of the ratings. If there was no common time among the raters, the observations could take place in different periods. Each rater observed the same teacher in the same class.

The data from the pretest were encoded and analyzed for reliability and validity. Acceptable items were determined using the polytomous Rasch model (rating scale analysis) by assessing item fit (Andrich, 1978). The approach is a probabilistic measurement model for sequential integer scores such as those of a Likert scale. The WINSTEPS software was used to generate the results of the polytomous Rasch model. PARF criteria with inadequate fit were revised.

Results

Data from N = 183 teachers were used to analyze the reliability and validity of the PARF. Each ratee was rated by a primary and a secondary rater. Missing values in the data were treated using mean replacement, and descriptive statistics including means, standard deviations, skewness, and kurtosis were obtained. Reliability was estimated using Cronbach's alpha. Convergent validity of the rating scale was established by correlating the factor scores within each rater and between the two raters. The polytomous Rasch model was used to investigate the step calibration of the scale and the fit of the items. The factor structure of the theoretical model was tested using confirmatory factor analysis (CFA); parceled solutions result in less bias in estimates of structural parameters under normal distributions than solutions based on the individual items (Bandalos, 2002).
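The data-screening step described above might look like the following sketch in Python, assuming the ratings sit in a table with one row per ratee and one column per PARF item (the file name and column layout are hypothetical):

```python
import pandas as pd

# Hypothetical file holding one rater's encoded PARF ratings (rows = ratees).
ratings = pd.read_csv("parf_primary_rater.csv")

# Mean replacement: fill each item's missing values with that item's mean.
ratings = ratings.fillna(ratings.mean(numeric_only=True))

# Descriptive statistics reported in the paper: M, SD, skewness, kurtosis.
descriptives = pd.DataFrame({
    "M": ratings.mean(numeric_only=True),
    "SD": ratings.std(numeric_only=True),
    "skewness": ratings.skew(numeric_only=True),
    "kurtosis": ratings.kurt(numeric_only=True),
})
print(descriptives.round(2))
```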

The means of the ratings given by the primary and secondary raters (M = 3.40 and M = 3.41) are high given the highest possible score of 4.0. The means provided by both raters are almost the same, indicating that the ratings were very consistent. The distribution of scores tends to be negatively skewed with peaked modes. This is consistent with the high means: the majority of the ratings fell between 3 and 4, and very few were ratings of 1.00.

The overall internal consistency of the scores, using Cronbach's alpha, across both primary and secondary raters is .98, which indicates high reliability. For primary raters alone, the internal consistency is .98, and for secondary raters alone it is .97, which also indicates high reliability.
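Cronbach's alpha itself is a standard computation; a minimal sketch over a ratees-by-items table follows (the DataFrame name is hypothetical):

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """items: one row per ratee, one column per PARF item (scores 1-4)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# e.g. cronbach_alpha(ratings)  # the paper reports .98 (primary), .97 (secondary)
```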

When the ratings of the primary and secondary raters were tested for concordance, the correlation was significant (r = .47, p < .05) at a moderate level. This significant concordance also indicates the reliability of the scale across two external raters. It implies a shared understanding of the items and of the observations for the same teacher being rated.
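The concordance check can be sketched as follows. The paper does not name the exact coefficient, so Pearson's r over each ratee's mean score is assumed here, and simulated data stand in for the real ratings of the 183 teachers:

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
true_quality = rng.normal(3.4, 0.3, 183)            # latent teaching quality
primary = true_quality + rng.normal(0, 0.3, 183)    # primary rater's mean scores
secondary = true_quality + rng.normal(0, 0.3, 183)  # secondary rater's mean scores

r, p = pearsonr(primary, secondary)
print(f"concordance: r = {r:.2f}, p = {p:.4g}")     # the paper reports r = .47
```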

The means, standard deviations, and internal consistencies were broken down by the domains of the instrument, and the results were still consistent across the primary and secondary raters. The mean ratings remained high (M = 3.38 to M = 3.45). This shows that, even across domains, the ratings between the primary and secondary raters for one teacher were consistent. In the same way, Cronbach's alpha for each domain showed very high internal consistency.

The convergence of the domains was tested within the same rater and across the primary and secondary raters (Table 1).

Table 1
Convergence of the Domains for Primary and Secondary Raters

                                          Secondary Rater
Primary Rater                      1       2       3       4       M     SD    Cronbach's alpha
1. Planning and preparation       ---    .76**a  .83**a  .64**a   3.45  0.36   .94
2. Classroom environment         .85**b   ---    .82**a  .66**a   3.38  0.37   .93
3. Instruction                   .88**b  .87**b   ---    .67**a   3.38  0.37   .94
4. Professional responsibility   .76**b  .73**b  .79**b   ---     3.40  0.38   .93
M                                 3.45    3.38    3.39    3.39
SD                                0.32    0.33    0.32    0.33
Cronbach's alpha                  .93     .92     .92     .91

Note. a values are correlations among the secondary raters' ratings; b values are correlations among the primary raters' ratings.
**p < .01
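The within-rater correlations in Table 1 amount to two domain-level correlation matrices, one per rater; a sketch of how they could be computed (again with simulated stand-in data) follows:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
domains = ["planning", "environment", "instruction", "responsibility"]

# Simulated domain means for 183 ratees as seen by each rater.
base = rng.normal(3.4, 0.3, (183, 4))
primary = pd.DataFrame(base + rng.normal(0, 0.2, (183, 4)), columns=domains)
secondary = pd.DataFrame(base + rng.normal(0, 0.2, (183, 4)), columns=domains)

# Table 1's lower triangle holds the primary raters' correlations and the
# upper triangle the secondary raters'; each matrix below reproduces one side.
print(primary.corr().round(2))
print(secondary.corr().round(2))
```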


Item fit mean square (MNSQ) statistics from WINSTEPS were computed to determine whether the items under each domain have a unidimensional structure. MNSQ INFIT values between 0.8 and 1.2 are acceptable. High item MNSQ values indicate a lack of construct homogeneity with the other items in a scale, whereas low values indicate redundancy with other items (Linacre & Wright, 1998). Two Rasch analyses were conducted, one for the ratings provided by the primary raters and one for those provided by the secondary raters.
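To make the INFIT statistic concrete, the sketch below computes it for one item under Andrich's rating-scale model, assuming person measures and item parameters have already been estimated (the paper used WINSTEPS for the actual estimation; all numbers here are hypothetical):

```python
import numpy as np

def rsm_probs(theta, delta, taus):
    """Category probabilities for categories 0..m under the rating-scale
    model; taus holds thresholds tau_1..tau_m (tau_0 is fixed at 0)."""
    ks = np.arange(len(taus) + 1)
    cum_tau = np.concatenate(([0.0], np.cumsum(taus)))
    logits = ks * (theta - delta) - cum_tau
    p = np.exp(logits - logits.max())  # stabilized softmax
    return p / p.sum()

def infit_mnsq(responses, thetas, delta, taus):
    """Infit MNSQ for one item: squared residuals summed over persons,
    divided by the summed model variances; values near 1.0 indicate fit."""
    sq_resid = total_var = 0.0
    for x, theta in zip(responses, thetas):
        p = rsm_probs(theta, delta, taus)
        ks = np.arange(len(p))
        expected = (ks * p).sum()
        variance = ((ks - expected) ** 2 * p).sum()
        sq_resid += (x - expected) ** 2
        total_var += variance
    return sq_resid / total_var

# Hypothetical check: categories 0..3 map onto the 1-4 PARF scale.
thetas = np.linspace(-2, 2, 183)
responses = np.random.default_rng(3).integers(0, 4, 183)
print(infit_mnsq(responses, thetas, delta=0.0, taus=[-1.0, 0.0, 1.0]))
```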

For the primary raters, four items lacked construct homogeneity, which means that they do not measure the same construct as the other items. These items concern service to the school, participation in college-wide activities, enhancement of content-knowledge pedagogy, and service to the profession. On the other hand, six items were redundant with other items; these concern instructional materials, lesson and unit structure, quality of feedback, lesson adjustment, and student progress in learning.

For the secondary raters, eight items lacked construct homogeneity. These items concern student interaction, importance of content, student pride in work, quality of questions, engagement of families and student services, service to the school, participation in college-wide activities, and service to the profession. On the other hand, three items were redundant with other items: quality of feedback, timeliness of feedback, and lesson adjustment.

A confirmatory factor analysis (CFA) was conducted to examine the factor structure of Danielson's Components of Professional Practice as a four-factor scale. The first model tested a four-factor structure in which the indicators or manifest variables were the actual items (the ratings of the primary and secondary raters for each item were averaged). There were 25 items for planning and preparation, 21 items for classroom environment, 22 items for instruction, and 20 items for professional responsibility. The measurement model showed that the four factors are significantly related and that all 88 indicators had significant paths to their respective factors. However, the data did not fit the specified model, χ² = 8829.23, df = 3734, PGI = .57, Bentler-Bonett Normed Fit Index = .46, Bentler-Bonett Non-Normed Fit Index = .56. A second measurement model was constructed retaining the four factors with fewer constraints. The constraints were reduced by estimating fewer parameters in the model, which was done by creating three parcels as indicators for each factor. The parcels were created by combining item scores from both the primary and secondary raters. With fewer indicators per factor, the df in the second analysis was reduced to 132, which yielded larger statistical power and better model fit. The second analysis showed that all four factors are significantly correlated and that each parcel loading is significant. The fit of the model improved compared with the more constrained measurement model, χ² = 262.47, df = 132, PGI = .86, Bentler-Bonett Normed Fit Index = .89, Bentler-Bonett Non-Normed Fit Index = .87. The results of the CFA show that the four-factor structure of Danielson's Components of Professional Practice is adequate and can be used.


Discussion

A rating scale anchored on Danielson's Components of Professional Practice and learner-centered principles was constructed to rate teachers' performance. The analysis involved statistics to determine the internal consistencies, convergence, item fit, and factor structure of the scale. These analyses yielded generally favorable results regarding the validity and reliability of the scale.

For the scale's internal consistency, the obtained Cronbach's alpha was high for the ratings provided by the primary and secondary raters, both for the whole scale and for each factor. Internal consistency of the items was achieved in a similar fashion for the two raters, indicating that the scale measures the same overall construct. When the internal consistency of the items was computed for each domain, high Cronbach's alphas were also obtained. Even with reduced sets of items, as in the case of each factor, high internal consistency was still achieved.

When the primary and secondary raters were tested for concordance on the same observations, a significant coefficient was obtained (r = .47), indicating consistency of ratings across two separate raters. This consistency reflects a common understanding of the items' meaning and of the teacher being observed. It is a good indicator for future use of the instrument, considering that the actual implementation involves two or even multiple raters. These raters need to concord in their ratings of the same teacher to achieve a more consistent result for the teacher's performance. The concordance is facilitated by the items, through which each rater had a common understanding and frame of assessment for the teacher being observed. When the concordance analysis was conducted for each domain, significant relationships occurred for the four factors across the two raters. The two raters not only have a consistent understanding and reference of observation for the whole scale; the same consistency carries over to each factor.

The scale also showed convergence across the domains for each rater. The results show significant correlations among all four factors for both the primary and secondary raters, with the same pattern of correlations for the two. The pattern shows that planning and preparation, classroom environment, and instruction were highly correlated. Even though all four factors were significantly correlated, the coefficients for professional responsibility with the other factors are not as high as those among the first three. The same pattern holds for both the primary and secondary raters. This shows that professional responsibility is not seen as highly linked to teaching as the first three domains (planning and preparation, classroom environment, and instruction). The raters and teachers do not seem to consider professional responsibility to be strongly integrated with classroom performance, or with its translation into the actual teaching process, compared with the kind of integration in the first three domains.

The item analysis using the polytomous Rasch model showed that the items on student interaction, importance of content, student pride in work, quality of questions, engagement of families and student services, service to the school, participation in college-wide activities, enhancement of content-knowledge pedagogy, and service to the profession are out of bounds compared with the other items. These items did not seem applicable to the majority of the teachers. There was agreement between the primary and secondary raters on this misfit, especially for three items (participation in college-wide activities, enhancement of content-knowledge pedagogy, and service to the profession). This was consistent with the convergence of the domains. Given these three items, the raters and teachers tend to view participation in college activities, attending seminars, and publication as weakly integrated with teaching performance or with one's role in improving one's teaching (items of professional responsibility).

The item analysis using the polytomous Rasch model also showed that the items on instructional materials, lesson and unit structure, quality of feedback, timeliness of feedback, lesson adjustment, and student progress in learning are redundant with the other items. There was agreement between the primary and secondary raters on quality of feedback and lesson adjustment. These items were likely to be rated in the same way as the other items. They were carefully reviewed again by the faculty, who agreed to remove them from the pool of items.

The adequacy of the four-factor model was supported in the study. This shows that the four factors (planning and preparation, classroom environment, instruction, and professional responsibility) can be used as essential components in assessing teacher performance in higher education, and that the scale effectively measures four distinct domains. Previous studies using Danielson's components of professional practice applied them widely to teachers in elementary and high school. The present study showed the appropriateness of the domains even for higher education institutions.

The results of the present study point to three perspectives on assessing teacher performance: (1) the need to inculcate professional responsibility, such as research and continuing education programs, among higher education faculty; (2) the advantage of an instrument with multiple raters; and (3) the expectations that need to be set for higher education institutions in the Philippines.

Professional responsibility is an important part of higher education faculty's

work requirements. However, the study found that service to the profession (such as research and publications), participation in school activities, and enhancement of pedagogy were less integrated with instruction among teachers in higher education. This scenario is typical of most higher education institutions, where the teacher's work is concentrated on teaching while professional responsibility is underrated. Once a teacher is hired in a higher education institution, the teacher is defined by how much teaching load is given, and much of the expectation is placed on teaching. The teacher's entire semester is devoted to teaching, and no time is provided for professional responsibility such as engaging in research, looking for publication opportunities, and attending professional development activities. In other countries, by comparison, universities and colleges balance both teaching and research (Calma, 2009; Magno, 2009a). Colleges and universities in the Philippines provide limited opportunities and resources for faculty to conduct research and establish their own research laboratories and facilities. This is reflected in the very few Philippine universities entering, and the very low standing of those universities in, the world university rankings by Times Higher Education (Magno, 2009a). For other professional and pedagogical enhancements, the selection is very limited and the funds provided are minimal for faculty to attend conferences within and outside the country. The same scenario is true for teachers in grade school and high school: much of the reward is for teaching and not for professional responsibilities such as research, publications, and involvement in professional organizations.

The strength of the Peer Assistance and Review Form developed in the study

rests on the consistency obtained through multiple raters and on the scale calibration procedure. The raters were consistent in their interpretations, ratings, and calibration of the scales. The calibration of a scale from lowest to highest in terms of degree is one aspect of scale fidelity that most test developers neglect to report (Magno, 2009b). It can be accurately estimated using a polytomous Rasch model. A new perspective for rating scales is to establish not only their internal consistencies and factorial structure but also to determine and report their scale calibration. The category structure allows scale developers to decide on the appropriateness of the scale length and the type of scale used. Another advantage that led to the results is the refined description of the scale framed in an analytic rubric format (Reddy & Andrade, 2010). Raters can easily and elaborately distinguish among the skills presented under each global criterion. This ensures the appropriate gradation of the scale.

Lastly, there is a need to look further at the standards and competencies for higher education teachers. This issue is addressed in the study by testing specific competencies required of higher education teachers. These standards of competence need to be set to ensure that students benefit through instruction (Berdrow & Evers, 2009). Colleges and universities need to adhere to teaching and learning frameworks that will serve and carry out their mission and vision well. Very few universities in the Philippines adhere to specific teaching and learning thrusts, which has led to poor educational standards (Magno, 2009a). In the Philippine setting, the competencies of teachers in basic education are specified. The same move is not impossible for higher education because of its rich tradition of literature. The present study attempted to frame these competencies using an amalgamated framework of learner-centered practices and Danielson's components of professional practice (see Magno & Sembrano, 2009). This study pioneers the setting of specific teaching and learning frameworks for faculty in the Philippines.

The move to assess teacher performance rigorously needs to be advocated in Philippine higher education institutions to ensure the accountability of graduates and the quality of faculty. Assessing teacher performance also needs to take a developmental approach in which results are used to help teachers reach specified expectations (Bruce & Ross, 2009; Reid, 2008; Bernstein & Bass, 2000; Blackmore, 2005; Yon, Burnap, & Kohut, 2002; Kumrow & Dahlem, 2002). This move is carried out by having good instruments that facilitate these benefits. The use of assessment instruments for rating teachers should coincide with practices that also help teachers improve their teaching.

Having established a valid and reliable scale for teachers' performance means that a proper and appropriate assessment tool can be used. Rigorous assessment of teacher performance is known to occur in basic education (grade school and high school teaching) in the Philippine setting. There is very limited advocacy for sustaining teacher performance assessment and measures in Philippine higher education institutions because of the complexity of the work structure (involvement in research and professional development). However, the present study pushed these frontiers by providing an instrument with evidence of its appropriateness and by demonstrating the possibility of proper assessment practices among higher education faculty.


References

Allison-Jones, L. L., & Hirt, J. B. (2004). Comparing the teaching effectiveness of part-time and full-time clinical nurse faculty. Nursing Education Perspectives, 25, 238-242.

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.

Anonymous. (2006). Standards-based teacher evaluations. Gifted Child Today, 29, 8-9.

Atwood, C. H., Taylor, J. W., & Hutchings, P. A. (2000). Why are chemists and other scientists afraid of the peer review of teaching? Journal of Chemical Education, 77, 239-244.

Bandalos, D. L. (2002). The effects of item parceling on goodness-of-fit and parameter estimate bias in structural equation modeling. Structural Equation Modeling, 9, 78-102.

Berdrow, I., & Evers, F. T. (2009). Bases of competence: An instrument for self and institutional assessment. Assessment and Evaluation in Higher Education, 35, 419-434.

Bernstein, D., & Bass, R. (2005). The scholarship of teaching and learning. Academe, 91, 37-44.

Blackmore, J. A. (2005). A critical evaluation of peer review via teaching observation within higher education. The International Journal of Educational Management, 19, 215-320.

Bruce, C. D., & Ross, J. A. (2008). A model for increasing reform implementation and teacher efficacy: Teacher peer coaching in grades 3 and 6 mathematics. Canadian Journal of Education, 31, 346-370.

Calma, A. (2010). The context of research training in the Philippines: Some key areas and their implications. The Asia-Pacific Education Researcher, 18, 167-184.

Carter, V. K. (2008). Five steps to become a better peer reviewer. College Teaching, 56, 85-90.

Centra, J. A. (1998). The development of the Student Instructional Report II. Princeton, NJ: Educational Testing Service.

Danielson, C. (1996). Enhancing professional practice: A framework for teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Goldstein, J. (2003). Making sense of distributed leadership: The case of peer assistance and review. Educational Evaluation and Policy Analysis, 25, 397-421.

Goldstein, J. (2004). Making sense of distributed leadership: The case of peer assistance and review. Educational Evaluation and Policy Analysis, 26, 173-197.

Gosling, D. (2002). Models of peer observation of teaching. LTSN Generic Centre.

Graves, G., Sulewski, C. A., Dye, H. A., Deveans, T. M., Agras, N. M., & Pearson, J. M. (2009). How are you doing? Assessing effectiveness in teaching mathematics. Primus, 19, 174-193.


Heckert, T. M., Latier, A., Ringwald, A., & Silvey, B. (2006). Relation of course, instructor, and student characteristics to dimensions of student ratings of teaching effectiveness. College Student Journal, 40, 1-11.

Howard, F. J., Helms, M. M., & Lawrence, E. P. (1997). Development and assessment of effective teaching: An integrative model for implementation in schools of business administration. Quality Assurance in Education, 5, 159-161.

Keig, L. (2000). Formative peer review of teaching: Attitudes of faculty at liberal arts colleges toward colleague assessment. Journal of Personnel Evaluation in Education, 14, 67-87.

Kell, C., & Annetts, S. (2009). Peer review of teaching: Embedded practice or policy-holding complacency? Innovations in Education and Teaching International, 46, 61-70.

Kerchner, C. T., & Koppich, J. E. (1993). A union of professionals: Labor relations and education reform. New York: Teachers College Press.

Kumrow, D., & Dahlem, B. (2002). Is peer review an effective approach for evaluating teachers? The Clearing House, 75, 236-240.

Linacre, J. M., & Wright, B. D. (1998). A user's guide to Winsteps, Bigsteps, and Ministeps: Rasch-model computer programs. Chicago: MESA Press.

Louis, K. S., & Marks, H. M. (1998). Does professional community affect the classroom? Teachers' work and student experience in restructuring schools. American Journal of Education, 106, 532-575.

Magno, C. (2009a). A metaevaluation study on the assessment of teacher performance in an assessment center in the Philippines. The International Journal of Educational and Psychological Assessment, 3, 75-93.

Magno, C. (2009b). Demonstrating the difference between classical test theory and item response theory using derived test data. The International Journal of Educational and Psychological Assessment, 1, 1-11.

Magno, C., & Sembrano, J. (2007). The role of teacher efficacy and characteristics on teaching effectiveness, performance, and use of learner-centered practices. The Asia-Pacific Education Researcher, 16, 73-91.

Magno, C., & Sembrano, J. (2010). Integrating learner-centeredness and teaching performance in a theoretical model. International Journal of Teaching and Learning in Higher Education, 21(2), 158-170.

Marsh, H. W., & Bailey, M. (1993). Multidimensional students' evaluations of teaching effectiveness. The Journal of Higher Education, 64, 1-18.

McCombs, B. L. (1997). Self-assessment and reflection: Tools for promoting teacher changes toward learner-centered practices. NASSP Bulletin, 81, 1-14.

McLymont, E. F., & da Costa, J. L. (1998, April). Cognitive coaching: The vehicle for professional development and teacher collaboration. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.

Oakland, T., & Hambleton, R. K. (2006). International perspectives on academic assessment. New York: Springer.


Pike, C. K. (1998). A validation study of an instrument designed to measure teaching effectiveness. Journal of Social Work Education, 34, 261-272.

Reddy, Y. M., & Andrade, H. (2010). A review of rubric use in higher education. Assessment and Evaluation in Higher Education, 35, 435-448.

Reid, E. S. (2008). Mentoring peer mentors: Mentor education and support in the composition program. Composition Studies, 36, 1-31.

Ross, J. A., McDougall, D., & Hogaboam-Gray, A. (2002). Research on reform in mathematics education, 1993-2000. Alberta Journal of Educational Research, 48, 122-138.

Scriven, M. (1994). Duties of the teacher. Journal of Personnel Evaluation in Education, 8, 151-184.

Stiggins, R. (2008). Assessment for learning, the achievement gap, and truly effective schools. Portland, OR: ETS Assessment Training Institute.

Stolle, C., Goerss, B., & Watkins, M. (2005). Implementing portfolios in a teacher education program. Issues in Teacher Education, 14, 25-34.

Stringer, M., & Irwing, P. (1998). Students' evaluations of teaching effectiveness: A structural modelling approach. British Journal of Educational Psychology, 68, 409-511.

Tang, L. T. (1997). Teaching evaluation at a public institution of higher education: Factors related to the overall teaching effectiveness. Public Personnel Management, 26, 379-380.

Wen, M. L., Tsai, C., & Chang, C. (2006). Attitudes towards peer assessment: A comparison of the perspectives of pre-service and in-service teachers. Innovations in Education and Teaching International, 43, 83-93.

Wray, S. (2008). Swimming upstream: Shifting the purpose of an existing teaching portfolio requirement. Professional Educator, 32, 1-17.

Yon, M., Burnap, C., & Kohut, G. (2002). Evidence of effective teaching: Perceptions of peer reviewers. College Teaching, 50, 104-111.

Young, S., & Shaw, D. G. (1999). Profiles of effective college and university teachers. The Journal of Higher Education, 70, 670-687.

About the Author

Dr. Carlo Magno is presently a faculty member of the Counseling and Educational Psychology Department at De La Salle University, Manila. Most of his research focuses on the development of different forms of teacher assessment protocols. He is involved in several projects on the assessment of teacher competencies in the Philippines. Further correspondence can be addressed to him at the College of Education, De La Salle University, 2401 Taft Ave., Manila, Philippines; e-mail: [email protected].