Growth, Value-Added and Teacher Effectiveness Measures Philip R. Fletcher Senior Research Scientist Pearson

Growth, Value-Added and Teacher Effectiveness Measures

Philip R. Fletcher, Senior Research Scientist, Pearson

Teacher opinion

A recent international survey of teachers shows:

--That the vast majority of teachers welcome appraisal and feedback on their work.

--That it improves their job satisfaction and effectiveness as teachers.

--But too many teachers do not receive any feedback on their work at all.

--Moreover, evaluation is perceived to be an instrument of compliance rather than development.

Teacher ratings

Most school districts use pass-fail ratings under which nearly all teachers pass.

99% of teachers in districts using binary ratings are rated satisfactory.

94% of teachers in districts using multi-point rating scales are placed in the top two categories.

As Arne Duncan noted, “Ninety-nine percent of our teachers are above average.”

Teacher salaries

Teacher compensation is very predictable.

Based on the teacher’s highest degree and years of seniority.

Almost completely unrelated to variations in teacher effectiveness.

Effectiveness varies

Anecdotal and empirical evidence suggests that teachers differ dramatically in effectiveness. An effective teacher will raise student test scores by ten percentile points per year. Three consecutive years with effective teachers raise test scores by thirty percentile points.

Traditional teacher evaluation systems fail to recognize these differences. 

Teacher recognition

The need to recognize teachers who make magnificent contributions to student learning. The need to motivate people to gain expertise.

And the need to leverage expert teachers and reward them for their efforts.

To ensure that students are taught successfully, there is a need to differentiate teachers in terms of their impact on student learning.

Status, growth and effectiveness

Student achievement is the status of accumulated subject matter knowledge at one point in time—a lagging indicator.

Student learning is growth in subject matter knowledge over time—a leading indicator.

It is student learning—not student achievement—that is most relevant in defining and assessing teaching effectiveness.

Status, growth and effectiveness

Achievement provides evidence of the status of student knowledge and understanding at one point in time.

Learning is demonstrated by growth in student achievement from one point in time to another, not by status at either point in time alone.

Effectiveness is demonstrated by above-average student learning and growth.

Status, growth and effectiveness

Schematically:

Status = Achievement

Growth = Learning

Relative Growth = Effectiveness
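The schematic can be made concrete with a minimal Python sketch. The scores and function names are hypothetical, and relative growth is taken here simply as a student's gain minus the average gain of a comparison group, which is only one of several possible definitions.

```python
from statistics import mean

def status(scores):
    """Status: achievement at the most recent point in time."""
    return scores[-1]

def growth(scores):
    """Growth: change in achievement between the first and last time points."""
    return scores[-1] - scores[0]

def relative_growth(student_scores, group_scores):
    """Relative growth: a student's growth minus the group's average growth."""
    group_mean = mean(growth(s) for s in group_scores)
    return growth(student_scores) - group_mean

# Hypothetical (prior-year, current-year) scale scores for three students.
group = [(400, 430), (450, 460), (500, 550)]
for s in group:
    print(status(s), growth(s), relative_growth(s, group))
```

In this made-up data the second student has higher status than the first but lower growth, which is the distinction the schematic draws.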

Status and growth

Relative growth and effectiveness

Why growth?

Growth reflects learning, and we care about student learning.

Because the principal role of teachers is to enhance student learning.

Teacher effectiveness should be reflected in how much their students learn. 

Official incentives

Teacher Incentive Fund (TIF) grants require school districts to evaluate teachers. Race to the Top (RttT) funds require a state commitment to measuring teacher effectiveness. No Child Left Behind (NCLB) required testing of all students in reading and mathematics, leading to the development of longitudinal data systems linked to individual teachers.

Student testing

Most states have test data linked to specific schools and teachers that can be used to track student growth.

Many assessment systems are based on student test score growth over time:

Value-added models

Student growth percentiles

Both address effectiveness in terms of learning rather than status.

Value-added assessment

Value-added models are designed to assess school and teacher contributions to student growth.

A value-added assessment model is designed to demonstrate the impact of individual schools and teachers.

It is designed to distinguish between teacher effects and other outside influences.

Value-added assessment

Value-added captures the growth that classes of students achieve during a single year of schooling.

To estimate classroom effects, student data include only the students enrolled in a particular class.

Value-added assessment

The key idea is to statistically isolate the contribution of individual teachers from all other sources of influence.

Value-added analyses attempt to determine the amount of student growth that can be attributed to an individual teacher.

Value-added models quantify teacher effectiveness—the teacher’s contribution to student learning and growth.
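As a rough illustration of the idea, the sketch below predicts each student's current score from the prior score with a single least-squares line and takes a teacher's value-added to be the average residual of that teacher's students. This is a deliberately simplified assumption: operational value-added models are typically far more elaborate, and the records and function names here are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
        (xi - mx) ** 2 for xi in x
    )
    return my - b * mx, b

def value_added(records):
    """records: list of (teacher, prior_score, current_score).

    Returns the mean residual (actual minus predicted current score)
    for each teacher's students.
    """
    a, b = fit_line([r[1] for r in records], [r[2] for r in records])
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (a + b * prior))
    return {t: mean(r) for t, r in residuals.items()}

# Hypothetical data: two teachers, two students each.
records = [
    ("A", 400, 440), ("A", 500, 540),
    ("B", 400, 420), ("B", 500, 520),
]
print(value_added(records))
```

A positive value means the teacher's students scored above what their prior scores predicted, and a negative value means they scored below. The sketch does nothing to separate the teacher from other influences on the class.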

Value-added assessment

Value-added attributes causality to the teacher.

Teachers are responsible for the learning and growth of their students.

Under conditions of high-stakes accountability, the interpretation of student growth has shifted toward cause and responsibility.

Value-added assessment

Some statisticians argue that value-added models are unsuited for drawing the causal inference that a given teacher is responsible for an increase in student test scores.

“We do not think that their analyses are estimating causal quantities, except under extreme and unrealistic assumptions.” –Rubin, Stuart, and Zanutto (2004).

“…it does not appear possible to separate teacher and school effects using currently available accountability data.” –Raudenbush (2004).

Value-added assessment

Policymakers and school administrators generally express no such reservations and offer strong support for value-added assessment.

“If quality instruction is essential for student learning, then student learning should tell us something about the quality of instruction.”

Descriptive accountability

Accountability system results may have value without making causal inferences.

From this perspective, accountability results should not be used to sanction teachers and schools.

Instead, they should be used to make sound judgments about quality and needed improvements.

Results provide descriptive information and identify schools, teachers, and students that may require further attention.

Describing student growth

The Colorado Growth Model was designed to describe student growth and learning.

Quantile regression is used to model the complete distribution of student achievement over time.

The model quantifies distance = growth rate × time, probabilistically.

Growth percentiles describe the rarity of a student’s current growth, given their prior achievement.
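The Colorado Growth Model estimates these percentiles with quantile regression. The sketch below swaps in a much cruder technique to convey the idea: it ranks a student's current score among "academic peers", defined here (as an assumption) as students whose prior score falls within a fixed band. The data, the band width, and the function names are hypothetical.

```python
def academic_peers(records, prior, band=10):
    """Students whose prior score is within `band` points of `prior`.

    records: list of (prior_score, current_score) pairs.
    """
    return [r for r in records if abs(r[0] - prior) <= band]

def growth_percentile(student, records, band=10):
    """Percentile rank of the student's current score among academic peers.

    Ties count as half, so the result lies between 0 and 100. The student
    is expected to be one of the records, which keeps the peer group non-empty.
    """
    peers = academic_peers(records, student[0], band)
    below = sum(1 for _, cur in peers if cur < student[1])
    ties = sum(1 for _, cur in peers if cur == student[1])
    return 100 * (below + 0.5 * ties) / len(peers)

# Hypothetical (prior, current) scores; the last student has no close peers.
records = [(400, 410), (402, 420), (398, 430), (405, 440), (500, 520)]
for student in records:
    print(student, growth_percentile(student, records))
```

Two students with the same current score can receive very different growth percentiles if their prior scores differ, because each is compared only with students who started from a similar place.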

Student growth percentiles

Student growth percentiles

Examining growth with achievement sheds new light on school performance.

Median growth above the 50th percentile identifies best practices and sources that can offer support.

Median growth below the 50th percentile identifies greatest needs and targets that need to receive support.

A gap-closing strategy is built around a consensus of school improvement.
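A short sketch of how median growth might be summarized by school, assuming student growth percentiles have already been computed. The school names, the percentiles, and the category labels are hypothetical.

```python
from statistics import median

def median_growth(sgps_by_school):
    """Map each school to the median of its student growth percentiles."""
    return {school: median(values) for school, values in sgps_by_school.items()}

def growth_category(median_sgp, cutoff=50):
    """Label a school by whether its median growth is above or below the cutoff."""
    if median_sgp > cutoff:
        return "possible source of support"
    if median_sgp < cutoff:
        return "possible target for support"
    return "typical growth"

# Hypothetical student growth percentiles for two schools.
schools = {"School A": [35, 62, 70, 81], "School B": [20, 41, 48]}
for school, m in median_growth(schools).items():
    print(school, m, growth_category(m))
```

The median is used rather than the mean because growth percentiles are ranks, so a typical value is more meaningful than an arithmetic average.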

Student growth percentiles

Common yardstick

Most states have administrative data that can be used as a common yardstick to identify the 25% most effective teachers. Supervisor ratings and classroom observations provide no such common yardstick. Local implementation of these other measures varies in 1600 school districts nationwide. More importantly, they do not directly reflect student learning. 

Value-added and growth limitations

Value-added and growth percentiles are only available for teachers in certain subject matter areas.

Value-added and growth percentiles are available for only a small subset of teachers.

Value-added and growth percentiles are limited by the test.

Growth metrics are too narrow to provide information about how teachers can improve.

Value-added and growth shortcomings

Value-added metrics and growth percentiles for individual teachers fluctuate from year to year. They can be influenced by factors beyond the teacher’s control. They are imperfect measures with a relatively large error component.

Concern

How well does value-added predict the top 25% from year-to-year?

How well do alternative measures of teacher effectiveness predict the same top 25% from year to year?

Classroom observations?

Principals’ ratings?

Student surveys?  
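One way to put a number on these questions is to ask what share of one year's top 25% is still in the top 25% the next year. The sketch below does this for any single measure; the teacher labels and scores are hypothetical, and ties are broken arbitrarily by sort order.

```python
def top_quartile(scores):
    """Set of teachers in the top 25% by score.

    scores: dict mapping teacher -> score on one measure for one year.
    """
    k = max(1, len(scores) // 4)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def persistence(year1, year2):
    """Share of year-1 top-quartile teachers who are also top-quartile in year 2."""
    first, second = top_quartile(year1), top_quartile(year2)
    return len(first & second) / len(first)

# Hypothetical value-added scores for eight teachers in two consecutive years.
year1 = {"A": 8, "B": 7, "C": 6, "D": 5, "E": 4, "F": 3, "G": 2, "H": 1}
year2 = {"A": 9, "B": 4, "C": 8, "D": 5, "E": 3, "F": 6, "G": 2, "H": 1}
print(persistence(year1, year2))
```

The same function can be applied to classroom observations, principals' ratings, or student surveys, which allows the measures to be compared on equal terms.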

Value-added and growth compare favorably

Value-added metrics and growth percentiles compare favorably with performance measures in other fields. The correlation between SAT test scores and freshman success in college is 0.35. The correlation in batting averages between years in professional baseball is 0.36. The correlation between value-added estimates this year and next lies between 0.20 and 0.60, with most estimates correlating between 0.30 and 0.40 from year to year.
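A year-to-year figure of this kind is a Pearson correlation between the same teachers' estimates in two consecutive years. The sketch below computes it from scratch; the value-added estimates are hypothetical and are not meant to reproduce any of the figures quoted above.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical value-added estimates for six teachers in two consecutive years.
this_year = [0.30, -0.10, 0.05, -0.25, 0.15, 0.00]
next_year = [0.10, 0.05, -0.15, -0.05, 0.20, -0.10]
print(round(pearson(this_year, next_year), 2))
```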

 

Value-added and growth prognosis

We recommend the use of value-added measures and growth percentiles, principally because they are related to student learning and growth. We are mindful of their limitations and imperfections. We strive to continually improve these growth measures.

Suggestion

Use multiple measures, not only value-added metrics and growth percentiles. Alternative measures should meaningfully supplement state test score data and improve prediction. Alternative measures should be applicable to a broader range of teachers. They should provide direct information and feedback suggesting how teachers can improve their teaching.

Suggestion

Use core and non-core measures to validate the full range of teacher effectiveness for a broader range of teachers. Growth measures serve as the benchmark for the reliability of other teacher effectiveness measures. The key idea is to predict the benchmark growth measures.

Weight different measures based on their power to predict student learning and growth.
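One simple way to act on this suggestion, offered only as an assumption and not as the presenter's method, is to make each measure's weight proportional to its correlation with the benchmark growth measure. The measure names, correlations, and scores below are hypothetical.

```python
def weights_from_correlations(correlations):
    """Turn each measure's correlation with the growth benchmark into a weight.

    Negative correlations are clipped to zero, and the weights sum to 1.
    """
    clipped = {m: max(0.0, r) for m, r in correlations.items()}
    total = sum(clipped.values())
    if total == 0:
        raise ValueError("no measure is positively related to the benchmark")
    return {m: r / total for m, r in clipped.items()}

def composite(scores, weights):
    """Weighted sum of a teacher's standardized scores on each measure."""
    return sum(weights[m] * scores[m] for m in weights)

# Hypothetical correlations of each measure with benchmark growth.
correlations = {"observation": 0.3, "student_survey": 0.2, "principal_rating": 0.1}
weights = weights_from_correlations(correlations)

# Hypothetical standardized scores for one teacher.
teacher = {"observation": 1.0, "student_survey": 0.0, "principal_rating": -1.0}
print(weights, composite(teacher, weights))
```

A regression of the benchmark on all measures at once would account for overlap among the measures, which this proportional scheme ignores.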

Observational measures

What is needed is not so much an accounting of teacher time or a rating of teacher performance, but rather higher level inferences about the teacher’s ultimate purposes and effects.

Making holistic judgments requires higher levels of inference.

In short, we need a method to obtain holistic rankings reliably and validly.

Procedures must minimize rater effects and coding errors.

Classroom Interactions

A complex situation, difficult to characterize unassisted.

Teacher practice and student-teacher interactions—from the participants’ point of view.

How do students and teachers interact in a practical and personal sort of way?

How do they approach and solve problems together?

Are there different classroom profiles?

Concourse of meaning

The first challenge is to figure out what makes great teaching.

This is difficult and controversial from an educational perspective.

Yet relatively straightforward from a managerial perspective.

Find the best educators and give them an opportunity to debate and create the best pedagogy and teaching practice.

Danielson Framework

Charlotte Danielson’s Framework serves as a source of statements about teacher effectiveness.

The Framework is divided into:

--4 Domains

--23 Components

--76 Elements

--304 Items

Danielson Framework

The 4 Domains include:

--Planning and Preparation

--The Classroom Environment

--Instruction

--Professional Responsibilities

Danielson Framework

The 2 Domains that students actually see:

--The Classroom Environment

--Instruction

Danielson Framework

Scoring rubrics:

Danielson         New York State
Unsatisfactory    Ineffective
Basic             Developing
Proficient        Effective
Distinguished     Highly Effective

Danielson Framework

Items:

Rubric level: Item

--Unsatisfactory: Students not working with the teacher are disruptive to the class.

--Basic: Small groups are only partially engaged while not working directly with the teacher.

--Proficient: The students are productively engaged during small group work.

--Distinguished: Students take the initiative with their classmates to ensure that their time is used productively.

Danielson Framework

The Danielson Framework is prescriptive.

Unsatisfactory and basic performance are often just the negation of proficient and distinguished performance.

No guide to what teachers do when under stress.

Good behavior follows rules. Lacks insight from control theory and negative feedback.

“Students help set high standards.”

Danielson Framework

A good basis for a limited number of items.

These items can be readily supplemented with items from other sources, by other authors.

Use these sources and create new items to fully cover what students and teachers actually do.

Growth, value-added and teacher effectiveness measures

Student Growth

--Focus: Student

--Questions addressed: 1. How much did this student grow? 2. Is the student on track?

--Input variables: Student scores only

--Output: 1. Student achievement percentile 2. Student growth percentile

Value-Added

--Focus: Teacher/Educator

--Questions addressed: 1. How does teacher-classroom growth compare to expected growth? 2. How does teacher-classroom growth compare to that of other teacher-classrooms?

--Input variables: 1. Student scores and their characteristics 2. Teacher characteristics

--Output: 1. Teacher value-added metric 2. Teacher growth percentile

Teacher Effectiveness

--Focus: Teacher/Educator

--Questions addressed: To what extent is the teacher/educator effective?

--Input variables: 1. Multiple measures 2. Multiple methods

--Output: 1. Effectiveness scores on individual measures 2. Composite score on multiple measures 3. Predicted comparable value-added metric 4. Predicted comparable growth percentile