Teacher Quality, Quality Teaching, and Student Outcomes: Measuring the Relationships Heather C. Hill...

Teacher Quality, Quality Teaching, and Student Outcomes: Measuring the Relationships
Heather C. Hill
Deborah Ball, Hyman Bass, Merrie Blunk, Katie Brach, Charalambos Charalambous, Carolyn Dean, Séan Delaney, Imani Masters Goffney, Jennifer Lewis, Geoffrey Phelps, Laurie Sleep, Mark Thames, Deborah Zopf



Measuring teachers and teaching

Traditionally done at entry to profession (e.g., PRAXIS) and later ‘informally’ by principals

Increasing push to measure teachers and teaching for specific purposes:
- Paying bonuses to high-performing teachers
- Letting go of under-performing (pre-tenure) teachers
- Identifying specific teachers for professional development
- Identifying instructional leaders, coaches, etc.

Methods for identification

Value-added scores
- Average of teachers’ students’ performance this year differenced from same group of students’ performance last year
- In a super-fancy statistical model
- Typically used for pay-for-performance schemes
- Problems

Self-report / teacher-initiated
- Typically used for leadership positions, professional dev.
- However, poor correlation with mathematical knowledge (r = 0.25)
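The basic value-added idea on this slide (this year's average minus last year's average for the same students) can be sketched in a few lines. This is only a minimal illustration, not the "super-fancy statistical model" used in practice: the function name, the record layout, the data, and the choice to center each teacher's mean gain on the overall mean gain are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def gain_score_value_added(records):
    """Naive gain-score value-added.

    records: iterable of (teacher, last_year_score, this_year_score),
    one tuple per student. Returns {teacher: mean gain of that teacher's
    students minus the mean gain of all students}.
    Centering on the overall mean is an assumption of this sketch.
    """
    gains_by_teacher = defaultdict(list)
    for teacher, last_year, this_year in records:
        gains_by_teacher[teacher].append(this_year - last_year)

    all_gains = [g for gains in gains_by_teacher.values() for g in gains]
    overall_mean_gain = mean(all_gains)

    return {
        teacher: mean(gains) - overall_mean_gain
        for teacher, gains in gains_by_teacher.items()
    }


# Hypothetical scores, invented for illustration only.
records = [
    ("Teacher A", 50, 58),
    ("Teacher A", 60, 66),
    ("Teacher B", 55, 57),
    ("Teacher B", 45, 49),
]
scores = gain_score_value_added(records)
```

A positive score means the teacher's students gained more than the average student in the data; real value-added models add covariates and random effects on top of this comparison.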

Identification: Alternative Methods

Teacher characteristics
- NCLB’s definition of “highly qualified”
- More direct measures
- Educational production function literature

Direct measures of instruction
- CLASS (UVA): general pedagogy
- Danielson, Saphier, TFA: ditto
- But what about mathematics-specific practices?

Purpose of talk

To discuss two related efforts at measuring mathematics teachers and mathematics instruction

To highlight the potential uses of these instruments
- Research
- Policy?

Begin With Practice

Clips from two lessons on the same content – subtracting integers
- What do you notice about the instruction in each mathematics classroom?
- How would you develop a rubric for capturing differences in the instruction?
- What kind of knowledge would a teacher need to deliver this instruction?
- How would you measure that knowledge?

Bianca

Teaching material for the first time (Connected Mathematics)

Began day by solving 5 – 7 with chips
- Red chips are a negative unit; blue chips are positive
Now moved to 5 – (-7)
- Set up problem, asked students to use chips
- Given student work time

Question

What seems mathematically salient about this instruction?

What mathematical knowledge is needed to support this instruction?

Mercedes

Early in teaching career
Also working on integer subtraction with chips from CMP
Mercedes started this lesson the previous day and returns to it again

Find the missing part for this chip problem. What would be a number sentence for this problem?

Start With Rule End With

Add 5

Subtract 3

Questions

What seems salient about this instruction?
What mathematical knowledge is needed to support this instruction?

What is the same about the instruction?
- Both teachers can correctly solve the problems with chips
- Both teachers have well-controlled classrooms
- Both teachers ask students to think about the problem and try to solve it for themselves

What is different?

- Mathematical knowledge
- Instruction

Observing practice…

Led to the genesis of “mathematical knowledge for teaching”

Led to “mathematical quality of instruction”

Mathematical Knowledge for Teaching

Source: Ball, Thames & Phelps, JTE 2008

MKT Items

2001-2008: created an item bank for K-8 mathematics in specific areas (see www.sitemaker.umich.edu/lmt) (Thanks NSF)
- About 300 items

Items mainly capture subject matter knowledge side of the egg

Provide items to field to measure professional growth of teachers
- NOT for hiring, merit pay, etc.

MKT Findings

Cognitive validation, face validity, content validity
Have successfully shown growth as a result of prof’l development
Connections to student achievement - SII
- Questionnaire consisting of 30 items (scale reliability .88)
- Model: Student Terra Nova gains predicted by:
  - Student descriptors (family SES, absence rate)
  - Teacher characteristics (math methods/content, content knowledge)
- Teacher MKT significant
  - Small effect (< 1/10 standard deviation): 2 - 3 weeks of instruction
  - But student SES has about the same size effect on achievement

(Hill, Rowan, and Ball, AERJ, 2005)
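The model described on this slide, with student gains predicted by student descriptors and teacher characteristics, might be written schematically as below. This is a simplified two-level sketch for orientation only, not the specification published in the cited article; every symbol is an assumption introduced here, with student i taught by teacher j, Prep standing in for the math methods/content measure, u a teacher-level random effect, and ε student-level error.

```latex
\[
\text{Gain}_{ij} = \beta_0
  + \beta_1\,\text{SES}_{ij}
  + \beta_2\,\text{Absence}_{ij}
  + \beta_3\,\text{MKT}_{j}
  + \beta_4\,\text{Prep}_{j}
  + u_j + \varepsilon_{ij}
\]
```

In this notation, the finding that teacher MKT is significant corresponds to a nonzero estimate of β₃, and the SES comparison on the slide is a comparison of the sizes of β₃ and β₁.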

What’s the connection to mathematical quality of instruction?

History of Mathematical Quality of Instruction (MQI)

Originally designed to validate our mathematical knowledge for teaching (MKT) assessments
Initial focus: How is teachers’ mathematical knowledge visible in classroom instruction?
Transitioning to: What constitutes quality in mathematics instruction?
- Disciplinary focus
Two-year initial development cycle (2003-05)
Two versions since then

MQI: Sample Domains and Codes

Richness of the mathematics
- e.g., Presence of multiple (linked) representations, explanation, justification, multiple solution methods

Mathematical errors or imprecisions
- e.g., Computational, misstatement of mathematical ideas, lack of clarity

Responding to students
- e.g., Able to understand unusual student-generated solution methods; noting and building upon students’ mathematical contributions

Cognitive level of student work

Mode of instruction

Initial study: Elementary validation

Questions: Do higher MKT scores correspond with higher-quality mathematics in instruction?

NOT about “reform” vs. “traditional” instruction

Instead, interested in the mathematics that appears

Method

10 K-6 teachers took our MKT survey
Videotaped 9 lessons per teacher
- 3 lessons each in May, October, May

Associated post-lesson interviews, clinical interviews, general interviews

Elementary validation study

Coded tapes blind to teacher MKT score
Coded at each code
- Every 5 minutes
- Two coders per tape

Also generated an “overall” code for each lesson – low, medium, high knowledge use in teaching

Also ranked teachers prior to uncovering MKT scores

Projected Versus Actual Rankings of Teachers

Projected ranking of teachers:

Actual ranking of teachers (using MKT scores):

Correlation of .79 (p < .01)

Hill, H.C. et al., (2008) Cognition and Instruction
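Comparing a projected ranking with an actual ranking is commonly done with a rank correlation. The slide does not say which coefficient produced the .79, so the sketch below assumes Spearman's rho for untied ranks; the function name and the two rankings are hypothetical and are not the study's data.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same items.

    Uses the shortcut formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    which is valid only when neither ranking contains ties.
    """
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings of 10 teachers, invented for illustration only.
projected = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # ranking from watching lessons
actual = [2, 1, 3, 5, 4, 6, 8, 7, 10, 9]      # ranking from survey scores
rho = spearman_rho(projected, actual)
```

Identical rankings give 1.0 and fully reversed rankings give -1.0, so a value near .8 indicates that observers ordered the teachers much as the survey did.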

Correlations of Video Code Constructs to Teacher Survey Scores

Construct (Scale)           Correlation to MKT scores
Responds to students         0.65*
Errors total                -0.83*
Richness of mathematics      0.53

*significant at the .05 level

Validation Study II: Middle School

Recruited 4 schools by value-added scores
- High (2), Medium, Low

Recruited every math teacher in the school
- All but two participated for a total of 24

Data collection
- Student scores (“value-added”)
- Teacher MKT/survey
- Interviews
- Six classroom observations
  - Four required to generalize MQI; used 6 to be sure

Validation study II: Coding

Revised instrument contained many of same constructs
- Rich mathematics
- Errors
- Responding to students

Lesson-based guess at MKT for each lesson (averaged)

Overall MQI for each lesson (averaged to teacher)
- G-study reliability: 0.90

Validation Study II: Value-added scores

All district middle school teachers (n=222)
Used model with random teacher effects, no school effects
- Thus teachers are normed vis-à-vis performance of the average student in the district
- Scores analogous to ranks
- Ran additional models; similar results*
Our study teachers’ value-added scores extracted from this larger dataset

Results

                     MKT    MQI      Lesson-based MKT   Value-added score*
MKT                  1.0    0.53**   0.72**             0.41*
MQI                         1.0      0.85**             0.45*
Lesson-based MKT                     1.0                0.66**
Value-added score                                       1.0

* Significant at p<.05
** Significant at p<.01

Source: Hill, H.C., Umland, K. & Kapitula, L. (in progress). Validating Value-Added Scores: A Comparison with Characteristics of Instruction. Harvard GSE: Authors.
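One way to read the significance stars in the table above is through the usual t-test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below assumes that test, and assumes all 24 participating teachers have every measure (so 22 degrees of freedom); the slides state neither, and the function names are hypothetical. The critical values are standard two-tailed t-table entries for 22 degrees of freedom.

```python
import math

# Two-tailed critical t values for 22 degrees of freedom (n = 24 teachers).
# Assumption of this sketch: every correlation is based on all 24 teachers.
CRITICAL_T_DF22 = {0.05: 2.074, 0.01: 2.819}


def t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def stars(r, n=24):
    """Return '**' for p < .01, '*' for p < .05, '' otherwise (n = 24 only)."""
    if n != 24:
        raise ValueError("critical values are tabulated for n = 24 only")
    t = abs(t_statistic(r, n))
    if t >= CRITICAL_T_DF22[0.01]:
        return "**"
    if t >= CRITICAL_T_DF22[0.05]:
        return "*"
    return ""


# Correlations taken from the results table above.
flags = {r: stars(r) for r in (0.41, 0.45, 0.53, 0.66, 0.72, 0.85)}
```

Under these assumptions the computed flags match the stars in the table: 0.41 and 0.45 clear the .05 threshold only, while 0.53 and above also clear the .01 threshold.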

Additional Value-Added Notes

Value-added and average of:
- Connecting classroom work to math: 0.23
- Student cognitive demand: 0.20
- Errors and mathematical imprecision: -0.70**
- Richness: 0.37*

**As you add covariates to the model, most associations decrease
- Probably result of nesting of teachers within schools

Our results show a very large amount of “error” in value-added scores

Lesson-based MKT vs. VAM score

Proposed Uses of Instrument

Research
- Determine which factors associate with student outcomes
- Correlate with other instruments (PRAXIS, Danielson)
- Instrument included as part of the National Center for Teacher Effectiveness, Math Solutions DRK-12 and Gates value-added studies (3)

Practice??
- Pre-tenure reviews, rewards
- Putting best teachers in front of most at-risk kids
- Self or peer observation, professional development

Problems

Instrument still under construction and not finalized

G-study with master coders indicates we could agree more among ourselves

Training only done twice, with excellent/needs work results

Even with strong correlations, significant amount of “error”

Standards required for any non-research use are high

KEY: Not yet a teacher evaluation tool

Next

Constructing grade 4-5 student assessment to go with MKT items

Keep an eye on use and its complications

Questions?