School accountability and the black–white test score gap

Social Science Research 44 (2014) 15–31

journal homepage: www.elsevier.com/locate/ssresearch

0049-089X/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ssresearch.2013.10.008

⇑ Corresponding author at: Robert Wood Johnson Foundation Scholars in Health Policy Research, University of Michigan, United States. E-mail addresses: [email protected] (S.M. Gaddis), [email protected] (D.L. Lauen).

S. Michael Gaddis a,b,⇑, Douglas Lee Lauen c

a Robert Wood Johnson Foundation Scholars in Health Policy Research, University of Michigan, United States
b Department of Sociology, The Pennsylvania State University, United States
c Department of Public Policy, University of North Carolina at Chapel Hill, United States

Article info

Article history: Received 26 June 2012; Revised 23 August 2013; Accepted 23 October 2013; Available online 5 November 2013

Keywords: Educational inequality; Academic achievement; Achievement gap; Racial inequality; NCLB; School accountability

Abstract

Since at least the 1960s, researchers have closely examined the respective roles of families, neighborhoods, and schools in producing the black–white achievement gap. Although many researchers minimize the ability of schools to eliminate achievement gaps, the No Child Left Behind Act (NCLB) increased pressure on schools to do so by 2014. In this study, we examine the effects of NCLB’s subgroup-specific accountability pressure on changes in black–white math and reading test score gaps using a school-level panel dataset on all North Carolina public elementary and middle schools between 2001 and 2009. Using difference-in-difference models with school fixed effects, we find that accountability pressure reduces black–white achievement gaps by raising mean black achievement without harming mean white achievement. We find no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black–white math achievement gap. Thus, our findings suggest that school-based interventions have the potential to close test score gaps, but differences in school composition and resources play a significant role in the ability of schools to reduce racial inequality.

© 2013 Elsevier Inc. All rights reserved.

1. Introduction

Since the Coleman report’s assessment of the importance of family versus school characteristics in academic outcomes (Coleman et al., 1966), much of the literature in the sociology of education has questioned schools’ ability to close achievement gaps. Research showing large test score gaps at school entry (Lee and Burkam, 2002; Fryer and Levitt, 2004) and widening gaps during summer months (Alexander et al., 2007; Downey et al., 2004; Entwisle and Alexander, 1992) suggests that family inputs are more important than schooling. Nonetheless, education policy often attempts to close achievement gaps and reduce inequality by focusing on within-school processes and policies, such as school accountability, rather than between-school processes and policies, such as redistribution of resources. The No Child Left Behind Act (NCLB) was no exception, with a goal to bring all children up to a minimum level of proficiency in math and reading using school accountability pressure.

Although racial achievement gaps narrowed throughout the 1960s and 1970s, most research finds that some gaps widened or at least stagnated during the 1990s and early 2000s (Berends et al., 2008; Vanneman et al., 2009). Determining what factors contribute to achievement gaps and how they can be reduced remains a critically important issue in alleviating racial inequality, as researchers consistently show that black–white gaps in educational achievement and attainment contribute to racial inequalities in health and mortality (Hayward et al., 2000; Pampel, 2009; Williams and Collins, 1995), employment and wages (Cancio et al., 1996; Kim, 2010; Grodsky and Pager, 2001), political participation (Logan et al., 2012), and crime and incarceration (Pettit and Western, 2004; Phillips, 2002; Wakefield and Uggen, 2010), among other major life outcomes.

NCLB represents a unique opportunity to examine whether educators within schools can respond to accountability pressure to reduce black–white achievement gaps and how the response may vary between schools with different racial and SES characteristics. Although many researchers remain unconvinced that schools can alter achievement gaps, no research has examined how the focus of education policy on within-school versus between-school processes and policies might inform our understanding of schools’ success in reducing achievement gaps. Our research is the first to explore these issues by using a difference-in-difference (DD) strategy that compares changes in schools under NCLB subgroup-specific accountability pressure to changes in schools not under pressure over two time periods. We focus on math and reading test score gaps in all public elementary and middle schools in North Carolina using longitudinal data from 2001 to 2009. First, we examine the main accountability effects on mean white and black test scores and the resulting achievement gaps within schools to determine if any effects were due to changes in one or both racial groups’ scores. We then explore heterogeneous accountability effects between schools based on student racial and poverty composition. Our results suggest that accountability pressure reduces black–white achievement gaps by raising mean black achievement without harming mean white achievement. Moreover, there are no differential effects of accountability pressure based on the racial composition of schools, but schools with more affluent populations are the most successful at reducing the black–white math achievement gap.

2. Background and significance

2.1. Can schools narrow achievement gaps?

A long line of research in sociology and economics suggests that family characteristics are a major contributing factor to racial academic achievement gaps (Coleman et al., 1966; Duncan and Magnuson, 2005; Lee and Burkam, 2002; Roscigno, 2000; Yeung and Conley, 2008). This position has gained strength from findings of racial language gaps prior to schooling (Kreisman, 2012) and racial academic achievement gaps at the beginning of kindergarten (Brooks-Gunn et al., 2003; Magnuson et al., 2004; Yeung and Pfeiffer, 2009). For example, analyses of nationally representative data from the Early Childhood Longitudinal Study (ECLS-K) suggest that nearly all of the black–white achievement gap in the fall of the kindergarten year can be attributed to background characteristics (Fryer and Levitt, 2004; Lee and Burkam, 2002). The effects of family background characteristics accumulate and compound over time (Potter and Roksa, 2013). Additionally, research suggests that both racial and socioeconomic achievement gaps widen when school is not in session (certain summer and winter months), further highlighting the importance of family factors (Alexander et al., 2001, 2007; Burkam et al., 2004; Downey et al., 2004; Entwisle and Alexander, 1992, 1994; Heyns, 1978). Although these findings might indicate that schools at least maintain the status quo in terms of achievement gaps, other research suggests that black–white achievement gaps do not change during the summer but instead change during the school year (Condron, 2009; Downey et al., 2004; Fryer and Levitt, 2004).

To better understand black–white achievement gaps, scholars have explored the magnitude of the contributions of family background, between-school, and within-school characteristics to achievement gaps. James Coleman and colleagues (Coleman et al., 1966) devoted much attention to these issues in Chapter 3 of their now famous “Equality of Educational Opportunity” report. The authors report that less than 25% of the total variance in verbal achievement scores for whites and blacks is the result of between-school factors. Moreover, the most important school factors in explaining achievement are the social characteristics of the student body. Modern scholarly work has addressed these questions with the added advantage of longitudinal data and arrives at similar answers. Using multiple datasets, Berends et al. (2008) examine achievement gaps from 1972 to 2004, finding that changes in between-school factors such as school racial and SES composition increased black–white achievement gaps, while changes in within-school factors such as track placement decreased these gaps. Cook and Evans (2000) report somewhat similar results using NAEP data from 1970 to 1988 and propose that 25% of the change in black–white test score gaps during this time can be attributed to family and between-school factors, while the remaining 75% can be attributed to within-school changes.

Thus, some evidence suggests that schools can influence black–white achievement gaps, although between-school factors also contribute to inequality. The effects of between-school factors on achievement gaps play a particularly important role in this puzzle as the racial and SES compositions of school catchment areas directly influence both the level of school resources, through variations in funding from local property taxes, and the composition of the student body (Monk, 1981). High concentrations of minority and low-income students in schools may harm student achievement and increase achievement gaps (Bankston and Caldas, 1996; Hanushek and Rivkin, 2009; Roscigno, 1998; Rumberger and Palardy, 2005; although see Card and Rothstein, 2007; Lauen and Gaddis, 2013 for opposing views). These between-school factors represent a level of inequality that is not directly addressable by administrators and teachers at individual schools. Furthermore, these factors may restrict the available options to address black–white achievement gaps for administrators and teachers at some schools while expanding the available options for those in other schools.

2.2. How can accountability pressure lead schools to narrow achievement gaps?

The first declaration of the No Child Left Behind Act (NCLB) made clear its intention to reduce achievement gaps: “An Act: To close the achievement gap with accountability, flexibility, and choice, so that no child is left behind” (No Child Left Behind Act of 2001). With the introduction of this legislation, the federal government required states to set proficiency targets for math and reading with the end goal of 100% proficiency by 2014. Each state was also required to publish yearly information on each individual school’s performance in meeting Adequate Yearly Progress (AYP). The law also held schools accountable for a number of subgroups (e.g. black, Hispanic, poor, and special education, among others) depending on state demographics and variations in implementation of the law. With these subgroup rules, NCLB intended to dissuade schools from ignoring typically underserved groups of students. Failure for any subgroup was defined as failure for the entire school. The law aimed to reduce achievement gaps and improve overall student achievement through sanctions and negative designations. Schools that received federal Title I funding were at risk of sanctions if they failed AYP for at least two consecutive years. These sanctions included offering students public school choice, tutoring, and the remote possibility of takeover or reconstitution as a charter school. All schools, however, faced the “naming and shaming” of public reports of AYP failure, including subgroup data.

Theory and research on accountability systems suggest that organizations respond to accountability indicators in both intended and unintended ways (Campbell, 1979; Espeland and Sauder, 2007; Sauder and Espeland, 2009). Although a self-fulfilling prophecy may arise whereby organizations are unable to maintain or increase resources due to negative stigma from accountability indicators, organizations may also shift their focus to accountability indicators, also known as commensuration (Espeland and Sauder, 2007). Thus, it stands to reason that accountability pressure may have positive effects, negative effects, or both, depending on other institutional characteristics (Chatterji and Toffel, 2010).

Prior research on educational accountability systems points to a number of possible mechanisms for changes in student achievement, either through intended or unintended means (Nichols and Berliner, 2007; Ravitch, 2010: Chapter 8; Rothstein et al., 2008: Chapter 4). Positive or intended mechanisms of accountability on achievement include increased effort and productivity from educators (Reback et al., 2011), more standardization across classrooms (Hallett, 2010; Spillane et al., 2011), and greater alignment between curriculum frameworks and the content of what is taught in classrooms (Brown and Clift, 2010; Spillane et al., 2011). Alternatively, research finds that accountability systems may lead to reclassification or exemption of certain students to remove them from the test pool that counts toward accountability measures, something NCLB’s subgroup rules attempted to prevent (Cullen and Reback, 2006; Figlio and Getzer, 2006; Jacob, 2005; Jennings and Beveridge, 2009). Others report instances of teacher cheating to raise student test scores in response to accountability (Jacob and Levitt, 2003; Dewan, 2010), curriculum narrowing (Diamond, 2007), and differential resource distribution (Booher-Jennings, 2005; Neal and Schanzenbach, 2010). The possibility of these unintended consequences raises doubts about the true meaning of the effect of accountability pressure on achievement gaps based on high-stakes tests, although recent research finds some positive effects of accountability pressure on low-stakes tests (Dee and Jacob, 2011; Reback et al., 2011; Wong et al., 2010).

More specific than system or school-wide measures, subgroup targets incentivize schools to focus on the groups of students who need the most assistance in improving their test scores. Teachers and administrators may respond to these subgroup targets and the attached sanctions and create different learning experiences for different groups in the process. It is possible that this may alter achievement gaps through a negative effect on one group’s average test score and a positive effect on another group’s average test score, or simply a larger positive effect for one group over the other.

NCLB, like many other education policies, focused on changes within schools rather than between schools. Schools were expected to effect change with no additional resources from national, state, or local government. While social scientists are concerned with whether schools can do anything to reduce achievement gaps, policymakers still put the emphasis on within-school processes. Some qualitative evidence suggests that schools with more resources may be better positioned to address accountability pressure (Brown and Clift, 2010; Jennings, 2010). Thus, heterogeneity in the effects of accountability pressure across schools is a distinct possibility.

2.3. Evidence from the era of accountability

Despite the stated intentions of NCLB, there has been little research on the progress of accountability policy in eliminating achievement gaps. Studies focus on either overall, not subgroup-specific, accountability pressure on achievement gaps (Harris and Herrington, 2004; Lee, 2006) or NCLB subgroup-specific pressure on achievement only for members of that subgroup (Lauen and Gaddis, 2012). No study uses subgroup-specific accountability pressure measures to examine the effects on different groups and trends in achievement gaps.

Research that focuses on national trends prior to NCLB finds mixed results regarding the effects of state accountability policies. These studies suggest either minimal decreases in achievement gaps (Harris and Herrington, 2004) or no effect on achievement gaps (Hanushek and Raymond, 2005; Lee and Wong, 2004). A meta-analysis on much of this research suggests no overall significant effect on racial achievement gaps (Lee, 2008).

Additional studies that examine achievement gaps in the post-NCLB period find mixed results as well. Some results from both NAEP and state assessments show no significant changes to the trajectory of racial achievement gaps after the introduction of NCLB (Lee, 2006). However, this research only examines general time trends at the national level and does not account for differences in accountability pressure across schools. Other studies that examine individual-level data during the post-NCLB period suggest unknown directional effects for accountability pressure on black–white achievement gaps due to inconsistent results for individual members of each racial group (Dee and Jacob, 2011; Figlio et al., 2009; Krieg, 2011; Lauen and Gaddis, 2012). For instance, Dee and Jacob (2011) find that overall NCLB pressure had larger effects for blacks than whites in math but larger effects for whites than blacks in reading.

3. Research questions

Sociological research suggests that family background is a more important factor than the ability of schools to alter achievement trajectories in determining black–white test score gaps. However, education policies such as NCLB target achievement gaps through school interventions instead of through family interventions or other means prior to schooling. From a social science perspective it is important to know whether schools can reduce achievement gaps; from an education policy perspective it is important to know whether we are investing in effective policies. We contribute to these literatures on achievement gaps and educational accountability policy by investigating whether schools responded to NCLB subgroup-specific pressure to reduce achievement gaps. Addressing these issues with data at the school level accurately captures the organizational level responses to an organizational level problem.

Our first two research questions are:

(1) Did NCLB black subgroup-specific accountability pressure narrow black–white achievement gaps in math and reading?

(2) If accountability pressure narrowed achievement gaps, how did changes in the gap occur (an increase, decrease, or no change in one or both groups’ test scores)?

Additionally, the focus of education policy on within-school processes and policies rather than between-school redistribution of resources, along with existing research on the contribution of different factors to racial achievement gaps, leads us to investigate the existence of heterogeneous effects across schools. Thus, our third and final research question is:

(3) Did the effects of NCLB black subgroup-specific accountability pressure, if any, differ by school racial and poverty composition?

4. Data and method

4.1. Sample

This project uses test score and related data from multiple cohorts of students in grades 3–8 in North Carolina between 2001 and 2009. The full dataset contains about 6.2 million student-year observations, but because we address these questions from a school-level perspective, we aggregate the student-level data to obtain school-by-year data. The number of schools in our data varies from 1756 in 2001 to 1891 in 2009. The full school-level dataset contains a total of 16,309 school-year observations. Due to both our sample restrictions for stable black–white estimates and NCLB accountability rules (more below), our final analysis dataset contains 8836 school-year observations.

4.2. Dependent variables

Students in grades 3–8 in North Carolina complete math and reading tests at the end of each year. To create our dependent variables, we standardize each student’s score by grade and year, create yearly school means by race, and finally calculate the yearly gap (defined as school mean white score – school mean black score). To address the problem of unstable swings in white mean scores and gaps in schools with low numbers of white students, we only calculate mean math and reading test scores for a school-year observation when a school contains at least 20 white students.1 North Carolina is a particularly appropriate state for this analysis because its student population has considerable numbers of blacks and whites,2 its tests are vertically equated, interval scaled, and directly linked to a statewide curriculum, have been in place since the early 1990s, and were the outcome for which schools were held accountable under NCLB.
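The construction of the dependent variables described above can be sketched in a few lines of pandas. This is a minimal illustration under stated assumptions, not the authors' code: the column names (`school`, `year`, `grade`, `race`, `score`) and the function name `school_gaps` are hypothetical, and the sketch handles one subject at a time.

```python
import pandas as pd


def school_gaps(students: pd.DataFrame, min_white: int = 20) -> pd.DataFrame:
    """Return one row per school-year with race-specific means and the gap.

    Assumes hypothetical columns: school, year, grade, race ("white" or
    "black"), and score (raw test score for one subject).
    """
    df = students.copy()
    # Step 1: standardize each score within its grade-by-year cell.
    cell = df.groupby(["grade", "year"])["score"]
    df["z"] = (df["score"] - cell.transform("mean")) / cell.transform("std")
    # Step 2: school-by-year means and student counts, separately by race.
    agg = (
        df.groupby(["school", "year", "race"])["z"]
        .agg(["mean", "count"])
        .unstack("race")
    )
    out = pd.DataFrame(
        {
            "white_mean": agg[("mean", "white")],
            "black_mean": agg[("mean", "black")],
            "n_white": agg[("count", "white")],
        }
    )
    # Step 3: keep school-years with at least `min_white` white students.
    out = out[out["n_white"] >= min_white]
    # Step 4: gap defined as white mean minus black mean.
    out["gap"] = out["white_mean"] - out["black_mean"]
    return out.reset_index()
```

A positive `gap` means white students in that school-year scored higher on average, in standard deviation units of the statewide grade-by-year distribution.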

4.3. Independent variables

The NCLB variables are one-year lags of black subgroup-specific accountability pressure variables obtained by the state’s department of public instruction and used to determine Adequate Yearly Progress (AYP) status. In each post-NCLB year, the black subgroup met the target, failed the target, or was not accountable for the target. North Carolina’s minimum subgroup size was 40 students. Schools with too few black students in a given year were not rated for their black students’ achievement. Thus, we do not include those school-year observations in our estimates. Schools were accountable for the black subgroup in 10,101 of 16,309 school-year observations. The percentages of schools that were accountable for the black subgroup target in math for each year range from 60% to 66% per year and include 68–73% of the entire student body per year.3 Our binary indicator for black subgroup failure is coded “1” if the school was accountable and failed the target, coded “0” if the school was accountable and met the target, and “missing” if the school was not rated due to too few black students. In our regression estimates, we examine the effects of the prior year’s accountability rating on current year’s test score because accountability ratings were reported over the summer on the basis of spring testing. Although schools did not have an official rating of black subgroup performance prior to the implementation of NCLB, it is nonetheless possible that schools were already working to reduce achievement gaps. Thus we adjust for the propensity of schools to reduce achievement gaps prior to NCLB by using a difference-in-difference modeling strategy (more below). To do so, we derive black subgroup-specific accountability variables for the pre-NCLB period by determining for each school whether there were 40 or more black students and whether the black students’ average passing percentage in math or reading was lower than the initial 2003 NCLB targets (74.6% in math and 68.9% in reading). We create the same style of accountability ratings for schools in the pre-NCLB period: schools with fewer than 40 black students were not rated, and the remaining schools were coded “1” if accountable and failed the target and “0” if accountable and met the target.4

1 Low numbers of black students were not a problem because a school was only accountable for the black subgroup under NCLB subgroup-specific rules if it had at least 40 black students. We also report estimates using a minimum of 10 white students and estimates using a minimum of 40 white students in the appendix, but differences are minimal.

2 The underlying student population in our data is approximately 30% black and 59% white.

Table 1
Descriptive statistics.

                                              Math              Reading
                                           Mean     SD       Mean     SD
Dependent variables
  White mean standardized test score       0.244   0.370     0.254   0.331
  Black mean standardized test score      -0.478   0.236    -0.431   0.227
  Black–white standardized gap             0.723   0.310     0.684   0.292

Accountability pressure variables
  Black subgroup failed                    0.614   0.487     0.482   0.500
  PostNCLB                                 0.652   0.476     0.652   0.476
  PostNCLB × Black SG failed               0.376   0.484     0.240   0.427

School racial and poverty composition variables
  Number of black students
    Lowest quartile                        0.251   0.433     0.251   0.433
    Middle 2 quartiles                     0.499   0.500     0.499   0.500
    Highest quartile                       0.250   0.433     0.250   0.433
  Number of white students
    Lowest quartile                        0.252   0.434     0.252   0.434
    Middle 2 quartiles                     0.499   0.500     0.499   0.500
    Highest quartile                       0.249   0.432     0.249   0.432
  Percentage of poor students
    Lowest quartile                        0.250   0.433     0.250   0.433
    Middle 2 quartiles                     0.498   0.500     0.498   0.500
    Highest quartile                       0.251   0.434     0.251   0.434
  Black student poverty ratio
    Lowest quartile                        0.250   0.433     0.250   0.433
    Middle 2 quartiles                     0.498   0.500     0.498   0.500
    Highest quartile                       0.251   0.434     0.251   0.434
  White student poverty ratio
    Lowest quartile                        0.251   0.433     0.251   0.433
    Middle 2 quartiles                     0.498   0.500     0.498   0.500
    Highest quartile                       0.251   0.433     0.251   0.433

Note: N = 8836.
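The pre-NCLB coding rule for the black subgroup failure indicator in Section 4.3 can be illustrated with a short function. This is a sketch under stated assumptions: the function name and argument names are hypothetical, and, like the proxy described in the text, it ignores the safe harbor and confidence interval provisions.

```python
from typing import Optional

# Initial 2003 NCLB proficiency targets reported in the text (percent passing).
TARGETS_2003 = {"math": 74.6, "reading": 68.9}
# North Carolina's minimum subgroup size reported in the text.
MIN_SUBGROUP = 40


def black_subgroup_failed(
    n_black: int, pct_passing: float, subject: str
) -> Optional[int]:
    """Derived accountability rating for one school-year and subject.

    Returns None (not rated) when the school has fewer than 40 black
    students, 1 when the black passing percentage is below the 2003
    target, and 0 otherwise.
    """
    if n_black < MIN_SUBGROUP:
        return None  # too few black students: school is not accountable
    return 1 if pct_passing < TARGETS_2003[subject] else 0
```

For instance, a hypothetical school with 55 black students and a 70.0% math passing rate would be coded 1 (failed), while the same passing rate in reading would be coded 0 (met).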

Our other school-level independent variables of interest measure student racial and poverty composition: number of black students, number of white students, percent of poor students (those receiving free or reduced price lunch), the black student poverty ratio (number of poor blacks over number of non-poor blacks), and the white student poverty ratio.5 We divide each of these variables into quartiles and group the middle two quartiles together to aid in interpretability of our differential effects models, described below.6 Descriptive statistics for all independent and dependent variables are shown in Table 1.7

3 Our additional restriction of at least 20 white students per school and year reduces our final estimate sample size to 51–59% of the schools per year and 60–65% of the total students per year.

4 Our approach does not take into account safe harbor (which gave schools credit for a failing subgroup if the students in the subgroup had particularly strong test score growth) or the confidence interval (which gave schools credit for being within the 95% confidence interval of a target); however, our proxies are accurate. For example, our approach correctly classifies 95% of schools’ reported 2003 black subgroup accountability ratings.

5 Our choice to use counts for racial characteristics and count ratios for the poverty characteristics stems from the subgroup rules of NCLB (a minimum of 40 students in a subgroup for a school to be accountable). Alternatively, we could use percentages instead of counts, which is a slightly different metric; however, the substantive results do not differ for any of our models (results available from authors upon request).
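The three-category grouping of the composition variables (lowest quartile, middle two quartiles, highest quartile) might be built as follows. This is only a sketch: the function name is hypothetical, and whether the authors computed quartiles over the pooled sample or within each year is not stated, so pooling is an assumption here. Ranking before cutting is a convenience to avoid errors from tied values, not something taken from the text.

```python
import pandas as pd


def quartile_group(values: pd.Series) -> pd.Series:
    """Label each observation as 'lowest', 'middle', or 'highest'.

    Observations are split into four equal-sized bins, and the two
    middle bins are merged into one category.
    """
    # Rank first so that tied values do not produce duplicate bin edges.
    ranks = values.rank(method="first")
    quartile = pd.qcut(ranks, 4, labels=False)  # integer codes 0, 1, 2, 3
    return quartile.map({0: "lowest", 1: "middle", 2: "middle", 3: "highest"})
```

Dummy variables for the lowest and highest groups can then serve as the moderators, with the middle group as the reference category.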

4.4. Analytic strategy

To address our research questions, we use school panel data to estimate difference-in-difference (DD) models. This is an appropriate modeling strategy to examine the effects of policy implementation among treatment and control groups while controlling for other general period effects (Bertrand et al., 2004; Imbens and Wooldridge, 2009; Meyer, 1995) that in recent years has been used by sociologists in a variety of subfields (e.g. Cho, 2011; Rauscher, 2011; Shafer and Malhotra, 2011). The DD strategy allows the researcher to model the change in the difference between treatment and control groups from the pre-period to the post-period. In our case, a school is in the treatment group when it fails AYP for the black subgroup and in the control group when it meets AYP for the black subgroup. Using our derived AYP subgroup variables from the pre-NCLB period, we estimate a DD model to compare changes for the treatment group across periods to changes for the control group across periods. Thus, the model tests whether any change in the black–white achievement gap can be attributed to black subgroup-specific accountability pressure:

$$G_{st} = \beta_0 + \beta_1 T_{s,t-1} + \beta_2 \mathrm{Post}_t + \beta_3 \left(T_{s,t-1} \times \mathrm{Post}_t\right) + \varepsilon_{st} \tag{1}$$

In this equation we regress the standardized black–white gap (G) for school s at time t on T, a school-level, time-varying, black subgroup-specific accountability threat variable, coded 1 if school s failed the AYP black subgroup target in the prior year, and 0 otherwise; Post, an indicator coded 1 if the year is 2004–2009, and 0 if the year is 2001–2003; and the interaction of T and Post.8 The accountability effect in Eq. (1) is $\beta_3$, the additional effect of NCLB sanctions in the post-period. If NCLB sanctions are effective, we would expect a larger failed AYP effect in the post-period than in the pre-period:

$$\hat{\beta}_3 = \left(\bar{y}_{T=1,\mathrm{post}} - \bar{y}_{T=1,\mathrm{pre}}\right) - \left(\bar{y}_{T=0,\mathrm{post}} - \bar{y}_{T=0,\mathrm{pre}}\right) \tag{2}$$

If $\beta_3$ is negative, the treatment group had a larger reduction in the achievement gap over time than the control group. Including the difference over time in the control group allows us to control for a general time trend over this period, such as pre-NCLB interventions to reduce black–white achievement gaps. For instance, if the general math black–white achievement gap was on a downward trajectory in North Carolina prior to the implementation of NCLB, an identification strategy that relies on a difference-in-difference is a preferable methodological approach. In this example, a simple treatment versus control strategy might overestimate the effect of NCLB pressure and a pre- versus post-comparison might underestimate the effect.
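A small numerical sketch may help connect Eq. (2) to the interaction coefficient in Eq. (1). The code below computes the difference of differences from the four cell means and checks that it equals the OLS coefficient on the interaction term in the simple two-group, two-period case. The function names and the gap values used in the usage note are hypothetical and chosen only for illustration; they are not results from the paper, and the simple 2×2 setup abstracts from the lagged, time-varying treatment used in the actual models.

```python
import numpy as np


def did_from_means(y, treated, post):
    """Difference-in-difference from the four group-by-period cell means."""
    y, treated, post = (np.asarray(a, dtype=float) for a in (y, treated, post))

    def cell_mean(t, p):
        return y[(treated == t) & (post == p)].mean()

    change_treated = cell_mean(1, 1) - cell_mean(1, 0)
    change_control = cell_mean(0, 1) - cell_mean(0, 0)
    return change_treated - change_control


def did_from_ols(y, treated, post):
    """Coefficient on treated*post from OLS with intercept and main effects."""
    y, treated, post = (np.asarray(a, dtype=float) for a in (y, treated, post))
    X = np.column_stack([np.ones_like(y), treated, post, treated * post])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]
```

With hypothetical mean gaps of 0.80 (treated, pre), 0.70 (treated, post), 0.70 (control, pre), and 0.68 (control, post), the treated change is -0.10 and the control change is -0.02, so both functions give a difference-in-difference of about -0.08: the gap fell by 0.08 standard deviations more in the treatment group.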

We add a vector of school fixed effects, $\phi_s$, to Eq. (1) to remove between-school heterogeneity that would otherwise bias the accountability effect. This approach estimates the effect of black subgroup-specific accountability pressure on successive cohorts of students in the same school. An OLS estimate without school fixed effects could produce biased estimates because it would fail to account for exogenous unobserved differences between schools. While this approach ignores the performance of schools with static treatment statuses, we view comparing the performance of cohorts exposed to black subgroup-specific accountability pressure to cohorts not exposed to such pressure who attended the same school at different points in time as more valid than comparing the performance of students in different schools. Time-varying school-level confounders that differ in the pre- and post-periods remain threats to the validity of our findings. Including school fixed effects buys us some, but not complete, protection from all possible sources of confounding. Additionally, we control for individual year effects to account for overall trends in changes in achievement gaps:

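$$G_{st} = \beta_0 + \beta_1 T_{s,t-1} + \beta_2 \mathrm{Post}_t + \beta_3 \left(T_{s,t-1} \times \mathrm{Post}_t\right) + \beta_4 \mathrm{Year}_t + \lambda\phi_s + \varepsilon_{st} \tag{3}$$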
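One common way to estimate a model with school fixed effects is the within transformation, which subtracts each school's own mean from every variable before running OLS. The sketch below illustrates that idea under stated assumptions: the column names (`school`, `gap`, `T`, `post`) and the function name are hypothetical, the year effects in Eq. (3) are omitted for brevity, and the text does not say which estimation routine the authors used.

```python
import numpy as np
import pandas as pd


def within_did(panel: pd.DataFrame) -> float:
    """Fixed-effects estimate of the T x Post interaction coefficient.

    Assumes hypothetical columns: school, gap, T (lagged failure
    indicator), and post (post-NCLB indicator).
    """
    df = panel.copy()
    df["t_post"] = df["T"] * df["post"]
    cols = ["gap", "T", "post", "t_post"]
    # Within transformation: remove each school's mean, which absorbs
    # any time-invariant school characteristic.
    demeaned = df[cols] - df.groupby("school")[cols].transform("mean")
    X = demeaned[["T", "post", "t_post"]].to_numpy()
    y = demeaned["gap"].to_numpy()
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[2])  # coefficient on the interaction term
```

Because school means are removed, schools whose treatment status never changes contribute nothing to the estimate of the treatment coefficients, which matches the caveat in the text about schools with static treatment statuses.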

Finally, we examine whether the black subgroup-specific accountability effect differs by school racial or poverty composition by interacting the treatment effect with moderating variables (LM and HM, below) and including all two-way interactions:

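$$\begin{aligned}
G_{st} ={}& \beta_0 + \beta_1 T_{s,t-1} + \beta_2 \mathrm{Post}_t + \beta_3 LM_{st} + \beta_4 HM_{st} + \beta_5 \left(T_{s,t-1} \times LM_{st}\right) + \beta_6 \left(T_{s,t-1} \times HM_{st}\right) \\
&+ \beta_7 \left(\mathrm{Post}_t \times LM_{st}\right) + \beta_8 \left(\mathrm{Post}_t \times HM_{st}\right) + \beta_9 \left(T_{s,t-1} \times \mathrm{Post}_t\right) + \beta_{10} \left(T_{s,t-1} \times \mathrm{Post}_t \times LM_{st}\right) \\
&+ \beta_{11} \left(T_{s,t-1} \times \mathrm{Post}_t \times HM_{st}\right) + \beta_{14} \mathrm{Year}_t + \lambda\phi_s + \varepsilon_{st}
\end{aligned} \tag{4}$$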

This equation is similar to Eq. (3), but in Eq. (4) the main accountability effect ($\beta_9$) now represents the effect of failing the black subgroup for schools in the middle two quartiles of the moderator (school racial or poverty composition) and the two three-way interaction effects ($\beta_{10}$ and $\beta_{11}$) represent the differential accountability effect for schools in the lowest and highest quartiles of the moderator. Essentially, $\beta_{10}$ and $\beta_{11}$ are additive (or subtractive) adjustments to $\beta_9$. For example, in a model with school poverty as the moderator, if $\beta_{10}$ is negative, treatment schools in the lowest quartile of poverty composition would have a larger reduction in the achievement gap over time than treatment schools in the middle quartiles of poverty composition. Thus, if in a model examining the differential effect of percent of poor students we find that both $\beta_9$ and $\beta_{10}$ are negative and significant, the results would suggest that the accountability effect in schools in the middle two quartiles of percent of poor students reduces the black–white test score gap, but that the accountability effect in schools in the lowest quartile of percent of poor students reduces the black–white test score gap by an even greater amount.

6 Alternate models using continuous measures of these variables when used as controls yield similar results.

7 Missing data is not a major concern in our sample because we aggregate to the school level. In the few cases where data is missing at the individual level on an independent variable, we impute individual panel means before aggregating to the school level.

8 Recall that we use lagged accountability pressure variables because accountability status from the prior period was announced during the summer, thus schools responded to pressure from the prior year’s results. We use 2004 as the first post-NCLB year because it was the first year in which schools could respond to a prior accountability rating.

4.5. Descriptive analysis

In Fig. 1, we examine cross-tabulated evidence of the reduction of test score gaps during the NCLB period in North Carolina. This figure shows math and reading black–white gaps by year pooled across all schools which were accountable for the black subgroup. A few notable points stand out in this figure. First, although black–white achievement gaps in both math and reading were on a downward trend in North Carolina throughout this time period, the reduction in the math gap was slightly larger. Second, the largest reduction in the math gap occurred during the post-NCLB period, between 2006 and 2008. Finally, the reading gap is mostly stagnant in the post-NCLB period. Although this figure is based on simple cross-tabulations, it indicates some interesting changes in gap trends during the period of study and suggests that NCLB may have contributed to reductions in math and reading test score gaps.

In Fig. 2, we examine the same black–white trends for math but include school racial and poverty composition measures to see how these trends vary in different types of schools. Each sub-figure shows three lines, one each for the lowest quartile, middle two quartiles, and highest quartile of the school racial or poverty composition measure (denoted as low, mid, or high in the figures). Fig. 2A shows that there appears to be no real difference in the black–white gap intercept or trend based on the number of black students in a school. Fig. 2B shows a lower black–white gap in schools with a low number of white students compared to schools with middle or high numbers of white students, but the trends between these types of schools are similar. Fig. 2C shows a lower black–white gap in schools with a high ratio of poor to non-poor black students compared to schools with low or middle black poverty ratios, but the high black poverty ratio schools have a stagnant gap trend while low and especially middle black poverty ratio schools have a downward gap trend over the period. Fig. 2D, the most dramatic figure, shows a much higher black–white gap in schools with a low ratio of poor to non-poor white students compared to schools with middle or high white poverty ratios. Moreover, schools with a low white poverty ratio have a sharp downward gap trend between 2005 and 2007 while middle white poverty ratio schools have a stagnant gap trend and high white poverty ratio schools have a slight downward gap trend. Finally, Fig. 2E shows that schools with a low percentage of poor students have the highest black–white gap, followed by schools with a middle percentage of poor students, while schools with the highest percentage of poor students have the lowest black–white gap. The gap trends downward in each of these types of schools, with larger reductions in schools with a low percentage of poor students. Overall, this figure suggests that school poverty characteristics are probably more influential than school racial characteristics in explaining black–white gaps and trends. Thus, schools with more affluent populations may be more able to reduce black–white achievement gaps. We now turn to regression analyses to investigate further and explore the reasons behind the reduction in these gaps.

5. Results

5.1. Main accountability effects

In Table 2 we present the regression results from our main accountability effect models. This table addresses our first two research questions: (1) Did NCLB black subgroup-specific accountability pressure contribute to a narrowing, widening, or stagnation of black–white achievement gaps in math and reading? and (2) If there were effects on achievement gaps, how did changes in the gaps occur (an increase, decrease, or stagnation of one or more groups’ test scores)? Column 1 shows the results for mean white test score as the dependent variable, column 2 shows the results for mean black test score as the dependent variable, and column 3 shows the results for the black–white test score gap as the dependent variable. Panel A presents results for math and panel B presents results for reading. All models shown include school level controls (percent poor students, number of black students, and number of white students, all divided into quartiles) and school fixed effects.

Fig. 1. Black–white test score gaps by year and subject.

Fig. 2. Black–white test score gaps by year and school composition variables. Note: Low refers to schools in the lowest quartile of a characteristic, mid refers to schools in the middle two quartiles of a characteristic, and high refers to schools in the highest quartile of a characteristic.

The Post-NCLB × Black SG Fail interaction coefficient ($\beta_3$) is the accountability effect. Recall that this DD estimate compares standardized test scores before and after the introduction of NCLB between schools that met the black subgroup target and schools that failed the black subgroup target. In math, the results indicate that there is a marginally significant accountability effect on mean white test scores (0.016), a positive accountability effect on mean black test scores (0.050), and a negative accountability effect on the black–white test score gap (−0.034). In reading, the results indicate that there is a positive accountability effect on mean white test scores (0.016), a positive accountability effect on mean black test scores (0.060), and a negative accountability effect on the black–white test score gap (−0.044). Thus, NCLB subgroup-specific accountability pressure in schools failing for the black subgroup has positive effects for black students in math and both white and black students in reading, but the effects are larger for black students, leading to a reduction in the achievement gap in both subjects.
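Because the three columns of Table 2 use the same regressors on the same sample, and assuming the gap is defined as the white mean minus the black mean (an assumption that the reported constants are consistent with), each gap coefficient should equal the white coefficient minus the black coefficient up to rounding. The sketch below checks this for the values reported in Table 2; the variable and function names are illustrative.

```python
# Coefficients as reported in Table 2: (white, black, gap) for each term.
TABLE2 = {
    "math": {
        "Black SG Fail": (-0.057, -0.082, 0.025),
        "PostNCLB": (-0.043, -0.029, -0.014),
        "PostNCLB x Black SG Fail": (0.016, 0.050, -0.034),
        "Constant": (0.338, -0.434, 0.772),
    },
    "reading": {
        "Black SG Fail": (-0.044, -0.112, 0.068),
        "PostNCLB": (-0.029, -0.049, 0.020),
        "PostNCLB x Black SG Fail": (0.016, 0.060, -0.044),
        "Constant": (0.307, -0.385, 0.693),
    },
}


def max_discrepancy(table):
    """Largest |gap - (white - black)| across all subjects and terms."""
    return max(
        abs(gap - (white - black))
        for terms in table.values()
        for white, black, gap in terms.values()
    )


# Each value is rounded to three decimals, so three rounded numbers can
# disagree by up to 3 * 0.0005 even when the identity holds exactly.
TOLERANCE = 0.0015
print(max_discrepancy(TABLE2) <= TOLERANCE)
```

Read this way, the −0.034 math gap effect is simply the 0.016 white effect minus the 0.050 black effect, which is why the gap narrows without any decline in white scores.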


Table 2
Models predicting mean achievement for whites and blacks, and the black–white gap.

                                (1) White mean    (2) Black mean    (3) Black–white
                                std. test score   std. test score   gap
A. Math
Black SG Fail (β1)              −0.057***         −0.082***         0.025***
                                (0.0070)          (0.0076)          (0.0067)
PostNCLB (β2)                   −0.043***         −0.029***         −0.014*
                                (0.0058)          (0.0065)          (0.0056)
PostNCLB × Black SG Fail (β3)   0.016+            0.050***          −0.034***
                                (0.0084)          (0.0088)          (0.0080)
Constant                        0.338***          −0.434***         0.772***
                                (0.0127)          (0.0135)          (0.0124)
Observations                    8836              8836              8836
R2                              0.054             0.045             0.033

B. Reading
Black SG Fail (β1)              −0.044***         −0.112***         0.068***
                                (0.0062)          (0.0075)          (0.0076)
PostNCLB (β2)                   −0.029***         −0.049***         0.020**
                                (0.0052)          (0.0062)          (0.0061)
PostNCLB × Black SG Fail (β3)   0.016*            0.060***          −0.044***
                                (0.0072)          (0.0082)          (0.0083)
Constant                        0.307***          −0.385***         0.693***
                                (0.0105)          (0.0126)          (0.0125)
Observations                    8836              8836              8836
R2                              0.036             0.080             0.035

Note: To be included in the sample a school must have at least 20 white students (authors’ restriction) and at least 40 black students to qualify as accountable under NCLB. Each regression model includes year and school fixed effects. School controls include percent poor students, number of black students, and number of white students (all divided into quartiles). Cluster-corrected standard errors in parentheses.

+ p < 0.10. * p < 0.05. ** p < 0.01. *** p < 0.001.


5.2. Accountability effects by school racial and poverty composition

Table 3 presents the results for math from the main accountability effect model (1) beside five models examining differences in the accountability effect by school racial and poverty composition (2–6). Tables 3 and 4 extend our examination of our first two research questions and address our third research question: (3) Did the effects of NCLB black subgroup-specific accountability pressure, if any, differ by student racial and poverty composition? In the following tables we include only the results of NCLB black subgroup-specific accountability pressure on achievement gaps for brevity. The finding that black students gain more from NCLB subgroup-specific accountability pressure than white students is consistent across the moderator models as well. Although the main accountability effects on white mean test scores are not always significant in the moderator models, they are never negative. This suggests that the black–white test score gap is never reduced due to lowered performance from white students but instead is always reduced due to heightened performance from black students. A full set of tables is available from the authors upon request.

In model 2 of Table 3, we find a negative accountability effect (−0.049) on the black–white math test score gap in schools in the middle two quartiles of numbers of black students. The differential effect for schools in the lowest quartile of black students is positive (0.033) but only marginally significant (p < 0.10). Model 3 is similar to model 2 but examines differential accountability effects based on the number of white students in a school. We find no significant differential accountability effects in this model. Model 4 examines the differential accountability effect by a school’s percentage of poor students. In this case, the main accountability effect for schools in the middle two quartiles of percentage of poor students is negative (−0.028) and significant. Moreover, the differential accountability effect for schools in the lowest quartile of percentage of poor students is also negative (−0.033) and significant, suggesting that the accountability effect on the black–white math test score gap is largest in schools with the fewest poor students. In model 5 we find no significant differential accountability effect based on the black poverty ratio in a school. However, in model 6 we find the largest differential accountability effect from all of our models: schools in the lowest quartile of the ratio of white poor students to white non-poor students have a differential effect of −0.060. The main accountability effect in model 6 is negative (−0.013) but not statistically significant.
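Following the additive interpretation of Eq. (4), the total accountability effect for a lowest-quartile school is the main effect plus the corresponding three-way interaction. The sketch below applies that arithmetic to the math coefficients reported in Table 3. It only sums point estimates and says nothing about the statistical significance of the sums; the names are illustrative.

```python
# (main effect, lowest-quartile differential) from Table 3, math gap models.
TABLE3_MATH = {
    "# Black (model 2)": (-0.049, 0.033),
    "% Poor (model 4)": (-0.028, -0.033),
    "White PovRatio (model 6)": (-0.013, -0.060),
}


def lowest_quartile_effect(main, differential):
    """Total effect for lowest-quartile schools: main + differential."""
    return round(main + differential, 3)


totals = {
    name: lowest_quartile_effect(*coefs) for name, coefs in TABLE3_MATH.items()
}
for name, total in totals.items():
    print(f"{name}: {total:+.3f}")
```

The sums show why the poverty models stand out: the implied lowest-quartile effects for percent poor and the white poverty ratio are several times the size of the implied effect in schools with the fewest black students.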

In Table 4 we present similar results for reading, again from the main accountability effect model (1) and five models examining differences in the accountability effect by student racial and poverty composition (2–6). As in the math regressions, models 3 and 5 show no significant differential accountability effects of the number of white students (3) or the black poverty ratio (5) on the black–white reading test score gap. In model 2, we find a negative accountability effect (−0.051) on the black–white reading test score gap in schools in the middle two quartiles of numbers of black students, but no statistically significant differential effects. Model 4 examines the differential accountability effect by a school’s percentage of poor students and we find that the main accountability effect for schools in the middle two quartiles of percentage of poor students is negative (−0.047) and significant, but there are no statistically significant differential effects. Finally, in model 6 we find that the main accountability effect for schools in the middle two quartiles of the white poverty ratio is negative (−0.036) and significant, while the differential effect for schools in the lowest quartile of the white poverty ratio is also negative (−0.036) and significant. This finding suggests that the largest effects on the black–white reading test score gap from any model are found in treatment schools with the lowest white poverty ratio.

Table 3
Models predicting the black–white math gap.

                                (1)         (2)         (3)         (4)         (5)             (6)
                                Main        # Black     # White     % Poor      Black PovRatio  White PovRatio
Black SG Fail (β1)              0.025***    0.033***    0.031***    0.022*      0.017*          0.004
                                (0.0067)    (0.0097)    (0.0089)    (0.0088)    (0.0084)        (0.0082)
PostNCLB (β2)                   −0.014*     −0.009      −0.017*     −0.015*     −0.011          −0.021**
                                (0.0056)    (0.0079)    (0.0075)    (0.0074)    (0.0077)        (0.0069)
PostNCLB × Black SG Fail (β3)   −0.034***   −0.049***   −0.034***   −0.028**    −0.030**        −0.013
                                (0.0083)    (0.0121)    (0.0114)    (0.0112)    (0.0112)        (0.0103)
Moderators (each interacted with PostNCLB × Black SG Fail)
  × LowQNumBlack                            0.033+
                                            (0.0198)
  × HighQNumBlack                           0.024
                                            (0.0202)
  × LowQNumWhite                                        0.020
                                                        (0.0226)
  × HighQNumWhite                                       −0.013
                                                        (0.0161)
  × LowQPctPoor                                                     −0.033*
                                                                    (0.0164)
  × HighQPctPoor                                                    0.011
                                                                    (0.0195)
  × LowQBlackPovRatio                                                           −0.014
                                                                                (0.0180)
  × HighQBlackPovRatio                                                          0.012
                                                                                (0.0193)
  × LowQWhitePovRatio                                                                           −0.060***
                                                                                                (0.0173)
  × HighQWhitePovRatio                                                                          −0.009
                                                                                                (0.0197)
Constant                        0.772***    0.768***    0.770***    0.771***    0.751***        0.758***
                                (0.0124)    (0.0137)    (0.0132)    (0.0125)    (0.0115)        (0.0115)
Observations                    8836        8836        8836        8836        8836            8836
R2                              0.033       0.034       0.034       0.034       0.054           0.057

Note: Coefficients are from school fixed effects models similar to Table 2, column 3, panel A. To be included in the sample a school must have at least 20 white students (authors’ restriction) and at least 40 black students to qualify as accountable under NCLB. Each regression model includes year and school fixed effects. Models 1–4 each include main effects for percent poor students, number of black students, and number of white students (all divided into quartiles). Models 5 and 6 each include main effects for the other racial group’s poverty ratio (divided into quartiles). Cluster-corrected standard errors in parentheses.

+ p < 0.10. * p < 0.05. ** p < 0.01. *** p < 0.001.

In summary, these models indicate that NCLB black subgroup-specific accountability pressure has a negative effect on both math and reading black–white test score gaps. There is minimal evidence to suggest that the racial composition of schools alone alters this effect, as we find only one of four differential coefficients in the number of black or white students models is even marginally significant.⁹ Conversely, school poverty seems to play an important role in this story. In math, schools in the lowest quartile of percentage of poor students and schools in the lowest quartile of the white poverty ratio have the largest accountability effects on the black–white test score gap. In reading, schools in the lowest quartile of the white poverty ratio have the largest accountability effects on the black–white test score gap. However, we would be remiss in our attempts to uncover the full story of accountability effects on the black–white achievement gap if we stopped here. Recall from Fig. 2 that school racial and poverty composition also indicate different starting positions in our base year or intercepts of black–white test score gaps. Thus, these differential effects may simply be a product of their unequal starting positions. To address this issue, we estimate a predicted value of the post-NCLB black–white gap for each composition variable and divide each estimate by the predicted value of the black–white gap in the pre-NCLB period. The resulting number gives us the percentage of the black–white test score gap reduced by the accountability effect.

⁹ Alternate specifications of model inclusion based on the number of white students in a school suggest these results are robust on a number of dimensions. See Appendix Tables A1 and A2.

Table 4
Models predicting the black–white reading gap.

                                (1)         (2)         (3)         (4)         (5)             (6)
                                Main        # Black     # White     % Poor      Black PovRatio  White PovRatio
Black SG Fail (β1)              0.068***    0.075***    0.070***    0.065***    0.063***        0.052***
                                (0.0076)    (0.0109)    (0.0097)    (0.0096)    (0.0098)        (0.0088)
PostNCLB (β2)                   0.020**     0.020*      0.015+      0.020**     0.024**         0.013+
                                (0.0061)    (0.0087)    (0.0079)    (0.0077)    (0.0083)        (0.0074)
PostNCLB × Black SG Fail (β3)   −0.044***   −0.051***   −0.045***   −0.047***   −0.047***       −0.036**
                                (0.0083)    (0.0120)    (0.0113)    (0.0114)    (0.0107)        (0.0104)
Moderators (each interacted with PostNCLB × Black SG Fail)
  × LowQNumBlack                            0.030
                                            (0.0199)
  × HighQNumBlack                           −0.002
                                            (0.0196)
  × LowQNumWhite                                        −0.001
                                                        (0.0238)
  × HighQNumWhite                                       −0.002
                                                        (0.0168)
  × LowQPctPoor                                                     −0.010
                                                                    (0.0201)
  × HighQPctPoor                                                    0.035
                                                                    (0.0223)
  × LowQBlackPovRatio                                                           0.008
                                                                                (0.0200)
  × HighQBlackPovRatio                                                          0.009
                                                                                (0.0204)
  × LowQWhitePovRatio                                                                           −0.036*
                                                                                                (0.0168)
  × HighQWhitePovRatio                                                                          0.009
                                                                                                (0.0192)
Constant                        0.693***    0.690***    0.694***    0.694***    0.672***        0.680***
                                (0.0125)    (0.0141)    (0.0136)    (0.0130)    (0.0125)        (0.0120)
Observations                    8836        8836        8836        8836        8836            8836
R2                              0.035       0.036       0.035       0.036       0.059           0.062

Note: Coefficients are from school fixed effects models similar to Table 2, column 3, panel B. To be included in the sample a school must have at least 20 white students (authors’ restriction) and at least 40 black students to qualify as accountable under NCLB. Each regression model includes year and school fixed effects. Models 1–4 each include main effects for percent poor students, number of black students, and number of white students (all divided into quartiles). Models 5 and 6 each include main effects for the other racial group’s poverty ratio (divided into quartiles). Cluster-corrected standard errors in parentheses.

+ p < 0.10. * p < 0.05. ** p < 0.01. *** p < 0.001.

In Fig. 3, we examine these percentage changes on the black–white math test score gap and two main trends stand out. First, schools do not differ significantly in the percentage of the black–white math test score gap reduced by the accountability effect based on the number of black students (2), number of white students (3), or black poverty ratio (5). Second, schools differ significantly in the percentage of the black–white math test score gap reduced by the accountability effect based on the percentage of poor students (4) and white poverty ratio (6). In both of these cases, schools in the lowest quartile were significantly different from schools in the middle two or highest quartile(s). Accountability pressure in these schools reduced the black–white math test score gap by 9.1% and 10.5%, respectively.
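The relative scale used here divides an accountability effect by the predicted pre-NCLB gap. The sketch below shows that calculation under stated assumptions: the effect is the sum of the model 4 coefficients in Table 3 (−0.028 and −0.033), while the pre-NCLB gap of 0.70 is a hypothetical placeholder, since the predicted pre-period gaps are not reported in the text. The result is therefore illustrative and is not expected to reproduce the 9.1% figure.

```python
def percent_gap_reduced(effect, pre_gap):
    """Share of the pre-period gap closed by a (negative) effect, in percent."""
    if pre_gap <= 0:
        raise ValueError("pre_gap must be positive")
    return -effect / pre_gap * 100.0


# Table 3, model 4: main effect plus lowest-quartile differential.
effect_low_poverty = -0.028 + -0.033
# Assumption for illustration only; not a value reported in the paper.
hypothetical_pre_gap = 0.70

share = percent_gap_reduced(effect_low_poverty, hypothetical_pre_gap)
print(f"{share:.1f}% of the pre-NCLB gap")
```

Dividing by the starting gap is what allows schools with different intercepts to be compared: the same absolute effect closes a smaller share of a larger initial gap.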

In Fig. 4, we examine these adjusted effects on the black–white reading test score gap. We find that there are no significant differences among the differential effects in any of the five models. Thus, any differential effects from the reading regression models are a product of the differences in intercepts between school types. Placing these effects on a relative scale (percentage of the pre-NCLB test score gap reduced) suggests that accountability pressure reduces the black–white achievement gap in reading by similar percentages across schools.

6. Discussion

Scholars have focused much attention on black–white achievement gaps in an effort to understand how such gaps arise, how families and schools exacerbate or reduce gaps, and how inequalities between and within schools contribute to the overall story. Understanding these issues is important to our understanding of racial inequality at a broader level because racial differences in academic achievement and subsequent attainment contribute to racial differences in other important outcomes such as health, mortality, employment, wages, crime, and incarceration (Grodsky and Pager, 2001; Hayward et al., 2000; Pampel, 2009; Pettit and Western, 2004; Phillips, 2002). Our study is the first to examine the effects of NCLB black subgroup-specific accountability pressure on black–white achievement gaps. Because NCLB stated the elimination of achievement gaps as an explicit goal, our study analyzes not only how schools respond to the pressure from such a mandate, but how school racial and poverty composition influence these responses.

Fig. 3. Adjusted Effect Size on Math B–W Gap. Note: Calculations based on Table 3 model predictions using the margins command in Stata 12. Each bar represents the DDD effect size over the pre-NCLB math B–W gap. * Indicates quartile group is significantly different from others at p < 0.05.

Fig. 4. Adjusted Effect Size on Reading B–W Gap. Note: Calculations based on Table 4 model predictions using the margins command in Stata 12. Each bar represents the DDD effect size over the pre-NCLB math B–W gap. * Indicates quartile group is significantly different from others at p < 0.05.

Using data from third through eighth graders from the entire state of North Carolina between 2001 and 2009, we estimate a school fixed effects difference-in-difference model with derived variables in the pre-NCLB period for black subgroup failure. We find negative main accountability effects in both math and reading, suggesting that subgroup-specific NCLB accountability policy contributed to a narrowing of black–white achievement gaps. Furthermore, we find that these effects are due to an increase in black mean standardized math test scores and an increase (although larger for black students) in both black and white mean standardized reading test scores. In other words, black students appear to gain from the attention that comes from black subgroup-specific accountability pressure while white students at a minimum are not harmed and sometimes gain as well. These results suggest that accountability pressure is the source of the decline in black–white achievement gaps because (a) our DD model compares changes between treatment and control groups over time and thus accounts for the propensity of schools to address black–white achievement gaps prior to NCLB, and (b) school fixed effects limit the variation to within schools, thus we can control for fixed school characteristics and eliminate between-school confounding.

Our models examining the number of white or black students in a school indicate no differential accountability effects based on those simple counts. However, we find consistent evidence of differential accountability effects based on schools’ poverty composition, both measured as percentage of poor students and the white poverty ratio. Schools with the lowest percentages of poor students and schools with the lowest white poverty ratios have the largest overall accountability effects on the black–white math test score gap, while schools with the lowest white poverty ratios also have the largest overall accountability effect on the black–white reading test score gap. When we adjust for different intercepts of black–white gaps, the differential school poverty results for math, but not reading, remain significant. Thus, schools with more affluent students are closing the black–white math achievement gap, but not the reading gap, at faster rates than those with fewer affluent students. This finding is consistent with prior research on the effects of accountability pressure on different subject tests (e.g. Ladd and Lauen, 2010; Reback, 2008), which suggests that schools may be able to direct resources toward improving all students’ low- and mid-level math skills, but reading is more likely to require pulling individual students out of classrooms.

These findings contribute to an important debate about whether schools can narrow achievement gaps. Our study suggests that schools can close achievement gaps and thus refutes some prior research, such as a key finding of the Coleman report that suggests that schools are unable to alter the trajectories set by family differences, particularly for minority students (Coleman et al., 1966, p. 20). However, our study also is in line with other prior research about the ability of schools to affect black–white achievement gaps (Condron, 2009; Downey et al., 2004) and echoes the finding of the Coleman report that ‘‘the social composition of the student body is more highly related to achievement, independent of the student’s own social background, than is any other school factor’’ (ibid: 325). Although we do not examine individual differences and family contributions to the black–white achievement gap, we find that gaps differ significantly by the poverty composition of schools. Moreover, schools’ ability to close gaps in response to accountability pressure also differs by the poverty composition of schools. Schools with the most affluent populations are most able to respond to accountability pressure to reduce black–white math achievement gaps. These findings have serious implications for a wide range of actors including policymakers, educators, school board members, and researchers. However, it is important to note that one limitation of our research is our inability to distinguish between the actions of individuals within schools or the actions of parents and students at home as a response to accountability pressure in low-poverty schools. The heterogeneous effects between low- and high-poverty schools may result from some parents responding with at-home or out-of-school activities that help boost children’s test scores. Additionally, more affluent parents at low-poverty schools may provide a variety of supplemental resources to schools, blurring the line between school and family effects. Future research should work to identify these mechanisms.

Despite these important findings, our study is not without its shortcomings. First, although we draw on full data from all public schools in North Carolina, we are only able to demonstrate these effects in one state. Other states may have different challenges in addressing achievement gaps, responses to accountability pressure, or competing priorities. Unfortunately, nationally representative panel data are not available to address our research questions. Still, the North Carolina data give us a good starting point to explore these questions. Even prior to the introduction of NCLB, North Carolina was considered a strong accountability state (Carnoy and Loeb, 2002). Since other research finds that weak accountability states have larger effects of accountability pressure than strong accountability states (Dee and Jacob, 2011), we suggest that our results are probably a conservative estimate of what we might find in states with weak pre-NCLB accountability systems. Moreover, North Carolina is a relatively diverse state with urban school districts in areas such as Charlotte, Greensboro, and Raleigh, suburban districts, and small rural districts in the western mountain and eastern coastal portions of the state. The racial composition of the public school system in our data during this time period further reinforces the state’s diversity: 58% white, 29% black, and 8% Hispanic. Thus, although not nationally representative, North Carolina does not have a homogeneous student population.

Additionally, our use of the percentage of students receiving free or reduced price lunch to represent school resources has limitations. Because property taxes within a catchment area determine, to a large extent, a school’s resource pool, an individual-level poverty indicator aggregated to the school-level may not be a refined enough measure to capture the true process of differential accountability effects. However, using a within-state design with school fixed effects likely protects us from the biggest measurement problems this measure might have in a nationally representative dataset, such as differences across states in funding allocation strategies and cost of living.

A growing body of evidence hints that increasing between-school racial inequality and decreasing within-school racialinequality may be driven by social class factors. Researchers find that over the past five decades, residential segregationby income has sharply increased (Reardon and Bischoff, 2011). Due to school assignment policies, this spatial arrangementpotentially leads to more SES similarities within schools and more SES differences between schools. In other words, blackstudents may be more similar to white students in their own schools (within) than they are similar to black students in otherschools (between). This would suggest that schools with the lowest poverty levels might have the greatest success inreducing black–white achievement gaps since accountability policy puts pressure on schools to alter these gaps, often with-out a large influx of additional resources or changes in school assignment policy. Moreover, recent work suggests that

Page 14: School accountability and the black–white test score gap

Table A1Models predicting the black–white math gap (Alternate Cutoff).

1 2 3 4 5 6Main # Black # White % Poor Black PovRatio White PovRatio

Black SG Fail (b1) 0.023*** 0.030** 0.027** 0.019* 0.016+ 0.006(0.0066) (0.0095) (0.0087) (0.0088) (0.0084) (0.0079)

PostNCLB (b2) �0.014* �0.011 �0.018* �0.015* �0.016* �0.017*

(0.0057) (0.0078) (0.0076) (0.0075) (0.0079) (0.0070)PostNCLB�Black SG Fail �0.033*** �0.044*** �0.032** �0.027* �0.030** �0.018*

(b3) (0.0080) (0.0108) (0.0107) (0.0108) (0.0097) (0.0092)ModeratorsPostNCLB�Black SG Fail 0.035+

�LowQNumBlack (0.0202)PostNCLB�Black SG Fail 0.003�HighQNumBlack (0.0192)PostNCLB�Black SG Fail 0.015�LowQNumWhite (0.0255)PostNCLB�Black SG Fail �0.010�HighQNumWhite (0.0151)PostNCLB�Black SG Fail �0.033*

�LowQPctPoor (0.0157)PostNCLB�Black SG Fail 0.009�HighQPctPoor (0.0190)PostNCLB�Black SG Fail �0.016�LowQBlackPovRatio (0.0179)PostNCLB�Black SG Fail 0.013�HighQBlackPovRatio (0.0192)PostNCLB�Black SG Fail �0.052**

�LowQWhitePovRatio (0.0166)PostNCLB�Black SG Fail 0.010�HighQWhitePovRatio (0.0201)

Constant 0.779*** 0.776*** 0.778*** 0.778*** 0.769*** 0.771***

(0.0122) (0.0134) (0.0128) (0.0124) (0.0113) (0.0108)Observations 7954 7954 7954 7954 7954 7954R2 0.037 0.038 0.038 0.038 0.062 0.066

Note: Coefficients are from school fixed effects models. To be included in the sample a school must have at least 40 white students and at least 40 blackstudents to qualify as accountable under NCLB. Each regression model includes year and school fixed effects. Models 1–4 each include main effects forpercent poor students, number of black students, and number of white students (all divided into quartiles). Models 5 and 6 each include main effects for theother racial group’s poverty ratio (divided into quartiles). Cluster-corrected standard errors in parentheses.

+ p < 0.10.* p < 0.05.** p < 0.01.*** p < 0.001.

28 S.M. Gaddis, D.L. Lauen / Social Science Research 44 (2014) 15–31

socioeconomic gaps in achievement, which have widened over the last few decades, may be a more pressing concern for educators than racial gaps (Reardon, 2011; Reardon, 2013). In future work, we plan to examine the effects of accountability pressure on socioeconomic achievement gaps as well.

Finally, we must be careful in our interpretation of what these effects actually mean for student learning and other academic outcomes. Research on the unintended consequences of accountability suggests that a number of explanations, such as cheating, teaching to the test, or reclassifying students, may account for the reduction of achievement gaps on high-stakes tests (Booher-Jennings, 2005; Jacob, 2005; Jacob and Levitt, 2003). The true value of NCLB black subgroup-specific accountability pressure in reducing achievement gaps might be seen through its effect on other outcomes that may be less corruptible than high-stakes standardized tests, such as graduation rates, SAT scores, college enrollment rates, or even low-stakes tests.

Still, our findings have important implications for scholars of educational and racial inequality, as well as policymakers and education leaders working to solve a persistent social problem. Our research clearly shows that NCLB black subgroup-specific accountability pressure can induce schools to make progress in reducing the black–white achievement gap without harming mean levels of white achievement. However, some schools are better able than others to reduce the black–white achievement gap in math due to the mechanisms of success. These findings suggest that individuals must pay careful attention to the hypothesized mechanisms of program or policy effects when enacting strategies intended to reduce racial inequality. What is not clear is whether adjusting school composition in schools with less affluent students and lower starting black–white achievement gaps would allow those schools to reduce achievement gaps at the same rate as those with more affluent students. Although our research raises a number of new questions regarding the role of schools in reducing achievement gaps, we determine that existing institutional inequality plays a significant role in the ability of schools to reduce racial inequality.


Table A2. Models predicting the black–white math gap (Alternate Cutoff 2).

                                (1)        (2)        (3)        (4)        (5)        (6)
                                Main       # Black    # White    % Poor     Black      White
                                                                            PovRatio   PovRatio

Black SG Fail (b1)              0.023**    0.031**    0.030***   0.021*     0.013      0.004
                                (0.0071)   (0.0103)   (0.0090)   (0.0092)   (0.0090)   (0.0084)
PostNCLB (b2)                   −0.016**   −0.012     −0.017*    −0.015*    −0.013+    −0.022**
                                (0.0060)   (0.0087)   (0.0076)   (0.0077)   (0.0081)   (0.0069)
PostNCLB × Black SG Fail (b3)   −0.032***  −0.045***  −0.032***  −0.028*    −0.028*    −0.013
                                (0.0085)   (0.0124)   (0.0108)   (0.0112)   (0.0112)   (0.0098)

Moderators (each interacted with PostNCLB × Black SG Fail)
  × LowQNumBlack                           0.024
                                           (0.0202)
  × HighQNumBlack                          0.025
                                           (0.0211)
  × LowQNumWhite                                      0.024
                                                      (0.0247)
  × HighQNumWhite                                     −0.014
                                                      (0.0157)
  × LowQPctPoor                                                  −0.031+
                                                                 (0.0181)
  × HighQPctPoor                                                 0.016
                                                                 (0.0211)
  × LowQBlackPovRatio                                                       −0.016
                                                                            (0.0185)
  × HighQBlackPovRatio                                                      0.014
                                                                            (0.0190)
  × LowQWhitePovRatio                                                                  −0.060***
                                                                                       (0.0175)
  × HighQWhitePovRatio                                                                 −0.003
                                                                                       (0.0206)

Constant                        0.774***   0.770***   0.769***   0.772***   0.751***   0.756***
                                (0.0130)   (0.0146)   (0.0136)   (0.0132)   (0.0120)   (0.0118)
Observations                    9243       9243       9243       9243       9243       9243
R-squared                       0.033      0.034      0.034      0.034      0.053      0.054

Note: Coefficients are from school fixed effects models. To be included in the sample a school must have at least 10 white students and at least 40 black students to qualify as accountable under NCLB. Each regression model includes year and school fixed effects. Models 1–4 each include main effects for percent poor students, number of black students, and number of white students (all divided into quartiles). Models 5 and 6 each include main effects for the other racial group’s poverty ratio (divided into quartiles). Cluster-corrected standard errors in parentheses.

+ p < 0.10; * p < 0.05; ** p < 0.01; *** p < 0.001.


Acknowledgments

We thank participants in the UNC Chapel Hill Department of Sociology Inequality Workshop and the anonymous reviewers and editor of Social Science Research for helpful comments on earlier drafts of this article. We are also grateful to the North Carolina Education Research Data Center at Duke University for providing the data for this project. A previous version of this article was presented at the Sociology of Education Association conference in Monterey, CA in February 2011.

Appendix A

See Tables A1 and A2.

References

Alexander, Karl L., Entwisle, Doris R., Olson, Linda S., 2001. Schools, achievement, and inequality: a seasonal perspective. Educational Evaluation and Policy Analysis 23 (2), 171–191.

Alexander, Karl L., Entwisle, Doris R., Olson, Linda S., 2007. Lasting consequences of the summer learning gap. American Sociological Review 72 (2), 167–180.

Bankston, Carl, Caldas, Stephen J., 1996. Majority African American schools and social injustice: the influence of de facto segregation on academic achievement. Social Forces 75 (2), 535–555.

Berends, Mark, Lucas, Samuel R., Penaloza, Roberto V., 2008. How changes in families and schools are related to trends in black–white test scores. Sociology of Education 81 (4), 313–344.

Bertrand, Marianne, Duflo, Esther, Mullainathan, Sendhil, 2004. How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics 119 (1), 249–275.

Booher-Jennings, Jennifer, 2005. Below the bubble: ‘Educational Triage’ and the Texas accountability system. American Educational Research Journal 42 (2), 231–268.


Brooks-Gunn, Jeanne, Klebanov, Pamela K., Smith, Judith, Duncan, Greg J., Lee, Kyunghee, 2003. The black–white test score gap in young children: contributions of test and family characteristics. Applied Developmental Science 7 (4), 239–252.

Brown, A.B., Clift, J.W., 2010. The unequal effect of adequate yearly progress: evidence from school visits. American Educational Research Journal 47 (4), 774–798.

Burkam, David T., Ready, Douglas D., Lee, Valerie E., LoGerfo, Laura F., 2004. Social-class differences in summer learning between kindergarten and first grade: model specification and estimation. Sociology of Education 77 (1), 1–31.

Campbell, Donald T., 1979. Assessing the impact of planned social change. Evaluation and Program Planning 2, 67–90.

Cancio, A. Silvia, Evans, T. David, Maume Jr., David J., 1996. Reconsidering the declining significance of race: racial differences in early career wages. American Sociological Review 61 (4), 541–556.

Card, David, Rothstein, Jesse, 2007. Racial segregation and the black–white test score gap. Journal of Public Economics 91 (11–12), 2158–2184.

Carnoy, Martin, Loeb, Susanna, 2002. Does external accountability affect student outcomes? A cross-state analysis. Educational Evaluation and Policy Analysis 24 (4), 305–331.

Chatterji, Aaron K., Toffel, Michael W., 2010. How firms respond to being rated. Strategic Management Journal 31 (9), 917–945.

Cho, Rosa M., 2011. Effects of welfare reform policies on Mexican immigrants’ infant mortality rates. Social Science Research 40 (2), 641–653.

Coleman, James S., Campbell, Ernest Q., Hobson, Carol J., McPartland, James, Mood, Alexander J., Weinfeld, Frederic D., York, Robert L., 1966. Equality of Educational Opportunity. USGPO, Washington, DC.

Condron, Dennis J., 2009. Social class, school and non-school environments, and black/white inequalities in children’s learning. American Sociological Review 74 (5), 685–708.

Cook, M., Evans, W., 2000. Families or schools? Explaining the convergence in white and black academic performance. Journal of Labor Economics 18 (4), 729–754.

Cullen, J.B., Reback, R., 2006. Tinkering towards accolades: school gaming under a performance accountability system. In: Gronberg, T.J., Jansen, D.W. (Eds.), Advances in Applied Microeconomics, vol. 14. Elsevier, Oxford, UK, pp. 1–34.

Dee, Thomas S., Jacob, Brian, 2011. The impact of No Child Left Behind on student achievement. Journal of Policy Analysis and Management 30 (3), 418–446.

Dewan, Shaila, 2010. Georgia Schools Inquiry Finds Signs of Cheating. The New York Times (February 11).

Diamond, John B., 2007. Where the rubber meets the road: rethinking the connection between high-stakes testing policy and classroom instruction. Sociology of Education 80 (4), 285–313.

Downey, Douglas B., von Hippel, Paul T., Broh, Beckett A., 2004. Are schools the great equalizer? Cognitive inequality during the summer months and the school year. American Sociological Review 69 (5), 613–635.

Duncan, Greg, Magnuson, Katherine, 2005. Can family socioeconomic resources account for racial and ethnic test score gaps? Future of Children 15 (1), 35–54.

Entwisle, Doris R., Alexander, Karl L., 1992. Summer setback: race, poverty, school composition and math achievement in the first two years of school. American Sociological Review 57 (1), 72–84.

Entwisle, Doris R., Alexander, Karl L., 1994. Winter setback: the racial composition of schools and learning to read. American Sociological Review 59 (3), 446–460.

Espeland, Wendy N., Sauder, Michael, 2007. Rankings and reactivity: how public measures recreate social worlds. American Journal of Sociology 113 (1), 1–40.

Figlio, D.N., Getzer, L., 2006. Accountability, ability, and disability: gaming the system? In: Gronberg, T.J., Jansen, D.W. (Eds.), Advances in Applied Microeconomics, vol. 14. Elsevier, Oxford, UK, pp. 35–50.

Figlio, D.N., Rouse, C.E., Schlosser, A., 2009. Leaving No Child Behind: Two Paths to School Accountability. Working Paper.

Fryer, Roland G., Levitt, Steven D., 2004. Understanding the black–white test score gap in the first two years of school. Review of Economics and Statistics 86 (2), 447–464.

Grodsky, Eric, Pager, Devah, 2001. The structure of disadvantage: individual and occupational determinants of the black–white wage gap. American Sociological Review 66 (4), 542–567.

Hallett, Tim, 2010. The myth incarnate: recoupling processes, turmoil, and inhabited institutions in an urban elementary school. American Sociological Review 75 (1), 52–74.

Hanushek, Eric A., Raymond, Margaret E., 2005. Does school accountability lead to improved student performance? Journal of Policy Analysis and Management 24 (2), 297–327.

Hanushek, Eric A., Rivkin, Steven G., 2009. Harming the best: how schools affect the black–white achievement gap. Journal of Policy Analysis and Management 28 (3), 366–393.

Harris, Douglas, Herrington, Carolyn, 2004. Accountability and the Achievement Gap: Evidence from NAEP. Unpublished manuscript. Department of Educational Leadership and Policy Studies, Florida State University.

Hayward, Mark D., Crimmins, Eileen M., Miles, Toni P., Yang, Yu, 2000. The significance of socioeconomic status in explaining the racial gap in chronic health conditions. American Sociological Review 65 (6), 910–930.

Heyns, Barbara, 1978. Summer Learning and the Effects of Schooling. Academic Press, New York, NY.

Imbens, Guido W., Wooldridge, Jeffrey M., 2009. Recent developments in the econometrics of program evaluation. Journal of Economic Literature 47 (1), 5–86.

Jacob, B.A., 2005. Accountability, incentives, and behavior: the impact of high-stakes testing in the Chicago public schools. Journal of Public Economics 89 (5–6), 761–796.

Jacob, B.A., Levitt, S.D., 2003. Rotten apples: an investigation of the prevalence and predictors of teacher cheating. Quarterly Journal of Economics 118 (3), 843–877.

Jennings, Jennifer L., 2010. School choice or schools’ choice? Managing in an era of accountability. Sociology of Education 83 (3), 227–247.

Jennings, Jennifer L., Beveridge, Andrew A., 2009. How does test exemption affect schools’ and students’ academic performance? Educational Evaluation and Policy Analysis 31 (2), 153–175.

Kim, Chang Hwan, 2010. Decomposing the change in the wage gap between white and black men over time, 1980–2005: an extension of the Blinder–Oaxaca decomposition method. Sociological Methods and Research 38 (4), 619–651.

Kreisman, Daniel, 2012. The source of black–white inequality in early language acquisition: evidence from Early Head Start. Social Science Research 41 (6), 1429–1450.

Krieg, J.M., 2011. Which students are left behind? The racial impacts of the No Child Left Behind Act. Economics of Education Review 30 (4), 654–664.

Ladd, Helen F., Lauen, Douglas L., 2010. Status versus growth: the distributional effects of accountability policies. Journal of Policy Analysis and Management 29 (3), 426–450.

Lauen, Douglas L., Gaddis, S. Michael, 2012. Shining a light or fumbling in the dark? The effects of NCLB’s subgroup-specific accountability pressure on student achievement. Educational Evaluation and Policy Analysis 34 (2), 185–208.

Lauen, Douglas L., Gaddis, S. Michael, 2013. Exposure to classroom poverty and test score achievement: contextual effects or selection? American Journal of Sociology 118 (4), 943–979.

Lee, Jaekyung, 2006. Tracking achievement gaps and assessing the impact of NCLB on the gaps: an in-depth look into national and state reading and math outcome trends. The Civil Rights Project of Harvard University, Cambridge, MA.

Lee, Jaekyung, 2008. Is test-driven external accountability effective? Synthesizing the evidence from cross-state causal-comparative and correlational studies. Review of Educational Research 78 (3), 608–644.


Lee, Jaekyung, Wong, Kenneth K., 2004. The impact of accountability on racial and socioeconomic equity: considering both school resources and achievement outcomes. American Educational Research Journal 41 (4), 797–832.

Lee, Valerie E., Burkam, David T., 2002. Inequality at the Starting Gate. Economic Policy Institute, Washington, DC.

Logan, John R., Darrah, Jennifer, Oh, Sookhee, 2012. The impact of race and ethnicity, immigration, and political context on participation in American electoral politics. Social Forces 90 (3), 993–1022.

Magnuson, Katherine, Meyers, Marcia K., Ruhm, Christopher J., Waldfogel, Jane, 2004. Inequality in preschool education and school readiness. American Educational Research Journal 41 (1), 115–157.

Meyer, Bruce, 1995. Natural and quasi-experiments in economics. Journal of Business and Economic Statistics 12, 151–162.

Monk, David H., 1981. Toward a multilevel perspective on the allocation of educational resources. Review of Educational Research 51 (2), 215–236.

Neal, Derek, Schanzenbach, Diane W., 2010. Left behind by design: proficiency counts and test-based accountability. The Review of Economics and Statistics 92 (2), 263–283.

Nichols, Sharon L., Berliner, David C., 2007. Collateral Damage: How High-Stakes Testing Corrupts America’s Schools. Harvard Education Press, Cambridge, MA.

No Child Left Behind Act of 2001. Pub. L. No. 107-110, 115 Stat. 1425, 20 U.S.C. 6301 et seq.

Pampel, Fred C., 2009. The persistence of educational disparities in smoking. Social Problems 56 (3), 526–542.

Pettit, Becky, Western, Bruce, 2004. Mass imprisonment and the life course: race and class inequality in U.S. incarceration. American Sociological Review 69 (2), 151–169.

Phillips, Julie A., 2002. White, black, and Latino homicide rates: why the difference? Social Problems 49 (3), 349–373.

Potter, Daniel, Roksa, Josipa, 2013. Accumulating advantages over time: family experiences and social class inequality in academic achievement. Social Science Research 42 (4), 1018–1032.

Rauscher, Emily, 2011. Producing adulthood: adolescent employment, fertility, and the life course. Social Science Research 40 (2), 552–571.

Ravitch, Diane, 2010. The Death and Life of the Great American School System. Basic Books, New York, NY.

Reardon, Sean F., 2011. The widening academic achievement gap between the rich and the poor: new evidence and possible explanations. In: Murnane, R., Duncan, G. (Eds.), Whither Opportunity? Rising Inequality and the Uncertain Life Chances of Low-income Children. Russell Sage Foundation Press, New York, NY.

Reardon, Sean F., 2013. The widening income achievement gap. Educational Leadership 70 (8), 10–16.

Reardon, Sean F., Bischoff, Kendra, 2011. Income inequality and income segregation. American Journal of Sociology 116 (4), 1092–1153.

Reback, Randall, 2008. Teaching to the rating: school accountability and the distribution of student achievement. Journal of Public Economics 92 (5–6), 1394–1415.

Reback, Randall, Rockoff, Jonah, Schwartz, Heather L., 2011. Under Pressure: Job Security, Resource Allocation, and Productivity in Schools under NCLB. Working Paper.

Roscigno, Vincent J., 1998. Race and the reproduction of educational disadvantage. Social Forces 76 (3), 1033–1061.

Roscigno, Vincent J., 2000. Family/school inequality and African–American/Hispanic achievement. Social Problems 47 (2), 266–290.

Rothstein, Richard, Jacobsen, Rebecca, Wilder, Tamara, 2008. Grading Education: Getting Accountability Right. Economic Policy Institute and Teachers College Press, Washington, DC and New York, NY.

Rumberger, Russell W., Palardy, Gregory J., 2005. Does segregation still matter? The impact of student composition on academic achievement in high school. Teachers College Record 107 (9), 1999–2045.

Sauder, Michael, Espeland, Wendy N., 2009. The discipline of rankings: tight coupling and organizational change. American Sociological Review 74 (1), 63–82.

Shafer, Emily Fitzgibbons, Malhotra, Neil, 2011. The effect of a child’s sex on support for traditional gender roles. Social Forces 90 (1), 209–222.

Spillane, James P., Parise, Leigh Mesler, Sherer, Jennifer Zoltners, 2011. Organizational routines as coupling mechanisms: policy, school administration, and the technical core. American Educational Research Journal 48 (3), 586–619.

Vanneman, Alan, Hamilton, Linda, Anderson, Janet Baldwin, Rahman, Taslima, 2009. Achievement Gaps: How Black and White Students in Public Schools Perform in Mathematics and Reading on the National Assessment of Educational Progress (NCES 2009-455). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education, Washington, DC.

Wakefield, Sara, Uggen, Christopher, 2010. Incarceration and stratification. Annual Review of Sociology 36, 387–406.

Williams, David R., Collins, Chiquita, 1995. US socioeconomic and racial differences in health: patterns and explanations. Annual Review of Sociology 21, 349–386.

Wong, Manyee, Cook, Thomas D., Steiner, Peter M., 2010. No Child Left Behind: An Interim Evaluation of its Effects on Learning Using Two Interrupted Time Series Each with its Own Non-Equivalent Comparison Series. Working Paper.

Yeung, Wei-Jun Jean, Conley, Dalton, 2008. Black–white achievement gap and family wealth. Child Development 79 (2), 303–324.

Yeung, Wei-Jun Jean, Pfeiffer, Kathryn M., 2009. The black–white test score gap and early home environment. Social Science Research 38 (2), 412–437.