When teacher quality is measured by the degree to which teachers affect student learning, as indicated by student test score gains and principal evaluations, other reforms become possible, such as compensation reform. Compensation plans that pay teachers differently for different levels of performance are commonly called "merit pay," "pay-for-performance," or "incentive-based pay" plans. There is no single merit-pay program; rather, the merit-pay programs that have been adopted across the country have various features. For example, some plans reward teachers only for increasing student test scores, while others include supervisor or peer evaluations as measures of teacher merit. Still others include the amount of professional development that teachers undergo as a sign of merit and a basis for pay. Researchers for Vanderbilt University’s National Center on Performance Incentives have reviewed various merit-pay plans and provided useful descriptions.
As we indicated in Part III, nearly all schools nationwide and in Michigan pay teachers according to the single-salary schedule. Unfortunately, the characteristics on which this pay schedule is based do not correlate with greater student achievement gains. Under the current pay system, teachers are encouraged to help their students learn primarily out of concern for the students and the intrinsic rewards of doing a good job. True, these are significant motives. Teachers tend to enter the profession because of their love of children and their desire to serve others. Still, it is only reasonable to recognize that they, like other people, are motivated to work partly by financial rewards and the recognition those rewards signify.
Without the possibility of earning more money for high-quality performance, teachers may be indirectly encouraged to meet only minimum performance levels, such as maintaining order in the classroom or keeping peace with parents. This outcome is even more likely when teachers are observed only a few times per year by supervisors, and their individual performance is not objectively measured by student test score performance gains. Single-salary-schedule compensation policies have ensured that teachers are paid the same amount whether their students improve or not, and across-the-board pay increases are often guaranteed simply for showing up each year. Unfortunately, as demonstrated in Part I, student achievement in Michigan is not high, and it has not improved compared to the national average despite high and rising state spending. In this context, alternative pay structures make sense. They reward the key people — effective teachers — who can improve public education in the state.
Some U.S. school districts have adopted pay-for-performance for teachers, yet only a few districts in Michigan have even begun to explore the possibility. The current discussion of merit pay is confused by historical debates over pay-for-performance programs. Teacher merit pay has been tried at various times over the past two centuries, and many of these experiments have ended in failure. As Allan Odden and colleagues from the University of Wisconsin found, the chief reason for these failures has been a loss of support from teachers and the public due to difficulties in understanding how payouts were calculated, to perceptions that bonuses were skewed by principal favoritism, and to a fear that teachers would be discouraged from collaborating. Still, over the last decade, policymakers in Denver, Florida, Minnesota, Little Rock and elsewhere have attempted to learn from merit pay’s past failures and designed plans that appear to be leading to student success.[*]
In order to evaluate incentive-based programs fairly, merit pay must be distinguished from other reforms, such as differential pay, career ladders and inauthentic performance-pay plans like "knowledge-and-skills" based pay, which provides extra compensation to teachers for participating in extra education and training. As University of Wisconsin researchers Herbert Heneman, Anthony Milanowski and Steven Kimball have written: "A teacher’s knowledge and skills are the basic inputs that a teacher brings to the instructional process. These skills include knowledge of content and pedagogy, skill in assessment and classroom management, and general abilities, attitudes, and personality dispositions." Knowledge-and-skills pay plans are often classified as performance pay because they provide financial incentives to teachers not only to improve student achievement, but also to participate in extensive professional development. By contrast, authentic merit pay primarily rewards student outcomes, not teacher inputs. To the extent that such plans focus on professional development, rather than student achievement gains, it is confusing and misleading to classify them as performance pay.
This distinction is important. In public discussion, knowledge-and-skills-based pay, which many teachers unions now claim to support, can easily be mistaken for genuine merit pay. For instance, in September 2007, the U.S. House Education and Labor Committee held hearings on the reauthorization of the No Child Left Behind Act. Reform proposals introduced by committee Chairman George Miller, a Democrat, had included merit-pay legislation that would have based teacher rewards for performance on student achievement test score gains. Education Week’s David Hoff reported that Rep. Miller was confused when, at the end of the hearings, "NEA President Reg Weaver and AFT Executive Vice President Antonia Cortese objected to proposed alternative pay programs for teachers, which are included in the section addressing teacher quality." Hoff noted that Rep. Miller reminded the union leaders that the language in the bill had been drafted based on prior conversations with the unions. Yet faced with a true merit-pay program, the AFT’s Cortese spoke up and said, "We do have specific concerns about a provision that would use test scores to evaluate teachers."
This exchange in Congress clarifies the characteristics of authentic merit pay. Reconfiguring the teacher salary system to allow principals to award higher salaries or higher raises based on student achievement gains would be a meaningful reform.[†]
Teachers may be uneasy about proposals to alter their base pay, however. Reform-oriented school boards may thus want to propose that teachers get the same base pay, but receive bonuses for their effectiveness in improving student test scores. Most of this section focuses on the use of performance bonuses.
Although recent research (discussed below) demonstrates that merit pay based solely on student achievement gains can improve student test scores, policymakers may want to include the use of supervisor evaluations in merit-pay proposals. Teachers concerned about the use of statistical models to determine their bonuses may be better disposed to a formula that includes a principal’s evaluation. A similarly helpful proposal might base merit pay partly on group performance — in other words, rewarding teachers based on how a team of teachers with whom they collaborate succeeds in improving student performance. This approach may be particularly appropriate with elementary and middle school teachers, since they often plan as a team.
How might these recommendations work in practice? One suggestion would be to divide a $10,000 annual maximum merit-pay bonus along these lines: 50 percent would correspond to an individual teacher’s students’ average achievement gains as determined through a value-added assessment; 30 percent would correspond to the average gains of the teacher’s team; and the final 20 percent would correspond to a supervisor evaluation.[‡] Individual school districts and schools considering merit pay should make their own determinations about the percentages of bonus pay that might be used with these three categories.
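The arithmetic of such a split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text specifies the $10,000 maximum and the 50/30/20 weights, but it does not say how each component would be scored. The sketch assumes, hypothetically, that a district rates each component on a scale from 0 to 1; the names `merit_bonus`, `WEIGHTS` and `MAX_BONUS` are invented for the example.

```python
# Illustrative sketch only. The 0-to-1 component scores are an ASSUMPTION;
# the text gives the weights and the maximum, not a scoring method.

MAX_BONUS = 10_000  # annual maximum bonus from the example in the text
WEIGHTS = {
    "individual": 0.50,  # teacher's own students' value-added gains
    "team": 0.30,        # average gains of the teacher's team
    "supervisor": 0.20,  # supervisor evaluation
}


def merit_bonus(scores, weights=WEIGHTS, max_bonus=MAX_BONUS):
    """Return the bonus in dollars for one teacher.

    `scores` maps each component name to a fraction between 0 and 1
    (hypothetical scale: 1.0 means the full share for that component).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    total = 0.0
    for name, weight in weights.items():
        score = scores[name]
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be between 0 and 1")
        # Each component contributes its share of the maximum bonus.
        total += weight * score * max_bonus
    return round(total, 2)


if __name__ == "__main__":
    example = {"individual": 0.8, "team": 0.5, "supervisor": 1.0}
    print(merit_bonus(example))
```

A district that preferred different percentages would change only the `WEIGHTS` table, which is the decision the text leaves to local schools and districts.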
The exercise should prove worthwhile because merit pay is a viable reform that can indeed lead to greater student achievement. Writing for the National Bureau of Economic Research, David Figlio and Lawrence Kenny used a national sample of longitudinal student data to estimate the impacts of a variety of performance-pay plans related to student achievement. For example, the most restrictive (and hypothetically most motivating) plans were those that "had at least one of the following indicators of high salary incentives: a) at least a 20 percent salary range, b) merit raises that are given to no more than 5 percent of the teachers, or c) merit bonuses that are received by no more than 7 percent of the teaching staff." Although not all of the student achievement gains associated with performance-pay plans in the study were particularly large (between 1.3 and 3.2 points), Figlio and Kenny found, "[T]he use of teacher salary incentives is associated with higher levels of student performance, all else equal."
Figlio and Kenny carefully noted that not all types of programs have this association, as merit-pay programs "that award bonuses to very large fractions of teachers are apparently not associated with student outcomes." These findings provide strong correlational evidence that teachers tend to act in ways that raise student achievement in schools where meaningful performance incentives exist. This study does have a notable limitation, however: Figlio and Kenny were able to establish only a correlation between merit pay and student outcomes, not causation.
Many research organizations are now releasing reports with recommendations regarding which features of merit-pay plans will most likely lead to success. Many of these suggestions address not only the key elements of successful reform programs, but also the need to involve teachers in the planning process and to avoid supplanting useful collaboration with counterproductive competition. Reports on merit pay often summarize existing research and make recommendations based on program evaluations of specific merit-pay experiments.
Two such program evaluations are the first- and second-year reports by University of Arkansas researchers of the Achievement Challenge Pilot Project, a merit-pay program in Little Rock. The ACPP bases awards solely on student achievement gains that occur during a single school year. The program has operated for three school years, beginning with one school in 2004-2005, adding a second in 2005-2006 and adding three more in 2006-2007. Thus, by the end of the program, five schools were part of the ACPP.
In January 2007, Marcus Winters, Gary Ritter, Joshua Barnett, Jay Greene and others at the University of Arkansas Department of Education Reform released their first report on the impacts of this program. In this report, they examined the effects of the program in the first two schools and concluded that the students improved by 7 percentile points on average on standardized test scores.
Graphic 11 below shows how bonuses were awarded in the Little Rock experiment for four of the schools in school year 2006-2007.[**] Teachers earned rewards based on the magnitude of their students’ gains, and other school personnel (including principals, aides and even custodians) earned awards based on schoolwide gains. For individual teachers whose students’ scores can be directly linked to them (such as a fourth-grade teacher in a self-contained class), the percent growth was calculated for each student by subtracting the prior year score from the current year score and then dividing the difference by the prior year score. These calculations were completed using normal curve equivalent scores.[††] In determining the payout for these teachers, the average percentage of growth for all students in the class was calculated by adding the individual percent growth scores and dividing that sum by the number of students. A teacher was then awarded the per-child dollar figure for that percentage range multiplied by the number of students in his or her class. For employees like principals, physical education teachers and music teachers, whose students’ scores could not be directly linked to them, the payout was a lump sum based on the average schoolwide percentage growth. Individual awards exceeded $8,000, and personnel in both of the schools included in the first-year report earned total bonuses of more than $200,000.
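The classroom-teacher calculation described above can be sketched as follows. The steps (percent growth per student, class average, per-child award multiplied by class size) follow the description in the text. The dollar schedule in `HYPOTHETICAL_TIERS` does not: it is a placeholder invented for this illustration and is not taken from Graphic 11. The function names are likewise assumptions.

```python
from statistics import mean

# PLACEHOLDER schedule, not the Little Rock figures:
# (minimum average percent growth, dollars per child).
HYPOTHETICAL_TIERS = [(0.05, 100), (0.10, 200), (0.15, 300)]


def percent_growth(prior, current):
    """Growth for one student: (current - prior) / prior, on NCE scores."""
    if prior <= 0:
        raise ValueError("prior-year score must be positive")
    return (current - prior) / prior


def class_payout(score_pairs, tiers=HYPOTHETICAL_TIERS):
    """Return (average growth, payout in dollars) for one classroom.

    `score_pairs` is a list of (prior_year_nce, current_year_nce) tuples,
    one per student linked to the teacher.
    """
    if not score_pairs:
        raise ValueError("at least one student is required")
    average = mean(percent_growth(p, c) for p, c in score_pairs)
    per_child = 0  # below the lowest tier, no award in this sketch
    for floor, dollars in sorted(tiers, reverse=True):
        if average >= floor:
            per_child = dollars
            break
    # The award is the per-child figure times the number of students.
    return average, per_child * len(score_pairs)


if __name__ == "__main__":
    pairs = [(50, 56), (40, 46), (60, 66)]  # made-up NCE scores
    avg, dollars = class_payout(pairs)
    print(f"average growth {avg:.1%}, payout ${dollars}")
```

Because every step is simple arithmetic on two scores per student, a teacher could check the result by hand, which reflects the accessibility the Little Rock model favors over regression-based payouts.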
This first research report on the Little Rock program also included survey findings comparing the attitudes of participating teachers with those of teachers in control schools. Participating teachers reported no increase in counterproductive competition among teachers. In fact, these teachers rated the atmosphere of their schools more positively than those in control schools. Moreover, teachers in participating schools reported being less likely to find low-performing students burdensome.
Although the first-year report suggested that a merit-pay program could improve student performance, the analysis did have several limitations, including small sample sizes. The second-year report on the Little Rock program, however, expanded the sample size, provided a stronger control group for comparing teacher attitudes and substantiated the claims of the first-year report. In this report, conducted by a University of Arkansas research team that included Gary Ritter, Nate Jensen, Brent Riffel, Marcus Winters, Joshua Barnett, Jay Greene and Marc Holley (the author), teachers in participating schools were included and compared to teachers in nonparticipating schools across the district. Using appropriate statistical controls in their value-added model, they found that teachers in the merit-pay program were significantly more effective. Survey data from the expanded sample in the second-year report did not show the program having as positive an effect on teacher attitudes as in the first-year report, but the presence of merit pay did not appear to damage the school climate or lead to counterproductive competition.
Concerning merit-pay plans that target individual teachers, Charles Clotfelter and Helen Ladd wrote in 1996: "The limitations of such programs are well known: the lack of consensus about what makes for effective teaching; the fact that gains in student achievement often reflect not just the actions of an individual teacher but also the more general environment for learning in the school; and the growing recognition that rewarding individual teachers encourages them to compete with one another rather than to work cooperatively."
Little Rock ACPP teacher survey data suggest that merit pay for individual teachers does not necessarily degrade the school climate as Clotfelter and Ladd suggest. Moreover, as merit-pay programs have evolved, program developers have more often solicited teacher input and achieved greater "buy-in" from teachers at the outset. Also, the findings of Carolyn Horan and Vicki Lambert, researchers for the Beryl Buck Institute for Education, concerning a related system, the Utah Career Ladder Program, suggested that while some teachers saw increased competition among their peers, not all teachers viewed this change as negative. In addition, merit-pay program developers have learned from past failures and created award systems in which all participants can earn bonuses, rather than just the top few. Perhaps as a result, recent programs have tended to lead to less divisiveness and competition.
On reflection, this result need not be surprising. While competitiveness under the wrong circumstances could damage team spirit, merit pay does provide incentives that motivate teachers and other building personnel to focus their time and effort on promoting student learning, thus emphasizing the goal that entices many educators to enter the profession in the first place. Moreover, because authentic merit-pay programs reward only teachers who actually produce results, such programs discourage the retention of teachers who are simply in the classroom to draw a paycheck, who cannot communicate effectively or who do not have the problem-solving ability to address students’ learning challenges. A system that pressures such teachers to improve or to leave may also help morale, since their presence can depress the spirits of dedicated personnel.
Further, merit pay has the potential to attract a different type of professional to the teaching work force. The current pay structure can have the unintended consequence of attracting risk-averse, lower-performing candidates to the profession. Merit pay, in contrast, promises higher compensation to teachers who may be more tolerant of risk — that is, to those who are willing to make their pay depend in part on their ability to help students learn. It may well benefit morale for teachers to see themselves as part of a more enterprising team.
Some may question whether a compositional change in the teaching work force resulting from the attraction of more risk-tolerant people would be a good thing. Yet risk-tolerance regarding a performance-based salary system is different from the risk-tolerance involved in hazardous activities such as skydiving. Having high-performing, enterprising teachers who are adept at problem-solving and willing to work harder to promote student achievement may be exactly the outcome that policymakers should strive for.
Another potential concern is whether an excellent teacher with students in the top quartile can be compared directly with an excellent teacher with students in the bottom quartile. Might their average gains differ due to the students’ dissimilar skill levels and potential for improvement? It is true that there are challenges in identifying exactly what a one-point gain in student performance means at different parts of the performance scale. Nonetheless, there are ways to address this concern. For example, one improvement upon simply using percentile scores to make comparisons of student achievement gains at different performance levels is to use normal curve equivalents. Moreover, when scores are converted to the appropriate standardized scales, comparisons even across grade levels are possible. These conversions are possible using the state achievement tests currently in place in Michigan.
A related challenge involves whether it may be especially difficult to raise the achievement of the highest achievers. It is true, perhaps, that a "ceiling" effect may occur, so that students at the 90th percentile do not have much room to grow. Many high-achieving students may have already maximized their potential, and teachers may find it extremely difficult to raise those students’ average performance dramatically. Teachers, however, should be involved in designing performance-pay plans, and they may decide to approach this concern in a number of creative ways. For example, teachers in a given school may decide that teachers of certain classes should have their potential bonus tied more heavily to supervisor evaluations. Alternatively, as we note below, teachers may decide that instructors of advanced students should be rewarded in part simply for maintaining a high level of performance.
Clotfelter and Ladd suggest that ceiling effects (the fact that certain high-performing students have little room to improve) and scaling effects (differential rates of progress depending on whether a student’s past performance has been strong or weak) can bias value-added models in favor of low-performing or high-performing students. One way to address these issues is to reward teachers based both on growth and attainment. Practically speaking, this means that program designers may choose to reward teachers of students who have routinely scored above the 80th percentile merely for sustaining or slightly improving the original score.
As with many of the recommendations offered in this primer, adoption and implementation of a compensation system with a merit-pay component could occur either at the local district level or at the state level. Local schools and districts will want to include teachers in the design of a particular plan, however, so local districts, rather than the halls of Lansing, are probably better places to settle the details of how teachers would earn their payouts.[‡‡] There are myriad details involved, after all, and teachers should be involved in making these decisions, since the outcome can affect teachers’ willingness to support a merit-pay system.
Fortunately, merit-pay plans have begun to emerge in several locations across the country, so Michigan districts would have a choice of models to adapt to their specific setting. In many of the plans, only a small portion of the merit-pay bonus is based on individual classroom performance (as opposed to professional development or schoolwide achievement), but district leaders drawing on these models can shift the emphasis easily enough. Prominent merit-pay systems include those in Little Rock (see Graphic 11 above), Houston, Denver and New York City.[***] The latter two programs allow individual schools flexibility in the design of the merit-award systems, creating a multiplicity of programs about which it is difficult to generalize.
Part of the reason for the proliferation of teacher merit-pay plans is that the federal government made approximately $100 million available for its "Teacher Incentive Fund" in 2006. The TIF program was designed to promote teacher compensation systems that would use student performance as a part of the basis for teacher pay. This competitive federal grant program has supported a total of 34 performance-pay programs located in over 18 different states and Washington, D.C. Not all recipients of TIF funding were traditional public schools; the New Leaders Inc. charter school network received a grant of more than $20 million spread over five years for a performance-pay program. No Michigan schools participate in the TIF program, due partly to the fact that of the 143 applications submitted nationwide, only four came from Michigan, a state with 552 conventional school districts and more than 200 charter schools. Yet given the findings presented above, merit pay is an essential teacher quality reform that policymakers in Michigan should consider.
[*] For summaries of these plans or synopses and links, see Podgursky and Springer, “Teacher Performance Pay: A Review,” or “Reforming Teacher Pay” (Policy Innovation in Education Network, 2007), www.edpolicyinnovation.net/pie/template/topic.cfm?topic=24, (accessed May 18, 2008). The plans in these various locations may have some features of “knowledge and skills based pay,” but they also have a component that is based on student achievement gains.
[†] The next section, “Differential Pay,” describes how principals could also be given discretion to pay more for teachers in subject areas that are difficult to staff.
[‡] For teachers who do not have students whose test scores can be attributed primarily to them, such as resource teachers or music teachers, bonuses would be awarded for schoolwide gains.
[**] The table represents payouts in the third year of the experiment; the payouts in the second year, at the time of the first-year report by Winters et al., were not substantially different.
[††] In the earlier section “Using Value-Added Assessment to Define Teacher Quality,” there was a reference to statistical methods that could filter out nonteaching factors that might lower (or raise) student test scores. These methods involve regression analysis, and they provide a sophisticated means of determining the impact a teacher has had on student achievement.
Readers familiar with regression analysis will recognize that this statistical method was not employed to determine the merit-pay bonuses in the Little Rock experiment (though regression analysis was indeed used by the University of Arkansas researchers to establish that the program had a statistically significant effect on test scores). While the Little Rock model forgoes some of the virtues of statistical regression, it avoids some of its drawbacks, particularly the problem of making the method of determining the payouts clear and accessible to everyone affected by the plan, including parents and the public. Nevertheless, teachers could certainly request the use of statistical regression to determine payouts if they became concerned that a particular payout program would otherwise fail to account for, say, their students’ socioeconomic disadvantages. In fact, the regression model could be developed in consultation with them and their representatives, so that they could assess the potential drawbacks of the model before adopting it.
[‡‡] Many performance-pay programs require a supermajority vote of the school staff for the school to participate. For example, in Chicago’s Recognizing Excellence in Academic Leadership program, which uses the Teacher Advancement Program model, participating schools were required to obtain a 75 percent majority vote before adopting merit pay. Including a teacher vote as a prerequisite for program implementation may contribute to the likelihood of a program’s adoption and ultimate success. See “Memorandum of Understanding Between the Chicago Board of Education and the Chicago Teachers Union, Local No. 1, AFT, AFL-CIO,” www.ctunet.com/quest_center/documents/REALAgreement1.3.08.doc (accessed June 25, 2008).
[***] For information about the Houston Independent School District merit-pay plan, see “Houston Independent School District Project SMART” (Center for Educator Compensation Reform, 2008), www.cecr.ed.gov/initiatives/profiles/projectSMART.cfm (accessed May 18, 2008). Regarding the Denver Public Schools Professional Compensation System for Teachers, see “Procomp” (Denver Public Schools, 2008), http://denverprocomp.org/ (accessed May 21, 2008).