Michigan policymakers may also want to consider “career ladders,” which provide financial incentives for high-performing teachers to continue to work with students in the classroom and help other teachers with instruction. Because teaching has few possibilities for career advancement, highly motivated teachers seeking more responsibility and a better salary may move into administration or leave the profession altogether. Schools do need high-quality personnel in administration, but when good teachers routinely leave the classroom in search of a greater challenge, they create vacancies that may be filled with lower-caliber personnel.
To address this problem, some policymakers have used career ladders, which can allow teachers to take on additional responsibilities, such as mentoring, for higher pay without having to abandon the classroom altogether. Although career ladders are theoretically a promising teacher quality reform, there is not a large body of research on how these programs affect student achievement.
One high-quality study by Thomas Dee of Swarthmore College and Benjamin Keys, a graduate student at the University of Michigan, does evaluate whether career ladders can raise student achievement.[175] Their analysis of the Tennessee Career Ladder Evaluation System was conducted long after the program ended, but they were able to exploit the randomized design of the Tennessee STAR class-size reduction experiment, which happened to operate at the same time and place as the career ladder system.
The Tennessee Career Ladder Evaluation System had five “rungs.” To advance up the ladder, teachers had to meet certain requirements, but in return, they were offered the chance to earn higher salaries. At the program’s inception, participation was voluntary for veteran teachers and required for new teachers, but after the first few years, participation became wholly voluntary. Nonetheless, reports showed that more than 90 percent of teachers chose to participate.
Graphic 12 below shows features of the career ladder program, which was in place for 13 years. As the figure shows, all new teachers had to start at Rung 1, but teachers who had already been teaching could be placed at an appropriate career level based on a performance evaluation. The dollar figures are from the 1980s and 1990s, so these rewards were worth more at that time.
Performance evaluations at Rungs 1 through 3 were conducted by local district personnel and were usually led by the building principal. For advancement to Rungs 4 and 5, teachers had to pass performance evaluations that were completed by independent evaluators from outside the teacher’s district. Dee and Keys report: “The evaluations that occurred at each stage of the career ladder assessed teachers on multiple ‘domains of competence’ using several distinct data sources (such as student and principal questionnaires, peer evaluations, a teacher’s portfolio, and a written test).”[176] Critics of the program asserted that promotion had become routine and not a reflection of merit, since 95 percent of participating teachers were successful at earning Level I (Rung 3) status. However, Dee and Keys point out that advancing to Levels II and III (Rungs 4 and 5) proved to be more difficult, as only 79 percent of teachers passed.
Because the career ladder program coincided with the class-size reduction project, Dee and Keys were able to take advantage of the random assignment of students and teachers to classrooms within each school. Just as with the analysis of the STAR project, this randomization created relative equality among the classrooms in a given building. Because the original randomization was somewhat diluted over time by a number of factors, Dee and Keys statistically controlled for any systematic observable differences that may have entered into the sample. Although their adjustments could address differences among students, Dee and Keys still faced a self-selection problem with teachers. In other words, if the career ladder program showed that teachers who participated were more effective at raising student achievement, the researchers could not determine whether that success was due to the program making them more effective or to the fact that those who chose to participate were simply different from, and perhaps more motivated than, those who did not. Dee and Keys note, however, that if participating teachers were shown to be more successful, it would not matter whether the program was the cause or simply an indicator. At the very least, the program itself would be a success because it would have identified, promoted and rewarded better teachers.
Dee and Keys found that participating teachers were more successful than nonparticipating teachers at raising student achievement. Specifically, students of participating teachers scored approximately 3 percentile points higher in math. These students also scored higher in reading, but the differences were not quite statistically significant. Dee and Keys placed their findings in context when they reported: “The estimated gains associated with assignment to a career-ladder teacher equal 40 to 60 percent of the gains associated with assignment to a class with roughly 15 students rather than 22.”[177]
Dee and Keys then disaggregated the results for participating teachers by career ladder level. They found that teachers on the lower career ladder levels were responsible for the gains in math and that teachers on the higher career ladder levels were responsible for the gains in reading. Thus, although participating teachers as a group appeared more effective than nonparticipating teachers, the gains were not uniform across career ladder levels.
Dee and Keys’ analysis is one of the few that have measured a career ladder’s effect on student achievement, but it is not the only study of career ladder programs. In 1994, Carolyn Horan and Vicki Lambert released an evaluation of the Utah Career Ladder Program, which had been adopted by the Utah Legislature a decade earlier.[178] The enabling legislation for the program allowed school districts to determine which components they would include in their local career ladder program. Some of the possible components were extra compensation for time spent on curriculum development, “inservice training, preparation, and related activities,” and “additional pay for additional performance.”[179]
Horan and Lambert surveyed principals and teachers to learn about their perceptions of the program and its individual components. The results were mixed. For example, while participants reported that they believed the program was having a positive impact on student achievement, they also felt that the performance bonuses were not administered fairly.
Susan Moore Johnson and colleagues at Harvard University summarized a number of qualitative studies of career ladders and likewise reported that such programs have had mixed results.[180] Taken together, these findings suggest that reformers looking to institute a career ladder program need to be sensitive to teachers’ needs and preferences, since teachers’ buy-in is essential to any reform’s success. Policymakers interested in this reform should explore the Teacher Advancement Program (TAP) models, which include a career ladder component and which currently operate in schools in more than a dozen states.[181]
In a comparison of 1,200 TAP and non-TAP schools from two states, Matthew Springer, Dale Ballou and Art Peng of Vanderbilt University found mixed results concerning the impact of TAP on student test scores.[182] Springer et al. found that TAP students in elementary grades two through five demonstrated significantly higher gains in math over the course of a given school year. However, the researchers also found that TAP had a negative effect in grades six, seven, nine and 10. Although Springer et al. posited two hypotheses for the apparently disparate impacts of TAP on student achievement in different grades, they were not convinced by these explanations. The study also has limitations: In addition to a small sample of TAP schools, it suffers from incomplete data on TAP implementation. Still, because other studies of TAP programs were conducted by researchers affiliated with the programs, this study improves on previous research. Moreover, the authors used a more sophisticated statistical procedure to control for the “self-selection” possibility that schools that participated in TAP volunteered to do so because they were already predisposed to pursue higher student achievement gains.[183]