Specifying the Control Variables

Putting all of these variables together, the model of Michigan district spending looks like this:

Current Per-Pupil Spending =
      b₀ + â * {f₁(Size), f₂(FedRevenuesPP),
      f₃(StateLunchRevPP), f₄(PctSpecialEd), f₅(PctWhite),
      f₆(IncomePerCapita), f₇(EnrollmentByPopulation),
      f₈(IncomePerPupil), f₉(MedianHouseAskingPrice),
      f₁₀(MedianRentAsked), [Years]} + error

In the equation above, b₀ is the y-intercept, â is an array of constants for the explanatory variables, [Years] is an array of dummy variables (see the discussion of period effects, above), and error represents any unexplained variation in district spending caused by factors that we have not thought to, or been able to, measure.

Note that instead of simply including the variables of interest as linear terms in the equation, this study included functions (f₁ through f₁₀) of those variables. These variables have been identified as plausible predictors of district spending per pupil, but it has not been established that their relationship to spending is linear, and Ordinary Least Squares regression assumes a linear relationship between the predictors (independent variables) and the dependent variable. So, using a combination of regression diagnostics, such as the Ramsey RESET test[16] and scatter diagrams of the dependent variable against the predictors, the author has identified the functions of these variables that most effectively and linearly predict district spending, while also conforming to a sound theoretical rationale for their inclusion in the model. Those functions are described in the paragraphs that follow.
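To illustrate the kind of diagnostic mentioned above, here is a minimal sketch of a RESET-style check, assuming NumPy is available. It is not the study's actual procedure or data: the function names, the synthetic data and every number in it are hypothetical. The idea is to fit the model, add powers of the fitted values as extra regressors, and compute an F statistic for whether those powers explain additional variation. A large statistic suggests the functional form is misspecified.

```python
import numpy as np


def ols_fit(X, y):
    """Return fitted values and the residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    return fitted, float(resid @ resid)


def reset_f_stat(X, y, max_power=3):
    """RESET-style F statistic. X must contain an intercept column."""
    n, k = X.shape
    fitted, rss_restricted = ols_fit(X, y)
    # Standardize the fitted values so their powers stay well scaled.
    # Because X has an intercept, these powers span the same space as
    # raw powers of the fitted values, so the F statistic is unchanged.
    z = (fitted - fitted.mean()) / fitted.std()
    extra = np.column_stack([z ** p for p in range(2, max_power + 1)])
    _, rss_unrestricted = ols_fit(np.column_stack([X, extra]), y)
    q = extra.shape[1]          # number of added regressors
    df_resid = n - k - q        # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_resid)


# Hypothetical synthetic data: the outcome depends on the LOG of a predictor.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 30.0, size=500)
y = 8000.0 + 1500.0 * np.log(x) + rng.normal(0.0, 200.0, size=500)

ones = np.ones_like(x)
f_linear = reset_f_stat(np.column_stack([ones, x]), y)        # wrong form
f_log = reset_f_stat(np.column_stack([ones, np.log(x)]), y)   # right form
print(f"RESET F, linear term: {f_linear:.1f}")
print(f"RESET F, log term:    {f_log:.1f}")
```

In this synthetic setup the linear specification should produce a far larger statistic than the logarithmic one, which is the pattern that would prompt an analyst to replace a linear term with a transformed one.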

Aggregate household income per capita is the only purely linear term, and it is positive as expected.

The percentage of special education students is a positive logarithmic term. Increases in the percentage of disabled children are associated with increased spending, but as a district’s percentage of special education students continues to rise, the marginal effect on spending of such additional increases gradually diminishes.

The percentage of white students in a district is a negative logarithmic term, meaning that districts with a higher share of white students spend less, other things being equal, but that the marginal reduction in spending diminishes as the share of white students becomes large. This may be because overwhelmingly white districts are predominantly located in smaller Michigan towns with lower living costs, and hence lower labor costs, that are not fully captured by the house and rental asking price controls.
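The diminishing marginal effect of a logarithmic term can be seen with a little arithmetic. The sketch below assumes a term of the form coefficient × ln(percentage); the coefficient value is hypothetical and is not an estimate from this study.

```python
import math


def log_term_change(coef, pct_from, pct_to):
    """Change in predicted spending when a percentage moves from pct_from
    to pct_to under a term of the form coef * ln(percentage)."""
    return coef * (math.log(pct_to) - math.log(pct_from))


COEF = 1000.0  # hypothetical coefficient, in dollars per unit of ln(percent)

# The same one-point increase, starting from a low base and a higher base.
low_base = log_term_change(COEF, 5.0, 6.0)
high_base = log_term_change(COEF, 15.0, 16.0)
print(f"5% -> 6%:   {low_base:.2f}")
print(f"15% -> 16%: {high_base:.2f}")
```

With this made-up coefficient, the move from 5 to 6 percent changes predicted spending by roughly 182 dollars, while the move from 15 to 16 percent changes it by roughly 65 dollars. A negative coefficient reverses the sign of both changes but shrinks them in the same way.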

The federal-revenues-per-pupil term was found to have both a logarithmic and a (small) quadratic component, both of which are positive. This implies a monotonically increasing function (as expected) that first rises rapidly, flattens somewhat, and then begins rising more rapidly once again.
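The shape described above can be checked with a short derivation. It assumes the term takes the form g(x) = b₁ ln x + b₂ x², with b₁ > 0, b₂ > 0 and x > 0. The symbols g, b₁ and b₂ are introduced here for illustration only, and the study may have specified the quadratic component differently.

```latex
\begin{align*}
g(x)   &= b_1 \ln x + b_2 x^2, \qquad b_1 > 0,\; b_2 > 0,\; x > 0 \\
g'(x)  &= \frac{b_1}{x} + 2 b_2 x \;>\; 0
          \quad\text{(always increasing)} \\
g''(x) &= -\frac{b_1}{x^2} + 2 b_2 \;=\; 0
          \quad\Longrightarrow\quad x^{*} = \sqrt{\frac{b_1}{2 b_2}}
\end{align*}
```

Under this assumption the slope is always positive and is smallest at x*. Below x* the logarithmic component dominates and the curve flattens as x rises. Above x* the squared component dominates and the curve steepens again.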

The median house asking price is a quadratic term with a negative linear component and a positive squared component. This U-shaped curve suggests a more nuanced relationship than expected, but one that is theoretically reasonable. It is consistent with the expectation that districts with high housing prices pay their teachers more and hence have higher costs and higher spending, other things being equal. But it also suggests that districts with very low housing prices have above-average costs as well. The latter could be due to the greater difficulty of attracting teachers to work in economically depressed areas, with higher salaries being necessary to entice them to do so.

The state lunch revenue control is a quadratic term with a positive linear component and a negative squared component. Hence, up to a certain threshold, higher state school lunch revenue per pupil is associated with increased district spending but, beyond that threshold, the relationship is reversed. A plausible explanation for this pattern is that the costs associated with feeding and educating students who qualify for the state lunch aid grow at a faster rate than does the state aid itself.
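For the two quadratic controls described above, the point at which the relationship changes direction follows from the coefficients: for a term b_linear·x + b_squared·x², the turning point is x* = −b_linear / (2·b_squared). The sketch below computes it. The function name and all coefficient values are hypothetical and are not estimates from this study.

```python
def quadratic_turning_point(b_linear, b_squared):
    """Return (x_star, shape) for a term b_linear * x + b_squared * x**2."""
    if b_squared == 0:
        raise ValueError("no squared component, so there is no turning point")
    x_star = -b_linear / (2.0 * b_squared)
    shape = "U-shaped" if b_squared > 0 else "inverted-U"
    return x_star, shape


# Hypothetical coefficients with the signs described in the text.
house_price = quadratic_turning_point(-0.04, 2e-7)   # negative linear, positive squared
lunch_revenue = quadratic_turning_point(30.0, -0.5)  # positive linear, negative squared

print(house_price)
print(lunch_revenue)
```

With these made-up values the house-price term is U-shaped with its minimum near an asking price of 100,000, and the lunch-revenue term is an inverted U that peaks at 30 dollars per pupil. Whether a turning point matters in practice depends on whether it falls inside the range of values actually observed in the data.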

Three of the control variables in the model have only squared terms: public school enrollment as a share of total district population, which has an unexpectedly negative coefficient, and aggregate household income per pupil and median residential rental asking price, which are positive as expected. The fact that public school enrollment as a share of population is negative means that districts whose populations presumably have more to gain from higher public school spending actually spend less, other things being equal. This is perhaps because a higher share of the population in public schools means more students across whom taxpayers’ dollars must be spread, so holding average income constant, it would be harder for officials to raise per-pupil spending in these districts.
