When conducting a pooled regression on data for multiple years, it is important to control for the possibility that some unknown factor may have increased district costs at a certain point in time. If, for example, the state had issued new curriculum guidelines in year three of the five-year period under investigation, all districts would likely have experienced higher textbook costs in year four. Gradual changes over time, such as aging of the population and monetary inflation, could also skew the results.

A shift of this kind, common to all districts in a given year, is known as a period effect, and it could skew estimates of the relationship between district size and spending. To control for period effects, the model includes a separate dummy variable (a variable whose value is either 0 or 1) for each of the last four[*] of the five years for which we have data. Doing so allows us to capture the possible impact of any such period effects without having to identify their causes.

[*] Only four dummy variables are needed to control for five time periods because the remaining period, year one, is captured by the four dummies all taking on the value 0. In other words, if an observation does not fall in years two through five, it must, by process of elimination, fall in year one.
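
The coding scheme described above can be illustrated with a minimal sketch. It assumes the years are simply labeled 1 through 5 and that year one is the omitted baseline; the names YEARS, BASELINE_YEAR, and year_dummies are hypothetical and are not taken from the model itself.

```python
# Hypothetical labels for the five-year period; year 1 is assumed to be
# the omitted baseline, so it gets no dummy of its own.
YEARS = [1, 2, 3, 4, 5]
BASELINE_YEAR = 1


def year_dummies(year):
    """Return the four 0/1 dummies (for years 2 through 5) for one observation."""
    if year not in YEARS:
        raise ValueError(f"year must be one of {YEARS}, got {year}")
    return [1 if year == y else 0 for y in YEARS if y != BASELINE_YEAR]


# Build the coding pattern: one list of four dummies per year.
coding = {year: year_dummies(year) for year in YEARS}

for year, dummies in coding.items():
    print(year, dummies)
```

An observation from year one has all four dummies equal to 0, while an observation from any later year has exactly one dummy equal to 1. When the dummies enter a regression, each coefficient measures how far spending in that year sits above or below the year-one baseline, holding the other variables constant.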