On the way to school

Today, secondary schooling is a natural pathway for generations of young people. It was not always so. Historically, education was an elite pursuit. And even as public education systems with compulsory primary schooling began to take root (for much of the developed world, in the nineteenth century), high school remained the domain of a relative few who would study at university.

In a world where access to education is restricted rather than open, where schooling is an option (perhaps only for some) rather than a necessity, it is relevant to consider the factors driving the pursuit of education. What determined who went to high school?

One candidate is location. In a recent working paper, Insa-Sánchez explores how the geographic distribution of people and high schools across nineteenth-century Spain influenced educational attainment.

As the map below illustrates, there were 61 high schools dotted around Spain in 1877-78. These schools were mostly located in the capital cities of each province. But the bulk of the Spanish population during the nineteenth century did not live in cities. Even in 1900, less than 20 per cent of Spaniards lived in provincial capitals. Those living outside the urban centres would thus typically have to relocate in order to attend high school: far from a costless endeavour. As Insa-Sánchez notes, it was not uncommon for young people in rural areas to work before later attending high school. Not only did people save up money to move; they might also interrupt their high school education to work so that they could cover their living expenses.

The 61 high schools of Spain, 1877-78. Source: Figure A6, Insa-Sánchez (2021).

Against this backdrop, those already living in cities with high schools likely had an advantage. Without having to uproot their lives, they had a better chance of completing their schooling without disruption. This insight provides the basis for Insa-Sánchez’s analysis. He uses differences in the age of graduates to measure barriers to accessing secondary education. The later in life one graduates, the greater the hurdles one has faced in getting both to and through high school.

Big cities, bright minds

The data available for high school education in nineteenth-century Spain are relatively limited. The key exception is the 1877-78 school year, for which Insa-Sánchez draws on a complete set of graduation records, covering 2,908 students across Spain. While the focus on a single year is a potential drawback — one cannot know if the given year was for some reason atypical — the analysis offers at least an indication of the state of Spanish education at a point in time.

The data show that around 30 per cent of students graduated between the ages of 10 and 15, and almost 60 per cent graduated between the ages of 16 and 20 — the median age of graduates is 17. The remainder graduated in their twenties — or even older.

The geographic distribution of students in terms of their home towns is given by the map below. Around 30 per cent of students came from municipalities with populations over 50,000; almost 40 per cent from municipalities with populations between 5,000 and 50,000. All but two students were men. (Spanish women were only allowed to receive secondary education from 1871, and overall numbers were still low by 1877.)

Distribution of graduates across Spain, 1877-78

The top five Spanish municipalities of origin for high school graduates are:

  • Green: Madrid, Barcelona
  • Yellow: Sevilla, Valencia, Zaragoza

Source: Figure 4.1, Insa-Sánchez (2021)

The question is what relationship, if any, exists between age of graduation and municipality of origin. Insa-Sánchez begins by reporting OLS estimates, where the outcome variable is students’ age of graduation and the key explanatory variable is the population size of each student’s municipality of origin (measured in 1860, which approximates the typical student’s year of birth). The model also includes a quadratic term in population size to allow for a diminishing marginal effect. In the preferred specification, further controls are included: the distance between a student’s municipality of origin and their high school, a dummy for whether their municipality of origin differs from the municipality of their high school, and a count of how many other high schools (‘intermediate schools’) the student attended before graduating. Finally, the model includes location-based fixed effects.
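For readers who prefer notation, the preferred specification can be written roughly as follows. The symbols are shorthand chosen for this post and are not necessarily those used in the paper: P is the 1860 population of student i's municipality of origin m(i), D is distance to the high school, R is the relocation dummy, S is the number of intermediate schools (entered as separate dummies for one to four schools, as in the results table), and μ is the location fixed effect.

```latex
\text{Age}_i = \beta_0
  + \beta_1 P_{m(i)} + \beta_2 P_{m(i)}^{2}
  + \gamma_1 D_i + \gamma_2 R_i
  + \sum_{k=1}^{4} \delta_k \,\mathbf{1}[S_i = k]
  + \mu_{\ell(i)} + \varepsilon_i
```

A negative β₁ paired with a positive β₂ is the pattern one would expect if larger towns lower graduation age at a diminishing rate.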

The baseline results confirm a correlation between population size in students’ municipalities of origin and graduation age: the larger the town, the lower the average graduation age. But OLS estimates on their own tell an incomplete story. Insa-Sánchez thus reports results across different ranges of the distribution of graduation ages: a quantile regression. The table below provides a comparison of the OLS and the quantile-regression estimates.

OLS and quantile regression results, distribution of graduation ages

| Outcome variable: graduation age | OLS | 10th %ile | 25th %ile | 50th %ile | 75th %ile | 90th %ile |
|---|---|---|---|---|---|---|
| Population 1860 | -2.088*** (0.514) | -0.967* (0.495) | -1.354*** (0.328) | -1.457*** (0.475) | -2.259** (0.910) | -4.773** (2.034) |
| (Population 1860)² | 0.091*** (0.027) | 0.044 (0.027) | 0.061*** (0.017) | 0.062** (0.025) | 0.098** (0.047) | 0.229** (0.105) |
| Distance to school | 0.001 (0.001) | -0.000 (0.001) | -0.000 (0.000) | -0.001 (0.001) | 0.001 (0.002) | 0.004 (0.008) |
| Relocated for school? (dummy) | 0.226 (0.242) | -0.412** (0.181) | -0.150 (0.153) | -0.098 (0.159) | 0.439 (0.441) | 1.677 (1.043) |
| 1 intermediate school | 0.622*** (0.189) | 0.454*** (0.162) | 0.194* (0.106) | 0.248** (0.124) | 0.858*** (0.316) | 1.056 (0.987) |
| 2 intermediate schools | 0.703* (0.382) | 0.171 (0.234) | 0.319 (0.383) | 0.621* (0.358) | 0.812 (0.609) | -0.258 (2.361) |
| 3 intermediate schools | 2.774* (1.471) | 0.154 (1.023) | 1.022 (1.981) | 1.621** (0.700) | 3.334*** (0.846) | 10.276*** (2.294) |
| 4 intermediate schools | 1.123 (2.731) | 2.089*** (0.590) | 0.720 (0.628) | 6.077*** (1.025) | 2.670** (1.234) | -0.468 (1.801) |
| Constant | 26.794*** (2.655) | 18.607*** (2.481) | 22.777*** (1.558) | 24.445*** (2.393) | 26.391*** (4.704) | 40.665*** (9.284) |
| Location fixed effects | Y | Y | Y | Y | Y | Y |
| Adjusted R² | 0.080 | | | | | |
| Pseudo R² | | 0.046 | 0.038 | 0.060 | 0.079 | 0.140 |
Standard errors (in parentheses) are robust and clustered at the municipality level. * p < 0.10, ** p < 0.05, *** p < 0.01.
Quantile regression percentiles (denoted in columns by “%ile”) by graduation age: youngest to oldest. Results for intermediate schools are reported relative to zero intermediate schools.
Source: Table 5.2, Insa-Sánchez (2021)

Whereas OLS results estimate the average (mean) effect of explanatory variables on the outcome variable, quantile regression allows for effects to be tested across different points along the outcome distribution. Evaluating at the 50th percentile thus translates to capturing the median effect. The difference between mean (column: “OLS”) and median (“50th %ile”) in this case stems from the long tail of late graduates — the distribution of ages is skewed.
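The gap between mean and median in a skewed sample can be seen in a toy example. The sketch below uses made-up graduation ages with a long right tail (they are not the paper's data) and no regressors at all. It relies on a standard result: the constant that minimises squared error is the mean, while the constant that minimises the "check" (pinball) loss at level tau is the tau-th quantile, which is the loss that quantile regression minimises. The function names are illustrative only.

```python
import statistics


def pinball_loss(values, guess, tau):
    """Average check (pinball) loss of a constant `guess` at quantile level `tau`."""
    total = 0.0
    for v in values:
        err = v - guess
        # Under-predictions are weighted by tau, over-predictions by (1 - tau).
        total += tau * err if err >= 0 else (tau - 1) * err
    return total / len(values)


def best_constant(values, tau):
    """Sample value that minimises the pinball loss at level `tau`."""
    return min(sorted(set(values)), key=lambda g: pinball_loss(values, g, tau))


# Hypothetical ages with a long tail of late graduates (NOT from the paper).
ages = [14, 15, 15, 16, 16, 17, 17, 17, 18, 19, 21, 24, 31]

mean_age = statistics.mean(ages)      # pulled upwards by the late graduates
median_age = statistics.median(ages)  # unaffected by how late the latest are
q50 = best_constant(ages, 0.5)        # pinball minimiser at tau = 0.5
q90 = best_constant(ages, 0.9)        # pinball minimiser at tau = 0.9

print(f"mean = {mean_age:.2f}, median = {median_age}, q50 = {q50}, q90 = {q90}")
```

In this invented sample the mean sits above the median of 17 because of the few late graduates, and the tau = 0.5 minimiser coincides with the median. Quantile regression applies the same idea with explanatory variables in place of a single constant.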

While the overall trend from the OLS results passes through to the quantile-regression results, it is notable how much the magnitude of effects increases with graduation age (shown in the higher percentiles). In particular, the effect of population size in the municipality of origin increases markedly with age. It is these late graduates that, in effect, inflate the magnitudes reported in the OLS estimates.
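To make the comparison across columns concrete, the short sketch below takes the population coefficients from the table above and computes two things: how large each linear term is relative to the 10th-percentile one, and the value of the population variable at which the implied marginal effect, b1 + 2·b2·x, reaches zero. This post does not state the units or scaling of the population variable, so x is left in the regression's own units; the helper names are illustrative and this is a back-of-the-envelope reading, not a calculation from the paper.

```python
# Coefficients copied from the table above (Table 5.2, Insa-Sánchez 2021).
# Each entry: (coefficient on population, coefficient on population squared).
COEFFS = {
    "OLS": (-2.088, 0.091),
    "10th": (-0.967, 0.044),
    "25th": (-1.354, 0.061),
    "50th": (-1.457, 0.062),
    "75th": (-2.259, 0.098),
    "90th": (-4.773, 0.229),
}


def marginal_effect(linear, quadratic, x):
    """Derivative of linear*x + quadratic*x**2 with respect to x."""
    return linear + 2 * quadratic * x


def turning_point(linear, quadratic):
    """Value of x at which the marginal effect equals zero."""
    return -linear / (2 * quadratic)


baseline = COEFFS["10th"][0]
# Size of each linear term relative to the 10th-percentile estimate.
ratios = {col: b1 / baseline for col, (b1, b2) in COEFFS.items()}
# Point (in the regressor's unstated units) where the effect flattens out.
turning = {col: turning_point(b1, b2) for col, (b1, b2) in COEFFS.items()}

for col in COEFFS:
    print(f"{col:>4}: linear term {ratios[col]:.1f}x the 10th-percentile one; "
          f"marginal effect reaches zero at x = {turning[col]:.2f}")
```

On these figures the linear term at the 90th percentile is close to five times the one at the 10th percentile, which is the widening the text describes. The standard errors at the upper percentiles are also much larger, so the point estimates there should be read with corresponding caution.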

The interpretation offered by Insa-Sánchez is that students who faced greater barriers to attending high school — where those barriers are proxied by the age of graduates — faced a greater ‘penalty’ from coming from a small municipality. Students from smaller municipalities faced above-average costs from pursuing education; students from large municipalities experienced substantially fewer difficulties in getting to high school.

While the story of higher costs for those from small towns is intuitively reasonable, less clear is what drives those costs. Is it, for example, a social status story? That is, were small rural areas poorer than large urban centres — and thus families in those rural areas were ill-resourced to send their children to high school? Might it have been some form of social discrimination against those from rural areas? Or is it a more anodyne explanation: for example, that transport options for those in rural areas were more limited, and thus more costly, than in larger municipalities?

Alas, the graduation data available do not readily allow for an examination of the underlying economic and social factors. But further exploration of these drivers would be welcome.

You’ve come a long way

One result left largely undiscussed in the paper is the insignificance and low magnitude of the coefficient estimates on the distance variable. What apparently matters for graduation age is not the distance between one’s home town and one’s high school, but the size of that home town. But the data include only those who graduated from high school. It is plausible that distance matters for determining who goes to high school and who does not, whereas among those who do go, it is the size of one’s home town (as a proxy for economic opportunity) that influences how long it takes to graduate.

Where’s the nearest high school? Source: Simo Räsänen / Wikimedia Commons.

The central conclusion of Insa-Sánchez’s work is that location matters for educational attainment. His results establish geographic inequality with respect to high school access across nineteenth-century Spain. The costs of that inequality are strongly evident in smaller municipalities, suggesting significant location-based barriers to accessing secondary education.

The origins of that geographic inequality are not fully defined, but a plausible channel is differences in economic opportunities. And to the extent that small rural areas are hampered by a relative lack of economic opportunity, poor access to education only reinforces such disadvantage. Breaking that vicious cycle is not simply a task consigned to history, but something policymakers around the world still grapple with today. In that context, examining the causes and consequences of geographic inequality is a valuable field for research.
