How old would you be if you didn’t know how old you were?

It is not uncommon for people to err when recalling their age. (I have even been guilty of it myself!) The problem is more acute in historical settings, and even in developing countries today: individuals with limited access to education are less able to calculate their age. Innumerate people tend to round their age to the nearest five or ten. This gives rise to a phenomenon called age heaping, whereby data reveal sharp peaks of individuals aged (for example) 30, relative to either 29 or 31.

Age heaping is a type of measurement error, for which different adjustment techniques can be employed. But age heaping has also come to be used as evidence: in the absence of data on education levels, changes in the incidence of age heaping over time can be used to infer changes in education. The logic of this is attractive: all else being equal, a decline in age heaping suggests a rise in numeracy. But all else is not always equal.

In a new working paper, McLaughlin, Colvin and Henderson offer a warning on the perils of overinterpreting the significance of age heaping. Using census data from nineteenth-century Ireland, they show how age heaping can give an inaccurate picture of educational attainment.

Fives and tens

Ireland provides a strong example of age heaping. Based on census data from 1841 and 1871, the figure below illustrates a clear tendency among respondents to report ages in multiples of ten and — to a lesser extent — five. For both men and women, there are clear peaks at (especially) ages 30, 40, 50 and 60. This is not due to sudden baby booms every ten years.

Population distribution by age, Ireland (1841 and 1871)
Source: Figure 1, McLaughlin, Colvin and Henderson (2022).

Though it is not immediately obvious from visual inspection, age heaping becomes worse between 1841 and 1871. Age heaping is measured with respect to the overall population size: for a given cohort, what percentage of reported ages are concentrated in ‘heaps’? Note that the number of young people, especially those under 30, is considerably lower in 1871 than in 1841.

The ‘puzzle’, as McLaughlin, Colvin and Henderson describe it, is that literacy levels in the Irish population were rising over the same period. This followed the introduction of state-funded public education from 1831, which focused on the core skills of reading, writing and arithmetic. If literacy rose due to increased access to schooling, then it is reasonable to suppose that (unmeasured) arithmetic skills did as well.

There is prima facie evidence of a mismatch between the age heaping hypothesis (increased age heaping implies decreased educational attainment) and the direct measurement of literacy levels. But why?

Potato, potato

In the 1840s, Ireland was struck by famine. Potato crops were blighted by disease; production of a staple in the Irish diet was crippled. The consequences for the Irish population were profound: widespread hunger resulted in deaths, particularly among children and the elderly. A large share of young people (of working age, but typically single) left the country to pursue a better life, principally in Britain or the United States.

To understand the size of the effect, consider again the figure above. As noted, the number of young people declines between 1841 and 1871. But note also that the high number of young people in 1841 does not translate to a significant rise in the older population 30 years later. That is, many of the young people who were counted in the 1841 census were no longer in Ireland by the time of the 1871 census.

Hard to swallow. Source: US Agricultural Research Service / Wikimedia Commons.

McLaughlin, Colvin and Henderson posit that the demographic effects of the Irish famine skew the degree of age heaping observed in census data. The mass emigration of young people results in a remaining resident population that is older than it otherwise would have been. Moreover, because public schooling was not introduced until 1831, the elderly cohort will on average be less educated than younger groups.
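
The composition effect described here can be illustrated with a small worked sketch. All of the numbers are hypothetical, chosen only to show the mechanism; none are taken from the paper or the census. The share of each group reporting a heaped age is held fixed, and only the population mix changes.

```python
# Hypothetical illustration of a pure composition effect: per-group heaping
# is constant, yet aggregate heaping rises when part of the young group leaves.

def aggregate_heaping(population, heaped_share):
    """Population-weighted share of people reporting a heaped age."""
    total = sum(population.values())
    heaped = sum(population[group] * heaped_share[group] for group in population)
    return heaped / total

# Assumed share of each group reporting an age ending in 0 or 5 (made up).
heaped_share = {"young": 0.25, "old": 0.40}

before = {"young": 600, "old": 400}  # hypothetical pre-famine mix
after = {"young": 300, "old": 400}   # half of the young group emigrates

rate_before = aggregate_heaping(before, heaped_share)
rate_after = aggregate_heaping(after, heaped_share)

print(f"before: {rate_before:.3f}, after: {rate_after:.3f}")
# prints "before: 0.310, after: 0.336"
```

Nobody in this sketch became less numerate, but the aggregate measure still worsens because the less-schooled group now makes up a larger share of those counted.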

Plainly, the elderly were also hard hit by the famine — many died. Their increased mortality may thus partially offset the effect of youth emigration on the overall population profile. (In the extreme, had more elderly people died than the number of young people who migrated, then the average population age would be pushed down rather than up. But this was not the case for Ireland.) McLaughlin, Colvin and Henderson test the effects of both migration and mortality on a leading index of age heaping, where a lower index score indicates greater heaping.
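
For readers unfamiliar with how such an index is built, the following is a minimal sketch of the Whipple index and its ABCC rescaling, as they are commonly defined in the age heaping literature. The function names are illustrative, and the 23 to 62 age window is the conventional choice, assumed here rather than confirmed as the paper's exact implementation.

```python
# Sketch of the Whipple and ABCC indices under their commonly stated definitions.
# Function names and the default age window are assumptions for illustration.

def whipple_index(counts, lo=23, hi=62):
    """Whipple index over ages lo..hi.

    `counts` maps a reported age to the number of people reporting it.
    100 means no heaping; 500 means every reported age ends in 0 or 5.
    """
    ages = range(lo, hi + 1)
    total = sum(counts.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no observations in the age range")
    heaped = sum(counts.get(age, 0) for age in ages if age % 5 == 0)
    # With no heaping, one fifth of reported ages should end in 0 or 5.
    return 100.0 * heaped / (total / 5.0)

def abcc_index(counts, lo=23, hi=62):
    """ABCC index: 100 means no heaping, lower values mean more heaping."""
    w = whipple_index(counts, lo, hi)
    if w < 100.0:
        return 100.0
    return (1.0 - (w - 100.0) / 400.0) * 100.0

# Hypothetical populations: one with no heaping, one with extra mass on 0s and 5s.
uniform = {age: 10 for age in range(23, 63)}
heaped = {age: (20 if age % 5 == 0 else 10) for age in range(23, 63)}

print(abcc_index(uniform))           # 100.0
print(round(abcc_index(heaped), 2))  # 83.33
```

The rescaling simply flips the Whipple index so that the score can be read as an approximate share of people reporting their age without rounding.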

As the results below show, an increase in the population share of the age groups considered (up to 52 years old) between 1841 and 1871 is associated with less age heaping. Viewed the other way, when the share of younger people falls, age heaping increases. This holds across different specifications of the model, though the magnitude of the effect is greatly reduced once migration patterns are specifically controlled for (column 3). The effect of migration is statistically significant, while the effect of mortality is not. (Separately, the authors estimate the effect of the same variables on levels of illiteracy. In that case, it is mortality rather than migration which is the significant effect, the elderly being less literate and more likely to die during the famine.)

OLS results: exploring changes in age heaping across Ireland’s counties

| Outcome: index of age heaping | Mean value | Std. dev. | (1) | (2) | (3) |
| --- | --- | --- | --- | --- | --- |
| Δ 23–32 age group share: 1841–71 | –4.36 | 2.64 | 1.390*** (0.479) | 1.431*** (0.484) | 0.597** (0.253) |
| Δ 33–42 age group share: 1841–71 | –1.91 | 1.35 | 1.040* (0.521) | 1.144** (0.534) | 0.644** (0.287) |
| Δ 43–52 age group share: 1841–71 | 0.88 | 1.39 | 2.527** (1.029) | 2.815** (1.092) | 1.252** (0.477) |
| Δ Female:male ratio (ages 23–62): 1841–71 | 0.94 | 4.99 | | –0.101 (0.093) | –0.138* (0.072) |
| Famine-era excess mortality | 21.73 | 15.74 | | | –0.045 (0.034) |
| Famine-era migration | 144.09 | 76.90 | | | –0.021*** (0.004) |
| Age heaping: 1841 | 73.42 | 4.02 | 0.023 (0.134) | 0.034 (0.138) | –0.100 (0.136) |
| Observations (counties) | 32 | | 32 | 32 | 32 |
| R² | | | 0.338 | 0.362 | 0.715 |
Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. Δ denotes change over time. Age heaping is measured using the ABCC index (A’Hearn, Baten and Crayen 2009), which runs from 0 to 100, where 100 indicates no age heaping.
Source: Tables 2 and 3, McLaughlin, Colvin and Henderson (2022).

One possible wrinkle in the argument here is that the data on famine-era migration are estimates. These estimates are derived from Ó Gráda and O’Rourke (1997), who use pre-famine population growth rates (capped at 0.5 per cent per annum) to predict what the population of Ireland’s counties would have been in 1851 in the absence of the famine. From this predicted population, they subtract the actual (lower) population in 1851, and adjust for deaths.

In short, rather than direct evidence, migration is calculated as a residual from a counterfactual. If the counterfactual is wrong (overstating what any county’s population would have been in the absence of the famine), or if the data on mortality are faulty, then the estimates of migration will be wide of the mark.
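
The residual logic can be sketched as follows. This is an illustration under stated assumptions, not Ó Gráda and O’Rourke’s actual procedure: the growth rate is compounded over ten years from an 1841 base, the adjustment for deaths is treated as a simple subtraction of excess deaths, and the function names and county figures are hypothetical.

```python
# Sketch of migration estimated as a residual from a counterfactual population.
# Assumptions: ten years of compound growth from 1841, and "adjusting for
# deaths" modelled as subtracting excess deaths. All inputs are hypothetical.

def counterfactual_population(pop_1841, annual_growth, cap=0.005, years=10):
    """Predicted 1851 population without the famine, with growth capped."""
    growth = min(annual_growth, cap)
    return pop_1841 * (1.0 + growth) ** years

def residual_migration(pop_1841, pop_1851, annual_growth, excess_deaths, cap=0.005):
    """Migration = predicted population - actual population - excess deaths."""
    predicted = counterfactual_population(pop_1841, annual_growth, cap)
    return predicted - pop_1851 - excess_deaths

# Hypothetical county: pre-famine growth of 0.8% a year is capped at 0.5%.
migration = residual_migration(
    pop_1841=100_000, pop_1851=80_000, annual_growth=0.008, excess_deaths=10_000
)

# A lower counterfactual (here, zero growth) shrinks the estimate one for one.
migration_no_growth = residual_migration(
    pop_1841=100_000, pop_1851=80_000, annual_growth=0.008,
    excess_deaths=10_000, cap=0.0
)

print(round(migration), round(migration_no_growth))
```

Because migration is whatever is left over, any error in the predicted population or in the death count passes straight through to the estimate.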

Nevertheless, even if the estimates of migration are off, this doesn’t change the key observation with respect to the overall change in Ireland’s demographic profile after the famine. It merely raises a question as to the magnitude of migration’s contribution to that change.

Rounding error? Source: normanack / Wikimedia Commons.

Know your population

The bottom line of McLaughlin, Colvin and Henderson’s paper is that any analysis based on age heaping needs to take account of underlying demographic trends. To this end, the authors propose a relatively simple adjustment to measures of age heaping.

Rather than comparing equivalent age groups at different points in time (for example, those born in one decade with those born in a different decade), they suggest considering the same age cohort at different points in time. That is, they compare those who are in their twenties in 1841 with those in their fifties in 1871. The effect of this adjustment is to reduce the degree of age heaping in 1871 relative to 1841.
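
The bookkeeping behind a same-cohort comparison can be sketched in a few lines. The helper names and the ten-year bands are hypothetical, chosen to match the example of twenties and fifties; the paper's exact banding may differ.

```python
# Sketch of tracking one birth cohort across two censuses.
# Helper names and age bands are illustrative assumptions.

def birth_years(age_band, census_year):
    """Range of birth years implied by an age band observed in a census year."""
    youngest, oldest = age_band
    return (census_year - oldest, census_year - youngest)

def cohort_age_band(age_band, base_year=1841, target_year=1871):
    """Age band the same birth cohort occupies in a later census."""
    youngest, oldest = age_band
    shift = target_year - base_year
    return (youngest + shift, oldest + shift)

twenties_1841 = (20, 29)
same_cohort_1871 = cohort_age_band(twenties_1841)

print(same_cohort_1871)                     # (50, 59)
print(birth_years(twenties_1841, 1841))     # (1812, 1821)
print(birth_years(same_cohort_1871, 1871))  # (1812, 1821)
```

An unadjusted comparison would instead set the (20, 29) band in 1841 against the (20, 29) band in 1871, that is, people born three decades apart.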

The authors liken this approach to considering differences in mortality rates between US states: many retirees move to Florida, for example, where they subsequently die. Unadjusted, this would point to higher mortality in Florida — ignoring the real story of changes in the demographic profile.

Notwithstanding their adjusted measure, McLaughlin, Colvin and Henderson conclude by arguing against the use of age heaping to infer changes in numeracy skills. Age heaping is first and foremost a type of measurement error. Understanding the possible reasons for that measurement error — and how it might change over time — provides an interesting enough basis for research without overinterpreting the effects to suggest something else.
