A useful starting point for assessing the effect of a shock or intervention is a basis of comparison: those affected by a change versus those not affected. But what if everyone is potentially affected in some way? What if there is no obvious benchmark against which differences can be evaluated?
One answer is simply to create your own comparator, stitching together a credible counterfactual using attributes from real-world observations. This synthetic control method is a relatively recent innovation in empirical analysis. It offers a smart tool for identifying causal effects, under the right conditions.
In a new paper, Gilchrist et al. (forthcoming) discuss synthetic controls and how they can be applied in studies of economic history. They also offer a practical demonstration by examining how the discovery of oil in Venezuela a century ago influenced the country’s long-term development.
Constructing a counterfactual
To understand how the synthetic control method works, it helps to consider standard methods of comparative analysis. For example, the basic idea of difference-in-difference analysis is to compare two groups: one affected or ‘treated’ by a shock, the other not. Prior to the shock or intervention, the two groups follow parallel trends in outcomes (their levels need not be identical). Afterwards, outcomes in the ‘treatment group’ begin to diverge from the unaffected ‘control group’. The difference between the two groups’ changes is taken as the causal effect of the shock.
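As a minimal numerical sketch, the difference-in-difference logic reduces to subtracting the control group’s change from the treated group’s change. All figures below are hypothetical, chosen purely for exposition:

```python
# Minimal difference-in-difference sketch (all figures hypothetical).

# Average outcomes before and after the shock
treated_pre, treated_post = 100.0, 130.0   # treatment group
control_pre, control_post = 100.0, 110.0   # control group

# The control group's change proxies the counterfactual trend;
# subtracting it from the treated group's change isolates the effect.
did_estimate = (treated_post - treated_pre) - (control_post - control_pre)
print(did_estimate)  # 20.0
```

The treated group grew by 30 while the control group grew by 10, so the estimated effect of the shock is 20, not 30: the trend the treated group would have followed anyway is netted out.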
In broad terms, synthetic controls can replicate the effect of a conventional control group. The purpose is to establish a plausible baseline against which effects can be evaluated. The question is, how do you design an artificial baseline that is nevertheless realistic?
As Gilchrist et al. describe, construction of the counterfactual requires a sufficient range of possible control observations — such as individuals, businesses and countries, depending on what the treatment group is. These alternatives form the ‘donor pool’ from which the synthetic control can be compiled. Not all of these donors will necessarily enter the analysis. But the larger the donor pool, the greater the possibilities for finding a mix that matches the treatment group. This mix consists of different weights — percentage shares from 0 to 100 — being assigned to each of the donors. (If there is only one donor weighted as 100 per cent, then this is functionally the same as a difference-in-difference model.)
For the treatment group and donor pool alike, a sufficient range of pre-treatment observations is required. The goal in blending different donors is to find a composition that closely approximates the pre-treatment outcomes of the treatment group. That is, both treatment group and synthetic control group should follow the same path up to the shock or intervention.
There may also be underlying factors that relate to the observed outcomes, and how those outcomes might be affected by the shock or intervention. Thus, the overall profile of the synthetic control with respect to these underlying factors must also broadly reflect the profile of the treatment group. Put another way, the synthetic control group should be expected to respond in the same way as the treatment group when faced with the shock or intervention being examined.
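The weight-fitting step described above can be framed as a constrained least-squares problem: choose non-negative donor weights summing to one so that the weighted donor mix tracks the treated unit’s pre-treatment path. The snippet below is an illustration with made-up data, using scipy’s general-purpose minimiser; dedicated synthetic-control packages exist, and any given study’s implementation may differ:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical pre-treatment outcome series (e.g. a GDP per capita index).
# Rows: time periods; columns: donor units. All numbers are made up.
donors = np.array([
    [1.0, 2.0, 0.5],
    [1.2, 2.1, 0.7],
    [1.4, 2.3, 0.8],
    [1.5, 2.6, 1.0],
])
treated = np.array([1.3, 1.5, 1.7, 1.9])  # treated unit, same periods

def loss(w):
    # Squared distance between the treated path and the weighted donor mix
    return np.sum((treated - donors @ w) ** 2)

n = donors.shape[1]
result = minimize(
    loss,
    x0=np.full(n, 1.0 / n),  # start from equal weights
    bounds=[(0.0, 1.0)] * n,  # each weight between 0 and 1
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},  # sum to 1
)
weights = result.x
synthetic = donors @ weights  # the synthetic control's pre-treatment path
```

The fitted weights are the ‘percentage shares’ described above; once found, the same weighted mix of donors is extended past the treatment date to trace out the counterfactual path. A fuller implementation would also match on the underlying covariates, not just the outcome series.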
The synthetic control method, much like difference-in-difference analysis, requires that the shock or intervention being examined is independent of the pre-treatment outcome. That is, the shock or intervention cannot itself be a consequence of the outcomes being observed. Similarly, the shock or intervention cannot be influenced by any of the members of the donor pool. As an example, Gilchrist et al. discuss the case of a country adopting democracy. To the extent that other countries might play a role in the treatment country’s democratisation (such as through military intervention or support for dissidents), then those countries should be excluded from the donor pool.
Running oil through the system
Gilchrist et al. use Venezuela as a case study for synthetic controls. They examine how the discovery of oil affected the country’s GDP growth. The discovery of oil in the early twentieth century was not sudden or unexpected: as the authors describe, the presence of oil reserves in Venezuela was first identified centuries earlier. But it was not until 1908 that the first concessions for exploration and production were granted. Merely comparing pre- and post-discovery outcomes within Venezuela may be biased, as the decision to extract oil could itself have been influenced by changes in economic circumstances that would (even in the absence of oil) have manifested themselves in changes to GDP.
The task here is to compare the economic effects of the discovery of oil with the counterfactual — what would have happened if oil had not been discovered in Venezuela. As we do not observe a Venezuela without oil, the authors set out to design one based on the attributes of comparable countries.
Synthetic Venezuela is constructed from a weighted aggregate of several countries: Myanmar contributes over half, Mexico a quarter, while Brazil, Chile, Jordan, the Philippines and South Korea account for the remainder. This bundle of countries matches Venezuela’s growth trajectory prior to the discovery of oil, while also accounting for demographic, geographic and institutional attributes. (The authors also include a summary table of all the factors included in the model, providing a comparison between the real country and its artificial counterpart.)
The results of the analysis are depicted in the figure below. Until 1920, both real and synthetic Venezuela follow broadly the same path. There is a sharp break after 1920, where real Venezuela experiences a surge in growth, while synthetic Venezuela continues along a mostly unchanged path. This difference, according to the authors, is the effect of the discovery of oil.
The right-hand side of the figure provides another view of the story: how Venezuela’s GDP compares with that of 47 countries considered plausible candidates for the donor pool. While Venezuela starts in a relatively poor position, it rapidly rises towards the top of the group after the discovery of oil.
The results are robust to different specifications of the synthetic Venezuela: the results are not driven by the selected composition of the control group. Moreover, there is no evidence of a break in trend between the real and synthetic Venezuela at other times — the change occurs around 1920, with the discovery of oil offered as the leading explanation.
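The ‘no break at other times’ check is sometimes called an in-time placebo test: pretend the treatment happened at an earlier, fake date and verify that no divergence appears there. The sketch below illustrates the logic with fabricated series (a stylised trend plus a jump at 1920); it is a conceptual demonstration, not the authors’ procedure:

```python
import numpy as np

# In-time placebo logic with fabricated data (all series hypothetical).
years = np.arange(1900, 1941)
true_break = 1920

# Fake treated series: a steady trend, plus a jump at the true break year.
treated = 1.0 + 0.01 * (years - 1900) + 0.5 * (years >= true_break)

# Fake synthetic control tracking only the underlying trend.
synthetic = 1.0 + 0.01 * (years - 1900)

gap = treated - synthetic  # treated minus counterfactual, per year

def mean_abs_gap(start, end):
    # Average absolute gap over the half-open window [start, end)
    mask = (years >= start) & (years < end)
    return np.abs(gap[mask]).mean()

# A placebo break at 1910 should show no shift in the gap, whereas
# the true 1920 break should show a clear one.
placebo_shift = mean_abs_gap(1910, 1920) - mean_abs_gap(1900, 1910)
true_shift = mean_abs_gap(1920, 1941) - mean_abs_gap(1910, 1920)
```

If the gap widened at the placebo date too, that would suggest the apparent effect is an artifact of the fitting procedure rather than of the shock itself.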
Are the results believable? Certainly, the immediate effect of oil on Venezuela’s growth is hard to refute. But the longer-term consequences are less certain. The difficulty is knowing how comparable the real Venezuela is to its counterfactual over time. How would the synthetic Venezuela have performed if it had discovered oil?
It is notable that synthetic Venezuela catches up to real Venezuela in the period after 1990. Corruption and debt characterised the (real) Venezuelan economy during the 1990s; the socialist revolutionary Hugo Chavez subsequently rose to power and systematically dismantled the country’s institutions. A combination of ruinous domestic policy and international sanctions has weakened Venezuela over the past two decades. (Gilchrist et al. discuss other literature applying the synthetic control method to evaluate Chavez’s rule.) Against that backdrop, it is unsurprising that GDP per capita in the real Venezuela has flatlined.
One could argue that these events are attributable to the discovery of oil; in the absence of oil, Venezuela’s development path would have been different. That is certainly plausible. But another possibility is that Venezuela’s political tragedy would have played out in much the same way regardless. We can’t know either way.
What the eye doesn’t see
The authors conclude that synthetic controls offer considerable possibilities for historical analysis. Any shock that produces a deviation from the long-term trend, whether transitory or permanent, can in principle be studied with the method. The prerequisite is sufficient data, most obviously with respect to the ‘treated’ group, but also among the donor pool used to construct the counterfactual ‘control’ group.
There are plainly risks with synthetic control analysis. A poorly constructed counterfactual is almost guaranteed to lead to faulty conclusions about the effect being examined. Moreover, there is a narrative challenge for researchers employing such methods — the credibility of one’s results hinges on the validity of the synthetic control. Researchers must convincingly defend their constructed counterfactual.
Not that this is an unfamiliar concern in empirical analysis: any model is built on assumptions, which can (and should) be tested and queried. The difference here is that conventional techniques rely on data that record what has occurred. By contrast, the synthetic control method invites us to test outcomes against what might otherwise have occurred, derived from the weighted contributions of members of the donor pool.
This is not to suggest that the method is invalid or unreliable. Rather, it is a call for caution and scrutiny. Gilchrist et al. outline various options for testing the rigour of synthetic controls, which can do much to verify the merits of the methodology. But the essential nature of the counterfactual — the path not chosen — is that it is unobservable. The synthetic control method is not a window to an alternative universe; it can only provide an approximation of what might have been.