How do you measure uncertainty when there is only one treated unit?
West Germany was reunified once, in 1990. There is no untreated twin to compare it against. The synthetic control method builds an artificial twin from a weighted blend of donor countries, then reads off the post-1990 gap. The SCPI framework of Cattaneo, Feng & Titiunik (2021) extends this with prediction intervals: a formal way to ask whether that gap is real or just noise.
This app lets you turn the dials. In four tabs you will: watch a synthetic West Germany self-assemble from 6 donor countries; widen or shrink the prediction-interval band and see how the “outside the band” years change; explore a simulated gap with an adjustable PI; and finally compare four weight-constraint methods on the actual reunification data.
The donor blend — animated
The simplex synthetic West Germany is a weighted average of 6 of the 16 donor countries. The bars below animate the weights filling in (highlighted bar = Austria, the largest contributor). Notice how concentrated the blend is — just two donors (Austria and the USA) account for 56% of the weight.
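For readers curious how weights like these can arise, here is a minimal sketch of simplex-constrained least squares. It uses projected gradient descent on simulated data, a stand-in technique chosen for brevity and not necessarily the solver the SCPI software uses. The function names `project_to_simplex` and `fit_simplex_weights`, and the simulated donor matrix, are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept in the support
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_simplex_weights(X, y, n_iter=2000):
    """Minimise ||y - X w||^2 over the simplex by projected gradient descent."""
    n_donors = X.shape[1]
    w = np.full(n_donors, 1.0 / n_donors)     # start from equal weights
    step = 1.0 / np.linalg.norm(X, 2) ** 2    # 1 / (largest singular value)^2
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)              # gradient of 0.5 * ||X w - y||^2
        w = project_to_simplex(w - step * grad)
    return w

# SIMULATED pre-treatment data: 30 periods, 4 hypothetical donors.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
true_w = np.array([0.6, 0.4, 0.0, 0.0])       # sparse blend, like the app's
y = X @ true_w

w_hat = fit_simplex_weights(X, y)
pre_rmse = np.sqrt(np.mean((y - X @ w_hat) ** 2))
print(np.round(w_hat, 3), round(float(pre_rmse), 4))
```

Because the constraint forces weights to be non-negative and sum to one, donors that do not help the fit are pushed to exactly zero, which is why the blend ends up sparse.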
Donor Pool
Plot the actual West Germany alongside its synthetic counterfactual and a 95% prediction-interval band. Drag the slider to widen the band and watch how many years remain “outside”.
PI Simulator
The treatment gap on its own, with a symmetric PI band around zero. Adjust the “PI width multiplier” to see why even 99% intervals fail to cover the late-1990s gap.
Method Forest
The four weighting methods from §9 (Simplex, Lasso, Ridge, OLS) as a forest plot. Hover any point for the SE and CI; compare pre-RMSE versus the gap each method estimates.
Three takeaways the app is built around
- The gap is large and growing. By 2003 West Germany's GDP per capita was about $3,465 below the synthetic counterfactual, roughly 11% lower than what the synthetic control predicts in the absence of reunification.
- The gap is statistically significant. From 1997 onward, actual GDP falls below the lower bound of the 95% prediction interval. Even at the 99% confidence level, the actual GDP falls outside the band for 7 of 13 post-treatment years.
- The synthetic is sparse and interpretable. Only 6 of 16 donors get non-zero weight: Austria 0.291, USA 0.273, Italy 0.191, Netherlands 0.133, Switzerland 0.081, France 0.030.
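The reported weights can be checked with a few lines of arithmetic. The sketch below uses the six weights exactly as quoted above; the donor GDP values in `example_year` are hypothetical numbers added only to show how a weighted average is formed.

```python
# Simplex weights as reported in the text (rounded to three decimals).
weights = {
    "Austria": 0.291,
    "USA": 0.273,
    "Italy": 0.191,
    "Netherlands": 0.133,
    "Switzerland": 0.081,
    "France": 0.030,
}

total = sum(weights.values())
top_two_share = sum(sorted(weights.values(), reverse=True)[:2])

def synthetic_value(donor_values, weights):
    """Weighted average of donor outcomes for a single year."""
    return sum(weights[name] * donor_values[name] for name in weights)

# HYPOTHETICAL donor GDP per capita (thousand USD) for one year.
example_year = {
    "Austria": 20.0,
    "USA": 24.0,
    "Italy": 18.0,
    "Netherlands": 19.0,
    "Switzerland": 25.0,
    "France": 19.5,
}
example_synthetic = synthetic_value(example_year, weights)

print(f"sum of weights: {total:.3f}")
print(f"top-two share:  {top_two_share:.3f}")
print(f"synthetic value for the hypothetical year: {example_synthetic:.3f}")
```

The weights sum to 0.999 rather than exactly 1 because of rounding in the published figures, and Austria plus the USA account for 0.564 of the total, the 56% quoted earlier.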
Glossary (open a card if a term is unfamiliar)
Synthetic control
Donor pool
Simplex constraint
Pre-treatment RMSE
Treatment gap
Prediction interval (PI)
In-sample uncertainty
Out-of-sample uncertainty
Actual vs Synthetic West Germany — with a prediction interval band
The orange line is West Germany's actual GDP per capita; the dashed steel line is the synthetic counterfactual built from the 6 donor weights. The shaded band is the prediction interval. Drag the slider to widen or shrink the band — orange-highlighted dots mark years where the actual line falls outside the band (i.e., a statistically significant gap).
scpi() (simplex, HC1, gaussian). Drag right to mimic a higher confidence level; drag left to see when years start falling outside the band.
What to look for
- The two lines coincide before 1991. That is not a coincidence — the simplex weights were chosen to minimise pre-treatment RMSE (= 0.072 thousand USD, about 0.3% of West Germany's pre-1990 GDP).
- At 1.0 the actual line falls outside the band from 1997 onward. Earlier years sit inside the PI: the effect is emerging, but the in-sample uncertainty is large enough to absorb it. Later years sit clearly outside.
- Shrink the band to 0.5. Almost every post-1992 year becomes “significant”. Stretch to 2.0 and only the deepest 1999–2003 years stay outside.
- The band widens over time even at 1.0. That is the out-of-sample component: forecasting noise compounds with the distance from the pre-treatment window.
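The slider behaviour described in these bullets can be sketched as follows. This assumes the multiplier simply rescales the half-width of the band around the synthetic line; the helper `years_outside` and all the numbers are hypothetical illustration data, not the reunification series.

```python
import numpy as np

def years_outside(years, actual, synthetic, half_width, multiplier=1.0):
    """Return the years in which `actual` lies outside the scaled band.

    The band is synthetic +/- multiplier * half_width (an assumption about
    how the slider rescales the interval).
    """
    years = np.asarray(years)
    actual = np.asarray(actual, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    half = multiplier * np.asarray(half_width, dtype=float)
    outside = (actual < synthetic - half) | (actual > synthetic + half)
    return years[outside].tolist()

# HYPOTHETICAL data (thousand USD).
years = [1991, 1992, 1993, 1994, 1995]
synthetic = [20.0, 21.0, 22.0, 23.0, 24.0]
actual = [20.1, 20.8, 21.2, 21.6, 21.9]
half_width = [0.8, 0.9, 1.0, 1.1, 1.2]  # widens with the forecast horizon

for m in (0.5, 1.0, 2.0):
    print(m, years_outside(years, actual, synthetic, half_width, m))
```

In this toy series a narrow band (0.5) flags three years, the baseline band (1.0) flags two, and a wide band (2.0) flags none, which mirrors the pattern the slider shows on the real data.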
The treatment gap, isolated
Subtracting the synthetic line from the actual line gives the year-by-year gap — the SCPI point estimate of the causal effect. The band shows the prediction interval translated to the gap scale: any year where the gap crosses outside the shaded region is statistically distinguishable from zero. Move the multiplier to mimic different confidence levels.
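A minimal sketch of that translation, under the assumption that the band is recentred by subtracting the synthetic line from both PI bounds. The function `gap_and_band` and the three-year series are hypothetical.

```python
import numpy as np

def gap_and_band(actual, synthetic, lower, upper):
    """Move the series and its PI from the level scale to the gap scale."""
    actual, synthetic, lower, upper = (
        np.asarray(x, dtype=float) for x in (actual, synthetic, lower, upper)
    )
    gap = actual - synthetic        # year-by-year point estimate
    band_lo = lower - synthetic     # band recentred around zero
    band_hi = upper - synthetic
    outside = (gap < band_lo) | (gap > band_hi)
    return gap, band_lo, band_hi, outside

# HYPOTHETICAL three-year series (thousand USD).
synthetic = [20.0, 21.0, 22.0]
actual = [19.9, 20.2, 20.0]
lower = [19.2, 20.0, 20.8]
upper = [20.8, 22.0, 23.2]

gap, band_lo, band_hi, outside = gap_and_band(actual, synthetic, lower, upper)
print(np.round(gap, 2), outside)
```

Subtracting the same synthetic value from the actual line and from both bounds does not change which years are flagged: a gap lies outside the recentred band exactly when the actual line lies outside the original band.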
Why the late years matter most
- The gap is monotone after 1992. A short-lived disturbance would generate a transient gap. A persistent gap with the same sign across thirteen years is the signature of a structural effect.
- The PI band is widest where the gap is largest. Out-of-sample uncertainty accumulates over time — yet even with that increasing width, the gap still falls outside at the end. That is the §10 sensitivity result made visible.
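- Try multiplier = 1.30. That is close to the ratio of Gaussian critical values for 99% versus 95% (2.576 / 1.960 ≈ 1.31), so it roughly mimics the 99% PI in the post; note that the reported average widths (3.30 vs 2.84) imply a smaller ratio of about 1.16. You should still see 7 of 13 years outside the band.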
Sensitivity table (from §10 of the post)
| Confidence | α (per side) | Avg PI width | Years outside |
|---|---|---|---|
| 99% | 0.01 | 3.298 | 7 / 13 |
| 95% | 0.05 | 2.842 | 7 / 13 |
| 90% | 0.10 | 2.583 | 9 / 13 |
| 80% | 0.20 | 2.304 | 9 / 13 |
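To relate the table to the multiplier slider, the sketch below compares each row's average width with the 95% row, and sets that against the ratio a pure Gaussian rescaling would give. The Gaussian comparison is an assumption made for illustration; the table's widths come from the post, not from this calculation.

```python
from statistics import NormalDist

# Average PI widths copied from the sensitivity table.
avg_width = {0.99: 3.298, 0.95: 2.842, 0.90: 2.583, 0.80: 2.304}

def z_two_sided(level):
    """Two-sided standard normal critical value for a confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)

ratios = {}
for level, width in avg_width.items():
    width_ratio = width / avg_width[0.95]             # from the table
    z_ratio = z_two_sided(level) / z_two_sided(0.95)  # ASSUMPTION: Gaussian rescaling
    ratios[level] = (width_ratio, z_ratio)
    print(f"{level:.0%}: width ratio {width_ratio:.2f}, Gaussian ratio {z_ratio:.2f}")
```

On these numbers the 99% band is about 1.16 times as wide as the 95% band, while a pure Gaussian rescaling would give about 1.31, so a single multiplier is only a rough stand-in for changing the confidence level.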
Robustness across weight constraints
Section 9 of the post compares four weighting methods — Simplex, Lasso, Ridge, and OLS. The forest plot below shows the estimated gap in 2003 and the average post-treatment gap under each method, with approximate 95% confidence intervals. All four methods agree that the gap is negative; they differ only in magnitude.
What to look for
- Simplex and Lasso are nearly identical (gap 2003: −3.465 vs −3.426). Lasso is a relaxation of Simplex; on this data the relaxation barely kicks in.
- Ridge and OLS estimate a smaller gap (−2.72 and −2.38). Their pre-RMSE is lower (0.04 vs 0.07) — they fit the pre-treatment period better, but at the cost of using all 16 donors with possibly negative weights. Over-fitting the pre-period compresses the post-treatment divergence.
- None of the 95% CIs crosses zero, on either outcome. The qualitative conclusion (reunification reduced West German GDP per capita) is robust to the choice of constraint.
Connecting back to Tab 2
The PI band you played with on Tab 2 used the simplex constraint (the leftmost entry in the forest plot above). Ridge and OLS would shrink the gap by about 20–30%, but their pre-fit is also tighter, so their PI bands would be narrower too. Whether you read the gap as “about −$3,500” (Simplex/Lasso) or “about −$2,500” (Ridge/OLS), it is of the same order of magnitude and survives uncertainty quantification.
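The "20–30%" figure follows directly from the 2003 gaps quoted for each method. A short check, using only those four numbers (in thousand USD) and taking Simplex as the baseline:

```python
# 2003 gap estimates quoted in the text (thousand USD).
gap_2003 = {"Simplex": -3.465, "Lasso": -3.426, "Ridge": -2.72, "OLS": -2.38}

baseline = gap_2003["Simplex"]

# Fraction by which each method's gap is smaller than the Simplex gap.
shrinkage = {method: 1 - gap / baseline for method, gap in gap_2003.items()}

for method, share in shrinkage.items():
    print(f"{method:8s} gap {gap_2003[method]:7.3f}  shrinkage {share:6.1%}")
```

Ridge comes out roughly 21% smaller than Simplex and OLS roughly 31% smaller, while Lasso differs by about 1%.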