Does war really hurt GDP? And why use Arellano-Bond GMM?
The naive cross-country regression of GDP growth on war indicators often returns a small or insignificant coefficient. The case-study record paints a much darker picture. The mismatch is statistical, not substantive: countries that fight wars are not a random subsample of the world, and the dependent variable is highly persistent. Dynamic panel GMM fixes both problems by first-differencing (removing fixed effects) and using deeper lags of the level variable as internal instruments for the differenced lagged dependent variable.
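The two steps can be written out as a short derivation. This is the generic textbook form of difference GMM in the glossary's notation, offered as a sketch; the exact instrument set used in the post's models is not restated here, so the "lags 2 and deeper" condition is the standard assumption rather than a description of the production specification.

```latex
\begin{aligned}
y_{it} &= \rho\, y_{i,t-1} + \beta\, x_{it} + \alpha_i + \varepsilon_{it} \\
\Delta y_{it} &= \rho\, \Delta y_{i,t-1} + \beta\, \Delta x_{it} + \Delta\varepsilon_{it} \\
\mathbb{E}\!\left[\, y_{i,t-s}\, \Delta\varepsilon_{it} \,\right] &= 0 \qquad \text{for } s \ge 2
\end{aligned}
```

Differencing removes αᵢ, but Δy[i,t−1] still contains ε[i,t−1], which also sits inside Δε[i,t], so OLS on the differenced equation is inconsistent. Levels dated t−2 and earlier are uncorrelated with Δε[i,t] as long as ε is not serially correlated, which is what makes them usable as instruments.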
This app lets you turn the dials yourself. Across four tabs you will: visualise Nickell bias as the demeaning-induced correlation that shrinks with panel length T; simulate an AR(1) panel and compare static FE to Arellano-Bond GMM as T grows; explore the post's actual coefficient forest for the four nested models; and inspect AR(2) and Hansen J diagnostics alongside the institutional-mediation pattern.
The Nickell-bias intuition (animation)
Two estimators of the lagged-DV coefficient ρ are tracked as the panel length T grows. The orange curve is static fixed effects: it asymptotes to the true ρ from below, with bias of order −1/T. The teal curve is Arellano-Bond difference GMM: it hits the true ρ almost immediately and stays there. At short T, the gap between the two curves is the Nickell-bias penalty you pay for using `xtreg, fe` on a dynamic model.
Nickell Bias Simulator
Set T, ρ, and noise. Draw a fresh AR(1) panel. Watch fixed-effects ρ̂ pull toward zero as T shrinks, and watch Arellano-Bond ρ̂ stay close to the truth.
Forest Plot
Every coefficient from the post's four nested xtabond2 models with 95% CIs. Toggle which coefficients and which models to see; hover for SE and 95% CI.
Diagnostics & Mediation
AR(2) and Hansen J p-values for every model, plus the institutional-mediation pattern: the long-run War effect shrinks from −0.35 (Model 1) to −0.17 (Model 4) as controls enter.
Glossary (open a card if a term is unfamiliar)
Dynamic panel
y[i,t] = ρ y[i,t-1] + β x[i,t] + αᵢ + εᵢₜ. Captures inertia, but creates an identification problem under fixed effects.
Nickell bias
First-differencing
Δy[i,t] = y[i,t] − y[i,t−1]. The country-specific fixed effect αᵢ vanishes by construction. The launching pad for difference GMM.
Internal lag instruments
gmm(L.y, lag(2 6)).
Arellano-Bond GMM
E[Z' Δε] = 0. Implemented in Stata as xtabond2 ... noleveleq.
Long-run cumulative effect
nlcom using the delta method.
AR(2) test
Hansen J test
Estimand framing
Nickell Bias Simulator — fixed effects vs Arellano-Bond on AR(1) panels
The data-generating process is a simple AR(1) panel with country fixed
effects: y[i,t] = ρ y[i,t-1] + αᵢ + εᵢₜ. Slide the panel
length T, the true persistence ρ, and the noise level,
then click the bar chart cells to redraw. The grey bar is OLS on the
pooled data (biased upward by α). The orange bar is within-FE (biased
downward — the Nickell bias). The teal bar is Arellano-Bond difference
GMM with internal lag instruments. Watch how the bias gap closes
only as T grows.
What to look for
- Drag T to 4 or 5. The orange FE bar drifts well below the true ρ. This is Nickell bias. At T = 13 (the post's panel length) and ρ = 0.7, FE underestimates ρ by roughly (1+ρ)/T ≈ 0.13, which is far from negligible.
- Drag T up toward 30. The orange bar climbs toward the teal Arellano-Bond bar. Nickell bias is a short-panel problem; in long panels FE is fine.
- Set ρ ≈ 0.9. Even at T = 13 the FE bar undershoots noticeably. High persistence amplifies the bias.
- OLS (grey) sits well above the truth, regardless of T. The country fixed effects αᵢ are correlated with the lag, so pooled OLS is biased upward.
- Click "Reseed" to redraw with a new RNG seed. Arellano-Bond should stay close to the truth; FE should keep its short-panel bias.
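The three bars can be approximated in a few lines of plain Python. This is a minimal sketch, not the app's own code: the function names, the burn-in length, and the default parameters are all assumptions, and the instrumental-variable step is the simpler Anderson-Hsiao estimator (a single instrument, y[i,t−2], for Δy[i,t−1]) standing in for full Arellano-Bond GMM.

```python
import random


def simulate_panel(n_units, n_periods, rho, sigma_eps=1.0, sigma_alpha=1.0,
                   burn_in=50, seed=0):
    """Draw y[i,t] = rho*y[i,t-1] + alpha_i + eps[i,t] for each unit."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, sigma_alpha)
        y = alpha / (1.0 - rho)  # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = rho * y + alpha + rng.gauss(0.0, sigma_eps)
            if t >= burn_in:  # discard the burn-in draws
                series.append(y)
        panel.append(series)
    return panel


def pooled_ols(panel):
    """Slope of y[t] on y[t-1], pooling all units (ignores alpha_i)."""
    xs = [s[t - 1] for s in panel for t in range(1, len(s))]
    ys = [s[t] for s in panel for t in range(1, len(s))]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def within_fe(panel):
    """Slope after demeaning y[t] and y[t-1] within each unit."""
    sxy = sxx = 0.0
    for s in panel:
        x, y = s[:-1], s[1:]
        mx, my = sum(x) / len(x), sum(y) / len(y)
        sxy += sum((a - mx) * (b - my) for a, b in zip(x, y))
        sxx += sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def anderson_hsiao(panel):
    """IV on the differenced equation, instrumenting dy[t-1] with y[t-2]."""
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):
            z = s[t - 2]
            num += z * (s[t] - s[t - 1])
            den += z * (s[t - 1] - s[t - 2])
    return num / den


# Illustrative settings only; they are not taken from the post.
demo = simulate_panel(n_units=2000, n_periods=6, rho=0.5, seed=1)
print("pooled OLS:", round(pooled_ols(demo), 3))
print("within FE: ", round(within_fe(demo), 3))
print("AH IV:     ", round(anderson_hsiao(demo), 3))
```

With a short panel and many units, pooled OLS should land above the true ρ, within-FE below it, and the IV estimate in between and closer to the truth, though it is noisier than FE from draw to draw.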
The post's coefficient forest — interactively
These numbers come straight from the production Stata run captured in
regression_results.csv and longrun_effects.csv.
Toggle outcomes (coefficient rows) and methods (which of the four nested
models) to compare. Hover a point to see its standard error, 95% CI, and
the number of GMM-style instruments the model used.
What to look for
- The contemporaneous War row is the headline. All four models have point estimates between −0.16 and −0.24, all significantly below zero (t = −3.82 to −5.20). A Magnitude-7 war lowers log GDP per capita by 0.16 to 0.24, roughly a 15-21% drop in GDP per capita, within the same five-year window.
- L1.War and L2.War cross zero in every model. War's damage is contemporaneous, not delayed. Toggle off "War (contemporaneous)" and "Long-run sum War" to isolate the lagged War terms: none of them is individually significant.
- The Long-run sum War row is the substantive headline. All four CIs exclude zero; the point estimate shrinks monotonically from −0.35 (Model 1, no controls) to −0.17 (Model 4, with both Economic and Political Freedom). Roughly half the long-run war penalty is mediated by institutions.
- Coup contemporaneously hurts too. Coup (contemporaneous) is robustly negative and significant in every model. But the cumulative Coup effect only crosses t = 2 once Political Freedom is included (Models 3 and 4).
- L.lnGDPpercapita ≈ 0.62-0.68. GDP per capita is highly persistent. That high ρ is exactly what makes the Nickell bias on a static-FE estimator severe and Arellano-Bond essential.
Why is the long-run War effect different from the contemporaneous one?
The long-run cumulative effect is the sum β₀ + β₁ + β₂ of the
contemporaneous and two lagged War coefficients. Individually the lagged
coefficients are tiny and statistically zero; the sum
pools them with the contemporaneous coefficient. Because the lag-1 and
lag-2 coefficients are slightly negative on average, the sum is more
negative than the contemporaneous coefficient alone. Stata's
nlcom uses the delta method to get the SE of this sum
correctly, accounting for correlations between the three War
coefficients.
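For a plain sum, the delta method reduces to adding up every entry of the coefficients' covariance matrix. The sketch below shows that arithmetic in Python; the coefficient values and covariance matrix are made-up illustrative numbers, not the post's estimates, and `sum_with_se` is a hypothetical helper name.

```python
import math


def sum_with_se(coefs, vcov):
    """Return (b0 + b1 + ..., SE of the sum).

    The gradient of a sum is a vector of ones, so the delta-method
    variance is the total of all variances AND covariances in vcov.
    """
    total = sum(coefs)
    variance = sum(sum(row) for row in vcov)
    return total, math.sqrt(variance)


# Hypothetical values for War, L1.War, L2.War (NOT from the post).
coefs = [-0.20, -0.05, -0.03]
vcov = [
    [0.0016, -0.0004, 0.0001],
    [-0.0004, 0.0025, -0.0006],
    [0.0001, -0.0006, 0.0036],
]

total, se = sum_with_se(coefs, vcov)
naive_se = math.sqrt(sum(vcov[i][i] for i in range(3)))  # ignores covariances

print("long-run sum:", round(total, 3))
print("delta-method SE:", round(se, 4))
print("SE ignoring covariances:", round(naive_se, 4))
```

In this made-up example the off-diagonal terms are mostly negative, so the correct SE is smaller than the one obtained by ignoring covariances. That is how a sum can be significant even when the lagged terms are individually insignificant.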
Connecting back to Tab 2
The lagged-GDP coefficient L.lnGDPpercapita ≈ 0.68 you see above is the empirical counterpart of the slider you played with in Tab 2. The simulated AR(1) panel with ρ = 0.7 is essentially a stylised version of this real data. The reason the analysis uses Arellano-Bond and not plain fixed effects is exactly the bias gap you watched open and close as T moved.
Diagnostics & the institutional-mediation pattern
Two diagnostic tests validate Arellano-Bond GMM in every model: the AR(2) test (null: no second-order serial correlation in differenced residuals) and the Hansen J test (null: instruments are orthogonal to the error). Below, the left chart plots both p-values across the four models with a 0.05 reference line; the right chart shows how the long-run War effect shrinks as institutional controls enter, which is the article's main mediation finding.
AR(2) & Hansen J p-values
Every bar should sit above the dashed 0.05 line for the specification to pass.
Long-run War coefficient by model
Sum of contemporaneous + L1 + L2 War coefficients, with 95% CIs.
What to look for
- Model 1's AR(2) p = 0.091. Borderline — above 0.05 but below 0.10. Models 2-4 are all p > 0.6, comfortable.
- All four Hansen J p-values are in [0.13, 0.61]. None too low; none suspiciously near 1. The instrument set is plausibly orthogonal to the error in every model.
- The right-hand bars shrink monotonically from −0.35 to −0.17. That is the institutional-mediation pattern: each added control absorbs part of the war penalty. The shrinkage between Model 1 and Model 4 is 53% — roughly half the long-run penalty is mediated through institutions.
- Economic Freedom does the heavy lifting. Compare Model 2 (only Economic) to Model 3 (only Political): against −0.35 with no controls, long-run War is −0.27 in Model 2 and −0.22 in Model 3. The Political Freedom coefficient itself is not significant in any model.
Why Hansen J near 1 is a red flag
Counter-intuitively, a Hansen J p-value very close to 1 is worse
than a moderate one. Roodman (2009) documents this: when the instrument
count is too large relative to N, the estimated covariance matrix of
the moment conditions becomes imprecise or even singular, the test
loses power, and the p-value drifts toward 1 as an artifact. The
post's choice of lag(2 6) deliberately keeps the lag depth bounded,
which helps keep the J p-values in the moderate 0.13-0.61 range rather
than drifting up to 1.
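To get a feel for how a lag bound limits instrument growth, the sketch below counts uncollapsed lag instruments in a stylised difference-GMM setup. It assumes one instrument per (period, lag) pair and that the instruments are lags of the level y itself. The number xtabond2 reports for the post's models can differ, because it depends on the full specification, so treat `count_lag_instruments` as a hypothetical illustration rather than a reproduction of that count.

```python
def count_lag_instruments(n_periods, min_lag=2, max_lag=None):
    """Count (period, lag) instrument pairs for the differenced equation.

    Periods are numbered 1..n_periods. The differenced equation exists
    from t = 3, and a lag is usable only if the level y[t - lag] is
    observed, i.e. lag <= t - 1. max_lag=None means "all available lags".
    """
    total = 0
    for t in range(3, n_periods + 1):
        deepest = t - 1 if max_lag is None else min(max_lag, t - 1)
        total += max(0, deepest - min_lag + 1)
    return total


T = 13  # the post's panel length
print("lags 2-6:     ", count_lag_instruments(T, 2, 6))
print("all lags >= 2:", count_lag_instruments(T, 2))
```

Under these assumptions, T = 13 gives 45 instruments with lags 2 to 6 and 66 with unbounded lags. The bounded count grows linearly in T, while the unbounded count grows quadratically.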