Causal Machine Learning and the Resource Curse

Who does mining help? Heterogeneous treatment effects with Stata 19

+0.149 mining lifts nighttime lights
+0.405 high-price premium · non-linear
96.90 institutions moderate mining

Carlos Mendez

Nagoya University (GSID)

June 11, 2026

The Tension

Act I

One average effect hides who mining actually helps

The resource curse asks whether mineral wealth raises or wrecks development. Most studies answer with a single average effect.

But Hodler, Lechner & Raschky (2023) showed the answer bends with institutions. For whom does mining pay off — and where does it just bring conflict?

Stronger institutions, weaker mining benefit — the slope is the whole story

GATEs for the NTL mining effect (1-0) by executive constraints. Effect falls from 0.275 at the weakest constraints to 0.051 at the strongest.

Where we’re going

  • The estimand: heterogeneous effects \(\tau(\mathbf{x})\), not one ATE
  • A simulated resource-curse panel with known ground truth
  • Stata 19’s cate: ATE, then GATEs, then per-district effects
  • The lesson: institutions moderate the mining effect, but not the price premium

The Investigation

Act II

The estimand is a function of \(\mathbf{x}\), not a single number

\[\tau(\mathbf{x}) = E\{Y_i(1) - Y_i(0) \mid \mathbf{X}_i = \mathbf{x}\}\]

The CATE is the average effect for districts with covariate profile \(\mathbf{x}\). Where \(\tau(\mathbf{x})\) bends, mining helps some districts more than others.

A single ATE is just \(E\{\tau(\mathbf{X})\}\) — this function averaged over everyone, discarding exactly the heterogeneity we want.
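A worked decomposition shows what the averaging discards. Assume the districts split into groups \(g\) with population shares \(\pi_g\) and group-average effects \(\tau_g\); this notation and the two-group numbers below are illustrative, not taken from the panel.

\[E\{\tau(\mathbf{X})\} = \sum_g \pi_g\,\tau_g, \qquad \tau_g = E\{\tau(\mathbf{X}) \mid G = g\}\]

With two equally sized hypothetical groups, \(\tau_1 = 0.30\) and \(\tau_2 = 0.00\), the ATE is \(0.5 \times 0.30 + 0.5 \times 0.00 = 0.15\), a number that describes neither group.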

A 3,000-row lab with ground-truth effects we can check against

  • 300 districts × 10 years (2003–2012), 8 fictional countries
  • Outcomes — log nighttime lights (development) and a conflict indicator
  • Treatment — 4 levels: no mining (0), low / medium / high price (1/2/3)
  • Moderators — executive constraints (1–6) and quality of government

Because the data-generating process is known, every estimate can be scored against its true value — a luxury real data never gives.
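A minimal sketch of how such a lab could be built. The data-generating process here is hypothetical (randomized treatment, a linear effect in executive constraints, made-up coefficients) and is not the author's simulation; only the panel dimensions and variable roles follow the slide.

```stata
* Hypothetical DGP for illustration only; not the author's simulation.
clear
set seed 12345
set obs 300                                   // one row per district
gen int  district_id      = _n
gen byte country_id       = runiformint(1, 8)
gen byte exec_constraints = runiformint(1, 6)
expand 10                                     // 10 yearly rows per district
bysort district_id: gen int year = 2002 + _n  // 2003-2012

gen byte treatment = runiformint(0, 3)        // 0 = no mining, 1/2/3 = low/medium/high price
* Assumed true effect of any mining: larger where constraints are weak.
gen double tau_true = 0.40 - 0.05 * exec_constraints
gen double ntl_log  = 1 + tau_true * (treatment > 0) + rnormal(0, 0.5)

* Because tau_true is stored, any estimate can be scored against it.
summarize tau_true if treatment > 0
```

Storing `tau_true` alongside the outcome is the design choice that matters: it is what lets an estimate be compared with the truth row by row.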

A 4-level treatment becomes six binary CATE comparisons

Finding 1 — Mining effect

  • 1 vs 0 · low-price mining vs none
  • 2 vs 0 · medium-price mining
  • 3 vs 0 · high-price mining

Finding 2 — Price non-linearity

  • 2 vs 1 · medium vs low (small)
  • 3 vs 1 · high vs low (large)
  • 3 vs 2 · high vs medium

Stata’s cate needs a binary treatment, so we subset to two groups at a time and run a generalized random forest on each contrast.
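The subsetting can be sketched as a nested loop over the four treatment levels. The toy `treatment` variable is generated only to make the sketch self-contained, and the indicator names follow the `treat_1v0` pattern as an assumption; the estimation step is left as a comment.

```stata
* Toy data: a treatment coded 0/1/2/3 (stand-in for the real panel).
clear
set obs 400
gen byte treatment = mod(_n, 4)

local n_contrasts = 0
forvalues lo = 0/2 {
    local start = `lo' + 1
    forvalues hi = `start'/3 {
        preserve
        keep if inlist(treatment, `lo', `hi')        // two groups at a time
        gen byte treat_`hi'v`lo' = (treatment == `hi')
        quietly count if treat_`hi'v`lo'
        display "contrast `hi' vs `lo': " r(N) " treated of " _N " rows"
        * ... estimation for this contrast would go here ...
        restore
        local ++n_contrasts
    }
}
display "`n_contrasts' binary contrasts"
```

Using `preserve` and `restore` keeps the full four-level data intact between contrasts, so each comparison starts from the same panel.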

Two estimators residualize the nuisance, then read off the signal

\[y = d\cdot\tau(\mathbf{x}) + g(\mathbf{x},\mathbf{w}) + \epsilon, \qquad d = f(\mathbf{x},\mathbf{w}) + u\]

PO (Partialing-Out)

  • residualize \(y\) and \(d\) on \(\mathbf{x},\mathbf{w}\)
  • regress residual on residual
  • stays stable when propensities approach 0 or 1
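The three steps can be sketched with linear regressions standing in for the random forests, using Stata's bundled auto data. The variable roles are stand-ins (price as \(y\), foreign as \(d\), weight and mpg as \(\mathbf{x}\)), the effect is treated as constant, and cross-fitting is omitted. With linear nuisance models this is the Frisch-Waugh-Lovell result, so the residual-on-residual slope should match the one-step regression.

```stata
sysuse auto, clear
* Step 1: residualize the outcome on the covariates.
quietly regress price weight mpg
predict double y_res, residuals
* Step 2: residualize the treatment on the same covariates.
quietly regress foreign weight mpg
predict double d_res, residuals
* Step 3: regress residual on residual; the slope is the effect estimate.
regress y_res d_res
scalar b_po = _b[d_res]
* Check: same point estimate as the one-step regression.
quietly regress price foreign weight mpg
display "PO slope = " b_po "   full-regression slope = " _b[foreign]
```

Nothing in these steps divides by an estimated propensity, which is why extreme propensities do less damage here than in a reweighting estimator.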

AIPW (Augmented IPW)

  • outcome model + propensity reweight
  • doubly robust: one model can be wrong
  • the conservative default here
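In the notation of the model above, a standard textbook form of the AIPW score is shown below. The symbols \(\hat\Gamma_i\) and \(\hat\mu_d\) are introduced here for illustration, with \(\mu_d(\mathbf{x},\mathbf{w}) = d\cdot\tau(\mathbf{x}) + g(\mathbf{x},\mathbf{w})\); this is not necessarily the exact expression Stata computes internally.

\[\hat\Gamma_i = \hat\mu_1(\mathbf{x}_i,\mathbf{w}_i) - \hat\mu_0(\mathbf{x}_i,\mathbf{w}_i) + \frac{d_i\,\{y_i - \hat\mu_1(\mathbf{x}_i,\mathbf{w}_i)\}}{\hat f(\mathbf{x}_i,\mathbf{w}_i)} - \frac{(1-d_i)\,\{y_i - \hat\mu_0(\mathbf{x}_i,\mathbf{w}_i)\}}{1-\hat f(\mathbf{x}_i,\mathbf{w}_i)}\]

The first two terms are the outcome-model prediction; the last two reweight its residuals by the propensity. If either the outcome model or the propensity model is correct, the correction terms repair the other, and the denominators show why estimates become unstable when \(\hat f\) is close to 0 or 1.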

Six commands estimate a heterogeneous mining effect in Stata 19

keep if treatment == 1 | treatment == 0
gen byte treat_1v0 = (treatment == 1)
global catevars exec_constraints quality_of_govt gdp_pc elevation // moderators
cate aipw (ntl_log $catevars) (treat_1v0), ///
    controls(i.country_id i.year) ///
    rseed(12345) xfolds(5) omethod(rforest) tmethod(rforest)
categraph gateplot          // GATEs by subgroup
estat gatetest              // formal heterogeneity test

Raw means are biased: mining districts differ before any mine opens

Contrast   Naive diff   Ground truth
1 vs 0     0.109        0.25
3 vs 1     0.414        0.30
2 vs 1     0.099        0.05

Some confounders push the raw gap above the truth, some below — geography, institutions, and wealth all differ across mining status.
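A minimal sketch of how one confounder biases a raw gap. The confounder `x`, its coefficients, and the sample size are hypothetical; only the true effect of 0.25 mirrors the 1-vs-0 ground truth in the table.

```stata
* Hypothetical single confounder x; all other numbers are assumptions.
clear
set seed 2026
set obs 5000
gen double x = rnormal()
gen byte   d = runiform() < invlogit(-x)       // mining is likelier where x is low
gen double y = 0.25 * d + 0.5 * x + rnormal()  // true effect of d is 0.25
regress y d             // naive difference in means: pulled below 0.25
scalar b_naive = _b[d]
regress y d x           // adjusting for x should land near 0.25
scalar b_adj = _b[d]
display "naive = " b_naive "   adjusted = " b_adj
```

Here `x` raises the outcome but lowers the chance of mining, so the raw gap understates the effect. Flipping the sign inside `invlogit()` would push it above the truth instead.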

With cross-fit AIPW, mining raises nighttime lights by 0.149

+0.149

AIPW ATE, mining vs none (SE 0.011, \(p<0.001\)); PO gives +0.194 on the same contrast
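The reported estimate and standard error imply the following test statistic and interval, assuming the usual normal approximation:

\[z = \frac{0.149}{0.011} \approx 13.5, \qquad 0.149 \pm 1.96 \times 0.011 \approx [0.127,\ 0.171]\]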

Price effects don’t ramp — they jump only at the top

+0.405

High-vs-low price premium (AIPW 3-1, \(p<0.001\)); medium-vs-low is \(-0.011\), \(p=0.90\)

A random forest chooses controls — it cannot relax identification

Objection. Machine-learning the nuisance functions can’t conjure a causal effect from observational data.

Response. Correct. \(\tau(\mathbf{x})\) is identified only under conditional independence given \(\mathbf{X}\) (unconfoundedness) and adequate overlap.
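Written out in potential-outcome notation, the two conditions read as follows. Including the controls \(\mathbf{W}\) from the model above in the conditioning set, and writing \(D_i\) for the treatment, are assumptions about how the shorthand maps onto the model.

\[\{Y_i(0),\,Y_i(1)\} \perp D_i \mid \mathbf{X}_i, \mathbf{W}_i \qquad\text{and}\qquad 0 < \Pr(D_i = 1 \mid \mathbf{X}_i, \mathbf{W}_i) < 1\]

The first says that, among districts with the same covariates, mining status is as good as random. The second says that every covariate profile contains both mining and non-mining districts.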

The forest estimates \(g\) and \(f\), but it cannot rule out an omitted confounder. The 3-vs-2 contrast even failed on overlap, and reporting that failure rather than forcing an estimate is a feature of honest estimation.
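A simple overlap diagnostic can be sketched with a logit propensity model on Stata's bundled auto data. The variables are stand-ins (foreign as the treatment), the 0.05 and 0.95 cutoffs are illustrative, and this is not the check that `cate` itself performs.

```stata
sysuse auto, clear
logit foreign weight mpg                   // stand-in propensity model
predict double ps, pr                      // estimated Pr(foreign = 1)
tabstat ps, by(foreign) statistics(min p5 p50 p95 max)
count if ps < 0.05 | ps > 0.95             // rows with near-deterministic treatment
```

If one group's propensities pile up near 0 or 1, there are few comparable units on the other side, and any contrast in that region rests on extrapolation.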

The Resolution

Act III

Institutions moderate mining: the GATE test is decisive

96.90

\(\chi^2(5)\) for GATE equality by executive constraints (\(p<0.001\)) — effects are not homogeneous

A second institutional measure tells the identical story

GATEs for the NTL mining effect (1-0) by quality-of-government quartiles: 0.298 in the lowest quartile, 0.073 in the highest. \(\chi^2(3)=69.19\), \(p<0.001\).

Subpopulation ATEs make the moderation concrete: 0.297 vs 0.092

Districts                               ATE     SE      \(N\)
Weak institutions (exec \(\leq 2\))     0.297   0.022   558
Strong institutions (exec \(\geq 4\))   0.092   0.016   1,526

The mining effect is more than three times larger in weakly governed districts: the same slope, now as a policy-ready contrast.
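The arithmetic behind that contrast, using the table's estimates. The standard error of the difference treats the two subpopulation estimates as independent, which is an approximation because they share the same fitted nuisance models.

\[\frac{0.297}{0.092} \approx 3.2, \qquad 0.297 - 0.092 = 0.205, \qquad \sqrt{0.022^2 + 0.016^2} \approx 0.027, \qquad z \approx \frac{0.205}{0.027} \approx 7.5\]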

Prices behave oppositely — institutions do not bend the price premium

GATEs for the NTL price effect (3-1) by quality-of-government quartiles. No monotone slope; \(\chi^2(3)=5.81\), \(p=0.121\) — fails to reject equality.

And mining’s conflict effect is positive everywhere but flat across institutions

GATEs for the conflict mining effect (1-0) by executive constraints. All groups positive (0.033–0.106) but \(\chi^2(5)=5.00\), \(p=0.416\) — homogeneous.
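The reported \(p\)-values can be checked from the test statistics with Stata's upper-tail chi-squared function. The sketch only redoes the tail-probability arithmetic; it does not re-estimate anything.

```stata
display chi2tail(5, 96.90)   // mining effect by executive constraints
display chi2tail(3, 69.19)   // mining effect by quality-of-government quartiles
display chi2tail(3, 5.81)    // price premium by quality-of-government quartiles
display chi2tail(5, 5.00)    // conflict effect by executive constraints
```

The degrees of freedom are the number of groups minus one: six executive-constraint levels give 5, and four quartiles give 3.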

Heterogeneity isn’t just group-level: every district gets its own effect

Distribution of individualized treatment effects \(\hat\tau(\mathbf{x}_i)\) for the NTL mining effect, centered near the 0.15 ATE with substantial spread. estat heterogeneity: \(\chi^2(1)=53.05\), \(p<0.001\).

The IATE function slopes down smoothly in institutional quality

IATE function for the NTL mining effect as executive constraints rise, all other covariates at reference. The continuous downward trend confirms the GATE bars.

Same data, same command, opposite conclusions about moderation

Mining effect

  • GATE slope: steep, monotone
  • \(\chi^2(5)=96.90\), \(p<0.001\)
  • weak inst. 0.297 · strong 0.092

Price premium

  • GATE slope: flat, noisy
  • \(\chi^2(3)=5.81\), \(p=0.121\)
  • no institutional gradient

Test for heterogeneity — don’t average it away.