Regional Inequality and the Kuznets Curve — Interactive Lab

A pedagogical companion to Regional Inequality and the Kuznets Curve: Panel Fixed Effects in Python

Why fixed effects? Why the N-shape?

Does economic growth reduce regional inequality? Simon Kuznets predicted an inverted-U: inequality rises with early industrialisation, then falls. Lessmann and Seidel (2017) used satellite nighttime-light data from 180 countries (1992-2012) and found something different: the relationship is N-shaped. Inequality rises, falls, then rises again at very high incomes.

This app turns the post's three central ideas into knobs you can move. Sweep panel sample size and country heterogeneity to see how pooled OLS drifts away from the truth while TWFE stays close. Adjust the three cubic coefficients yourself and watch where the curve bends. Toggle determinants to see which controls move the headline Kuznets coefficients — and which leave them alone.

Pooled vs within: the same data, two stories

The animation below shows 6 countries (coloured lines) moving through time on a development-inequality plot. The grey dashed curve is what pooled OLS sees — the cross-sectional pattern across all countries. Notice how each individual country's trajectory has its own level. Pooled OLS confounds between-country heterogeneity with the within-country Kuznets relationship. Two-way fixed effects strips the country-specific levels away.

Each coloured line follows one country over 5 periods. The dashed grey curve is the pooled OLS fit through every country-period dot. Within-country slopes (the coloured segments) differ from the pooled slope; that gap is omitted-variable bias.
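The gap between the pooled slope and the within-country slope can be reproduced by hand on a toy panel. The sketch below uses made-up numbers for three hypothetical countries (not data from the post) and applies country demeaning only; TWFE would additionally subtract period means.

```python
# Three hypothetical countries, three periods each (made-up numbers).
# Each tuple holds (x values, y values); richer countries have higher y levels,
# but within every country y falls by 0.5 for each unit increase in x.
panel = {
    "A": ([0.0, 1.0, 2.0], [2.5, 2.0, 1.5]),
    "B": ([3.0, 4.0, 5.0], [8.5, 8.0, 7.5]),
    "C": ([6.0, 7.0, 8.0], [14.5, 14.0, 13.5]),
}


def slope(xs, ys):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Pooled OLS: stack every country-period dot and fit one line.
all_x = [x for xs, _ in panel.values() for x in xs]
all_y = [y for _, ys in panel.values() for y in ys]
pooled_slope = slope(all_x, all_y)

# Within estimator: subtract each country's own mean before fitting.
dx, dy = [], []
for xs, ys in panel.values():
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    dx += [x - mx for x in xs]
    dy += [y - my for y in ys]
within_slope = slope(dx, dy)

print(pooled_slope, within_slope)  # 1.75 and -0.5
```

The pooled line slopes upward because it mostly picks up differences in country levels, while the within slope recovers the negative relationship that holds inside every country.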

Tab 2

Panel FE Simulator

Simulate a panel with country fixed effects. Compare pooled OLS to TWFE estimates as you change sample size, heterogeneity, and signal. Run 100 simulations to see the bias-variance picture.

Tab 3

Turning Points

Adjust the three cubic coefficients $\beta_1, \beta_2, \beta_3$ and watch the N-shaped curve bend. Find the two turning points yourself — and see how the post's coefficients put them at \$2,287 and \$77,205.

Tab 4

Coefficient Stability

The post's robustness check, interactively. Toggle the 7 specifications and 3 polynomial terms to see when the N-shape holds — and watch the linear term drop from 0.293 to 0.149 once ethnic Gini is included.

Glossary (open a card if a term is unfamiliar)

Kuznets curve
The theoretical inverted-U relationship between development and inequality. Inequality rises with industrialisation, then falls. The null this post tests.
N-shaped relationship
A non-monotonic pattern with two turning points: rise, fall, rise. Captured by a cubic polynomial in log GDP.
Two-way fixed effects (TWFE)
A panel estimator that absorbs both country and period intercepts. Identification uses only within-country, within-period variation.
Pooled OLS
Treats every country-period observation as an independent draw, ignoring the panel structure. Confounds between-country and within-country variation.
Turning points
Income levels where ∂Gini/∂lnY = 0 — the slope of inequality with respect to income changes sign. Solve 3β₃x² + 2β₂x + β₁ = 0.
Within R²
The share of within-country, within-period variation in Gini that the polynomial explains. The honest fit number; overall R² is inflated by mechanical FE absorption.
Omitted variable bias (OVB)
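Bias from a confounder that affects both lnY and inequality. Pooled OLS suffers; TWFE removes any time-invariant country-level confounder (through the country effects) and any shock common to all countries in a period (through the period effects). Confounders that vary across both countries and time remain.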
Clustered standard errors
Standard errors that allow within-country correlation across periods. Implemented in PyFixest as vcov={"CRV1": "id"}.
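
Written out, the specification behind several of these entries looks as follows. The notation ($\alpha_i$ for country effects, $\gamma_t$ for period effects, $\varepsilon_{it}$ for the error) is assumed here for illustration rather than quoted from the post.

$$
\text{Gini}_{it} = \beta_1 \ln Y_{it} + \beta_2 (\ln Y_{it})^2 + \beta_3 (\ln Y_{it})^3 + \alpha_i + \gamma_t + \varepsilon_{it}
$$

In a balanced panel, TWFE is equivalent to running OLS after replacing every variable $z_{it}$ with its two-way demeaned version:

$$
\tilde{z}_{it} = z_{it} - \bar{z}_{i\cdot} - \bar{z}_{\cdot t} + \bar{z}_{\cdot\cdot}
$$

Pooled OLS leaves $\alpha_i$ and $\gamma_t$ in the error term, which is where the omitted variable bias comes from whenever $\alpha_i$ is correlated with $\ln Y_{it}$.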

Panel FE Simulator — pooled OLS vs TWFE

Simulate a panel with $N$ countries observed over $T$ periods. Each country has its own intercept (a country fixed effect), and the true relationship between log GDP and Gini is the post's cubic with $\beta_1 = 0.293$, $\beta_2 = -0.032$, $\beta_3 = 0.001$. Slide country heterogeneity up and watch pooled OLS drift away from the truth while TWFE stays put.

  • Countries (N). More countries ⇒ more cross-sectional variation; sharper estimates.
  • Periods (T). The post uses 5 periods (1990-2013). More T ⇒ richer within-country variation.
  • Country heterogeneity. Spread of country intercepts. 0 = identical countries, 1.5 = wildly different baselines.
  • Noise. Idiosyncratic noise on Gini. The post's residual SD ≈ 0.025.

Pooled OLS

Ignores country structure and treats each row as an independent observation. Reports β̂₁ on log GDP, β̂₂ on (log GDP)², and β̂₃ on (log GDP)³.

TWFE

Country and period demeaning, so only within variation is used. Reports the same three coefficients plus the within R².

What to look for

  • Slide country heterogeneity up. Pooled OLS coefficients drift away from the true (0.293, -0.032, 0.001) while TWFE estimates stay close. That gap is omitted variable bias.
  • Slide periods T down to 3. TWFE within-R² collapses because there is less within-country variation to identify the cubic. More T ⇒ better identification.
  • True coefficients are shown as dashed lines in the chart. Pooled OLS scatters around the wrong values; TWFE clusters near the truth.
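
The comparison described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the app's generator: the function names, the income range, the growth process, and the rule that country intercepts correlate 0.8 with income level are all hypothetical choices made here so that pooled OLS has something to be biased by. TWFE is implemented through two-way demeaning, which is exact for a balanced panel.

```python
import numpy as np

TRUE_BETA = np.array([0.293, -0.032, 0.001])  # cubic quoted in the text


def simulate_panel(n=150, t=5, het=0.5, noise=0.025, seed=0):
    """Return X of shape (n, t, 3) holding lnY, lnY^2, lnY^3, and Gini of shape (n, t)."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(5.5, 10.5, n)       # country income level (assumed range)
    growth = rng.normal(0.2, 0.15, n)      # country-specific trend (assumed)
    lny = base[:, None] + growth[:, None] * np.arange(t) + rng.normal(0, 0.2, (n, t))
    # Assumption: intercepts correlate with income level, which creates OVB.
    z = (base - base.mean()) / base.std()
    alpha = het * (0.8 * z + 0.6 * rng.normal(size=n))
    gamma = rng.normal(0, 0.01, t)         # common period shocks
    X = np.stack([lny, lny**2, lny**3], axis=-1)
    gini = X @ TRUE_BETA + alpha[:, None] + gamma[None, :] + rng.normal(0, noise, (n, t))
    return X, gini


def pooled_ols(X, y):
    """OLS with a single intercept, ignoring the panel structure."""
    A = np.column_stack([np.ones(y.size), X.reshape(-1, 3)])
    coef, *_ = np.linalg.lstsq(A, y.ravel(), rcond=None)
    return coef[1:]


def two_way_demean(a):
    """Subtract country and period means, add back the grand mean."""
    return (a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True)
            + a.mean(axis=(0, 1), keepdims=True))


def twfe(X, y):
    """Two-way fixed effects via demeaning; also returns the within R^2."""
    Xd = two_way_demean(X).reshape(-1, 3)
    yd = two_way_demean(y).ravel()
    coef, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    resid = yd - Xd @ coef
    within_r2 = 1.0 - (resid @ resid) / (yd @ yd)
    return coef, within_r2


X, gini = simulate_panel()
b_pooled = pooled_ols(X, gini)
b_twfe, r2_within = twfe(X, gini)
print("truth :", TRUE_BETA)
print("pooled:", b_pooled.round(4))
print("TWFE  :", b_twfe.round(4), "within R2:", round(r2_within, 3))
```

Because lnY, (lnY)² and (lnY)³ are strongly collinear, the three coefficients can be individually noisy in a single draw even when the fitted curve is well pinned down. A steadier summary is the implied slope at a mid-range income, for example β₁ + 2β₂·8 + 3β₃·64 at log GDP = 8.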

Bias vs variance across 100 simulations

One draw is noisy. Run the full panel simulation 100 times (same parameters, fresh shocks) to see whether pooled OLS bias is systematic and how variance compares.
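
A repeated-sampling version of the same idea might look like the sketch below. As before, the data-generating choices (income range, growth process, intercepts correlated 0.8 with income level) and the names `one_draw` and `monte_carlo` are assumptions made for illustration, not the app's internals.

```python
import numpy as np

BETA = np.array([0.293, -0.032, 0.001])  # "true" cubic quoted in the text


def demean2(a):
    """Two-way demeaning for a balanced array indexed (country, period, ...)."""
    return (a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True)
            + a.mean(axis=(0, 1), keepdims=True))


def one_draw(rng, n=150, t=5, het=0.5, noise=0.025):
    """Simulate one panel and return (pooled, twfe) coefficient vectors."""
    base = rng.uniform(5.5, 10.5, n)
    lny = (base[:, None] + rng.normal(0.2, 0.15, n)[:, None] * np.arange(t)
           + rng.normal(0, 0.2, (n, t)))
    z = (base - base.mean()) / base.std()
    alpha = het * (0.8 * z + 0.6 * rng.normal(size=n))  # assumed OVB design
    X = np.stack([lny, lny**2, lny**3], axis=-1)
    y = X @ BETA + alpha[:, None] + rng.normal(0, noise, (n, t))
    A = np.column_stack([np.ones(n * t), X.reshape(-1, 3)])
    pooled = np.linalg.lstsq(A, y.ravel(), rcond=None)[0][1:]
    fe = np.linalg.lstsq(demean2(X).reshape(-1, 3), demean2(y).ravel(), rcond=None)[0]
    return pooled, fe


def monte_carlo(reps=100, seed=0, **kwargs):
    """Same parameters, fresh shocks: returns two (reps, 3) arrays."""
    rng = np.random.default_rng(seed)
    draws = [one_draw(rng, **kwargs) for _ in range(reps)]
    return np.array([d[0] for d in draws]), np.array([d[1] for d in draws])


pooled, fe = monte_carlo()
for name, est in [("pooled OLS", pooled), ("TWFE", fe)]:
    bias = est.mean(axis=0) - BETA          # systematic error
    spread = est.std(axis=0)                # sampling variability
    print(name, "bias:", bias.round(4), "sd:", spread.round(4))
```

Bias is the distance between the average estimate and the truth; the standard deviation across replications is the variance side of the picture. Expect the cubic's individual coefficients to show a wide spread because the three powers of log GDP are highly collinear.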

Turning Points — bend the cubic yourself

The Kuznets curve's shape depends on three coefficients. Linear $\beta_1$ controls overall slope. Quadratic $\beta_2$ adds one bend (an inverted-U if negative). Cubic $\beta_3$ adds a second bend (an N if positive). Slide each coefficient and find the two income levels where the curve switches direction. The post's estimates put them at \$2,287 and \$77,205.

  • β₁ (linear). Post estimate: 0.293. Larger ⇒ steeper initial rise.
  • β₂ (quadratic). Post estimate: -0.0320. Negative ⇒ curve bends down.
  • β₃ (cubic). Post estimate: 0.00112. Positive ⇒ second upturn at high incomes.
  • Intercept. Vertical position of the curve. Cosmetic: it does not affect the turning points.

The readouts report turning point 1 (the peak) and turning point 2 (the trough), each as a log GDP value, the curve shape as classified by the discriminant, and the post's estimates (\$2,287 / \$77,205) as a target you can hit.

What to look for

  • Set β₃ = 0. The cubic term disappears. The curve becomes a quadratic with at most one turn — the classic Kuznets inverted-U. Only one turning point survives.
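  • Slide β₂ toward zero. The two turning points move toward each other and merge when β₂² = 3β₁β₃; past that point the curve is monotonic and the N-shape is gone. Slide β₂ the other way (more negative) and they spread apart, possibly beyond the data range (log GDP between 5.25 and 11.67).
  • Hit the post's targets. The post's estimates (β₁ ≈ 0.293, β₂ ≈ -0.032, β₃ ≈ 0.0011) put the turning points at \$2,287 and \$77,205, the values in §8 of the post. At these values the discriminant is close to zero, so the roots are sensitive to the last digits of β₂ and β₃; expect to fine-tune the sliders.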
  • Discriminant matters. When (2β₂)² − 12β₁β₃ < 0, the cubic has no real turning points — it is monotonic. The N requires the discriminant to be positive.
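
The discriminant rule and the root formula in the bullets above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the app's code; the function name `turning_points` and the shape labels are made up here.

```python
import math


def turning_points(b1, b2, b3):
    """Classify y = b1*x + b2*x^2 + b3*x^3 and return its turning points in log GDP.

    Turning points solve 3*b3*x^2 + 2*b2*x + b1 = 0.
    """
    if b3 == 0:
        # No cubic term: a quadratic has at most one turning point.
        if b2 == 0:
            return "monotonic", []
        shape = "inverted-U" if b2 < 0 else "U"
        return shape, [-b1 / (2 * b2)]
    disc = (2 * b2) ** 2 - 12 * b1 * b3
    if disc <= 0:
        # No sign change in the slope: the cubic never turns.
        return "monotonic", []
    root = math.sqrt(disc)
    xs = sorted([(-2 * b2 - root) / (6 * b3), (-2 * b2 + root) / (6 * b3)])
    shape = "N (rise, fall, rise)" if b3 > 0 else "inverted N (fall, rise, fall)"
    return shape, xs


# Slider values quoted in the text (rounded post estimates).
shape, log_tps = turning_points(0.293, -0.0320, 0.00112)
print(shape)
for x in log_tps:
    print(f"log GDP = {x:.3f}  ->  income of about ${math.exp(x):,.0f}")

# Dropping the cubic term leaves the classic single-peak Kuznets curve.
print(turning_points(0.293, -0.0320, 0.0))
```

With the rounded slider values the roots land in the same neighbourhood as the post's \$2,287 and \$77,205 rather than exactly on them, because the discriminant is small and the result is sensitive to the last digits of the coefficients.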

Coefficient stability across specifications

Does the N-shape survive when we add controls? The post estimates seven specifications: the pooled OLS version, the baseline TWFE Kuznets cubic, and five TWFE determinant models adding resources, trade, mobility, aid/education, or ethnicity. Toggle the specifications and polynomial terms to see when the N-shape holds, and watch the linear coefficient halve from 0.293 to 0.149 once ethnic Gini enters.

What to look for

  • Toggle off "+ Ethnicity" and the remaining six specifications cluster between 0.17 and 0.35 for β₁. Switch it back on and the linear coefficient drops to 0.149 — ethnic Gini absorbs almost half the apparent Kuznets effect.
  • Hover the cubic-term row. The "+ Aid/Educ" specification loses significance (its 95% CI crosses zero) because missing data cut the sample to N = 585.
  • Sign pattern (+, -, +) is preserved across all six TWFE rows. The N-shape is robust — only the magnitude moves with controls.

Determinant effects (Table 4)

Nine candidate determinants of regional inequality, each estimated in a TWFE specification that controls for the Kuznets cubic. Ethnic Gini is the single strongest driver — 3.9 times larger than the next biggest positive effect (resource rents). Solid bars are significant at p < 0.10; faded bars are not.

Connecting back to Tab 3

The headline N-shape in Tab 3 came from $\beta_1 = 0.293$, $\beta_2 = -0.032$, $\beta_3 \approx 0.0011$, the TWFE baseline row in the forest plot above. The "+ Ethnicity" row gives a flatter N with $\beta_1 = 0.149$, so, other things equal, its turning points sit further apart. The shape survives, but the magnitude of the apparent development-inequality link depends on which confounders you control for.
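
A short derivation, assuming $\beta_3 > 0$ and a positive discriminant, shows why a smaller $\beta_1$ pushes the turning points apart. The two roots of $3\beta_3 x^2 + 2\beta_2 x + \beta_1 = 0$ are

$$
x_{1,2} = \frac{-2\beta_2 \mp \sqrt{4\beta_2^2 - 12\beta_1\beta_3}}{6\beta_3}
$$

so the distance between them is

$$
x_2 - x_1 = \frac{\sqrt{4\beta_2^2 - 12\beta_1\beta_3}}{3\beta_3}
$$

Holding $\beta_2$ and $\beta_3$ fixed, lowering $\beta_1$ enlarges the term under the square root and widens the gap. This is only the other-things-equal direction: the "+ Ethnicity" row's own $\beta_2$ and $\beta_3$ are not quoted here, and they also move the roots.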

The §11 takeaway in the post is therefore visible here twice: once as the sign-stability across all six TWFE rows (the N is robust), and once as the magnitude attenuation when ethnic Gini enters (development is partly proxying for ethnic income gaps).