Introduction to Panel Data Methods in Python

What does panel data buy you?

Pooled OLS on a worker wage panel says joining a union raises wages by roughly 7%. The within (fixed-effects) estimator on the very same data says about 21% — almost three times larger. That factor-of-three gap is the empirical signature of selection on unobservables, and panel methods exist to peel it apart.

This app turns the post's three central ideas into knobs you can move. Watch the within transformation demean three workers in real time. Sweep panel sample size and unit heterogeneity to see how pooled OLS drifts from the truth while fixed effects stays close. Toggle the seven canonical estimators in the forest plot. And drive the Hausman statistic yourself by changing the FE/RE coefficient gap.

The within transformation, animated

Three workers, two periods each. Alice (steel) earns high wages and never joins the union. Bob (teal) switches into the union between periods. Carla (orange) earns low wages and is always in the union. The dashed grey line is the pooled OLS fit, which is shallow because between-worker differences dominate. Wait a few seconds: the points slide as each worker's two-year mean is subtracted. After demeaning, only Bob retains any variation in union status, and the FE slope (orange) is much steeper.

The animation cycles between raw and demeaned coordinates on a 10-second loop. Demeaning removes two of the three workers from the comparison: Alice and Carla collapse to the origin because they never switched. Only switchers identify the FE coefficient. That is the post's central asymmetry.
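The demeaning step in the animation can be sketched in a few lines of plain Python. The wage numbers below are made up for illustration (they are not the app's values); they are chosen only so that Alice is high-wage and never unionized, Carla is low-wage and always unionized, and Bob switches.

```python
# Hypothetical log wages and union status: three workers, two periods each.
panel = {
    "Alice": {"union": [0, 0], "wage": [2.70, 2.70]},
    "Bob":   {"union": [0, 1], "wage": [2.40, 2.90]},
    "Carla": {"union": [1, 1], "wage": [2.50, 2.50]},
}

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def demean(values):
    """Subtract the worker's own mean from each of their observations."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Raw coordinates: stack all six worker-period rows.
x_raw = [u for w in panel.values() for u in w["union"]]
y_raw = [v for w in panel.values() for v in w["wage"]]

# Demeaned coordinates: Alice and Carla become (0, 0) in both periods.
x_dm = [u for w in panel.values() for u in demean(w["union"])]
y_dm = [v for w in panel.values() for v in demean(w["wage"])]

pooled = slope(x_raw, y_raw)   # driven mostly by between-worker differences
within = slope(x_dm, y_dm)     # driven only by Bob, the switcher

print(f"pooled slope: {pooled:.3f}")
print(f"within slope: {within:.3f}")
```

With these toy numbers the pooled slope is close to zero while the within slope equals Bob's own wage change, which is the asymmetry the animation shows.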

Between vs within: how much variation does FE have to work with?

For each variable in the post's wage panel (N = 2,199 workers; T = 2), we decompose total variance into a between part (variation across workers) and a within part (variation over time inside a worker). FE only uses the within slice.

Schooling has zero within-variation in this two-period sample — no worker's education changes between 2010 and 2012, so FE mechanically drops schooling. Union has only 6% within-variation: that thin slice is what FE uses to identify the union premium. Less data, different (and arguably cleaner) parameter.
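A minimal sketch of the decomposition, assuming a balanced panel stored as an N x T array. The data are simulated stand-ins, not the post's wage panel, and the 10% switching rate is an arbitrary assumption, so the shares will not reproduce the figures quoted above.

```python
import numpy as np

def variance_shares(x):
    """Split total variation in x (shape N x T) into between and within shares."""
    unit_means = x.mean(axis=1, keepdims=True)
    total_ss = ((x - x.mean()) ** 2).sum()
    within_ss = ((x - unit_means) ** 2).sum()
    between_ss = x.shape[1] * ((unit_means - x.mean()) ** 2).sum()
    return between_ss / total_ss, within_ss / total_ss

rng = np.random.default_rng(0)
n_workers = 2199

# Schooling never changes within a worker: same value in both periods.
schooling = np.repeat(rng.integers(8, 19, size=(n_workers, 1)), 2, axis=1)

# Union status: most workers keep their status, an assumed 10% switch.
union = np.repeat(rng.integers(0, 2, size=(n_workers, 1)), 2, axis=1)
switchers = rng.random(n_workers) < 0.10
union[switchers, 1] = 1 - union[switchers, 0]

school_between, school_within = variance_shares(schooling)
union_between, union_within = variance_shares(union)

print(f"schooling within share: {school_within:.3f}")
print(f"union within share:     {union_within:.3f}")
```

A within share of exactly zero, as for schooling here, is what makes FE drop a regressor: after demeaning there is nothing left to regress on.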

Tab 2

Panel DGP Simulator

Simulate a panel with unit fixed effects. Slide unit heterogeneity up and watch POLS drift away from the truth while FE stays put. Run 100 simulations for a bias-variance picture.

Tab 3

Seven-Method Forest Plot

The post's headline figure, interactively. Toggle outcomes and methods. POLS/Between/RE land at 7-11%; FDFE/FE/TWFE/CRE land at 21%. Three-fold gap, one dataset.

Tab 4

Hausman vs Mundlak

Drive the Hausman chi-square yourself by changing the FE/RE coefficient gap and the standard errors. See why the test has low power when within-variation is thin.

Glossary (open a card if a term is unfamiliar)

Within transformation

Subtract each unit's time-series mean from each observation: \(\tilde y_{it} = y_{it} - \bar y_i\). The unit-specific intercept vanishes — only over-time variation remains.

Between estimator

Collapse each unit to its mean over time, then OLS across units. Uses cross-sectional variation only. The mirror image of FE.

Fixed effects (FE)

OLS on the demeaned data. Identifies beta from within-unit changes only. Drops time-invariant regressors (schooling, gender) mechanically.

First differences (FDFE)

Regress Δy on Δx. With T = 2 it gives the same coefficient as FE. With T > 2 the two estimates generally differ, and which one is more efficient depends on how serially correlated the errors are.
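The T = 2 equivalence can be checked numerically. This is a sketch on simulated data with an assumed data-generating process; it compares FE without time effects against first differences without an intercept, which is the pairing for which the two coincide exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
alpha = rng.normal(size=(n, 1))                  # unit effects
x = rng.normal(size=(n, 2)) + alpha              # regressor correlated with alpha
y = 0.21 * x + alpha + rng.normal(scale=0.5, size=(n, 2))

# FE: demean within each unit, then OLS through the origin.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
beta_fe = (xd * yd).sum() / (xd ** 2).sum()

# FD: regress the change in y on the change in x, no intercept.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_fd = (dx * dy).sum() / (dx ** 2).sum()

print(f"FE: {beta_fe:.6f}   FD: {beta_fd:.6f}")
```

With two periods each demeaned value is plus or minus half the first difference, so the two ratios are algebraically identical.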

Two-way FE (TWFE)

Absorb both unit effects and time effects. Standard for short panels with macro shocks. Closes the FD-FE gap induced by aggregate trends.

Random effects (RE)

Treats the unit effect as a random draw uncorrelated with regressors. GLS-weighted combination of between and within. Efficient if assumption holds; biased if not.

Mundlak / CRE

Add each unit's mean of every time-varying regressor as an extra control, then run RE. The coefficient on the original variable equals the FE estimate. The coefficient on the added mean (the Mundlak term) tests the RE assumption.

Hausman test

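Compares FE and RE: \(H = (\beta_{FE} - \beta_{RE})' (V_{FE} - V_{RE})^{-1} (\beta_{FE} - \beta_{RE}) \sim \chi^2(k)\) under the null that RE is consistent, where k is the number of coefficients compared. Large H rejects RE.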

Panel DGP Simulator — POLS vs FE on simulated data

Simulate a panel with N workers observed over T periods. Each worker has their own intercept (the unit fixed effect alpha_i) and a treatment x_it that is correlated with that intercept — so POLS will be biased. The true treatment coefficient is beta = 0.21 (matching the post's FE estimate). Slide unit heterogeneity up and watch POLS drift away from 0.21 while FE stays close.
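One way such a simulator could be written is sketched below. The data-generating process, the noise scale of 0.5, and the function names are assumptions for illustration; the app's own scaling of the sliders is not documented here, so the magnitudes of the POLS bias will differ from what the app displays.

```python
import numpy as np

def simulate_panel(n=200, t=2, heterogeneity=1.0, rho=0.7, beta=0.21, seed=0):
    """Draw one panel; returns x and y, each with shape (n, t)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n, 1))          # standardized worker effect
    alpha = heterogeneity * a            # unit fixed effect alpha_i
    # x_it has unit variance and correlation rho with the worker effect.
    x = rho * a + np.sqrt(1 - rho ** 2) * rng.normal(size=(n, t))
    y = beta * x + alpha + rng.normal(scale=0.5, size=(n, t))
    return x, y

def pols_slope(x, y):
    """Pooled OLS slope with an intercept, ignoring the panel structure."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / (xc ** 2).sum()

def fe_slope(x, y):
    """Within estimator: demean by worker, then OLS through the origin."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

x, y = simulate_panel(n=2199, t=2, heterogeneity=1.0, rho=0.7)
print(f"POLS: {pols_slope(x, y):.3f}   FE: {fe_slope(x, y):.3f}   truth: 0.210")

# With rho = 0 the worker effect is unrelated to x and POLS is no longer biased.
x0, y0 = simulate_panel(n=2199, t=2, heterogeneity=1.0, rho=0.0)
print(f"rho = 0 -> POLS: {pols_slope(x0, y0):.3f}   FE: {fe_slope(x0, y0):.3f}")
```

In this parameterization the POLS bias is roughly heterogeneity times rho, which is why both sliders matter: either one at zero removes the bias.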

Workers N 200

More workers ⇒ tighter estimates. Post sample: 2199.

Periods T 2

Post uses T = 2. More T ⇒ richer within-worker variation.

Unit heterogeneity 1.00

Spread of worker intercepts. 0 = identical workers (POLS unbiased), 2 = wildly heterogeneous (POLS very biased).

Selection ρ 0.70

Correlation between unit FE and treatment x. The omitted-variable bias driver.

Pooled OLS

Ignores panel structure. Each row is treated as independent.

Readouts: beta-hat, bias (vs true 0.21), R squared.

Fixed Effects

Unit demeaning. Within-only variation.

Readouts: beta-hat, bias (vs true 0.21), within R squared.

What to look for

Slide unit heterogeneity to 2.0. POLS drifts above 0.30 because high-FE workers cluster at high x; the bias is the post's omitted-variable bias made visible.
Slide selection ρ to 0. POLS becomes unbiased — the two methods agree. This is the RE assumption: no correlation between alpha_i and x.
Slide T to 8. FE's within R squared rises sharply; the FE estimate's confidence interval shrinks. With T = 2 (the post's choice), FE is consistent but high-variance.
True coefficient 0.21 is the dashed grey line. POLS scatters above the truth; FE clusters near it.

Bias vs variance across 100 simulations

One draw is noisy. Run the panel simulation 100 times with the current sliders to see whether POLS bias is systematic and how variance compares to FE.
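The repeated-draw experiment might look like the following sketch. It assumes the same kind of data-generating process as the single-draw simulator (worker effect correlated with x, noise scale 0.5); those choices and the helper name `one_draw` are illustrative, not the app's code.

```python
import numpy as np

def one_draw(rng, n=200, t=2, heterogeneity=1.0, rho=0.7, beta=0.21):
    """Simulate one panel and return (POLS estimate, FE estimate)."""
    a = rng.normal(size=(n, 1))
    x = rho * a + np.sqrt(1 - rho ** 2) * rng.normal(size=(n, t))
    y = beta * x + heterogeneity * a + rng.normal(scale=0.5, size=(n, t))
    # Pooled OLS: center on the grand means.
    xc, yc = x - x.mean(), y - y.mean()
    pols = (xc * yc).sum() / (xc ** 2).sum()
    # Fixed effects: center on each worker's own mean.
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    fe = (xd * yd).sum() / (xd ** 2).sum()
    return pols, fe

rng = np.random.default_rng(42)
draws = np.array([one_draw(rng) for _ in range(100)])   # shape (100, 2)

bias = draws.mean(axis=0) - 0.21        # systematic error of each estimator
spread = draws.std(axis=0, ddof=1)      # sampling variability across draws

for name, b, s in zip(["POLS", "FE"], bias, spread):
    print(f"{name}: mean bias {b:+.3f}, sd {s:.3f}")
```

Averaging over draws separates the two failure modes: a mean bias that does not shrink with more simulations is systematic, while the standard deviation shows how noisy a single draw is.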

Hausman vs Mundlak — when should you trust RE?

The Hausman statistic compares the FE and RE coefficient estimates, scaled by the difference in their variances. With a single coefficient it reduces to H = (beta_FE − beta_RE)² / (V_FE − V_RE), where each V is the squared standard error, and H is compared with a χ²(1) distribution. Large H ⇒ FE and RE disagree ⇒ reject RE. The catch: when within-variation is thin, V_FE is large and the test becomes mechanically underpowered. The Mundlak alternative avoids this trap.
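For one coefficient the statistic and its p-value need only the standard library. The sketch below uses the identity that a χ²(1) tail probability equals erfc(sqrt(H / 2)); the function name `hausman_scalar` is a hypothetical helper, and the inputs are the post estimates listed on this page.

```python
import math

def hausman_scalar(beta_fe, beta_re, se_fe, se_re):
    """Single-coefficient Hausman statistic and its chi-square(1) p-value."""
    var_gap = se_fe ** 2 - se_re ** 2
    if var_gap <= 0:
        # The statistic is undefined unless FE is the noisier estimator.
        raise ValueError("V_FE - V_RE must be positive")
    h = (beta_fe - beta_re) ** 2 / var_gap
    p = math.erfc(math.sqrt(h / 2))   # P(chi2 with 1 df > h)
    return h, p

h, p = hausman_scalar(0.2103, 0.1092, 0.0812, 0.0299)
print(round(h, 2), round(p, 2))   # 1.79 0.18
print("reject RE" if h > 3.84 else "fail to reject RE")
```

The guard on the variance gap matters in practice: in finite samples the estimated difference can come out negative, in which case the statistic has no meaningful interpretation.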

FE coefficient β̂_FE 0.2103

Post estimate: 0.2103.

RE coefficient β̂_RE 0.1092

Post estimate: 0.1092.

FE standard error 0.0812

Post estimate: 0.0812. Slide down to see the test pick up power.

RE standard error 0.0299

Post estimate: 0.0299. RE is more efficient than FE when its assumptions hold.

Readouts:

β̂_FE − β̂_RE: point-estimate gap

Hausman H: χ²(1) under H₀: RE consistent

p-value: small p ⇒ reject RE

Verdict at α = 0.05: compare H to 3.84

What to look for

Snap to post values (β̂_FE = 0.2103, β̂_RE = 0.1092, SE_FE = 0.0812, SE_RE = 0.0299). H = 1.79, p = 0.18. The test fails to reject RE — but only because SE_FE is huge.
Halve SE_FE to 0.04. Same coefficient gap, but now H rises to about 14.5 and p falls to roughly 0.0001. The "fail to reject" verdict was a power problem, not a signal that RE is correct.
Set β̂_FE = β̂_RE. H drops to 0 and p = 1. When the two estimators agree, there is no evidence against RE — which is exactly the test's logic.
The Mundlak alternative (post §13) reaches p = 0.072 on the same data, telling a more nuanced story than Hausman's 0.18 because it does not pay the same noise penalty.

Why does Mundlak beat Hausman in low-power settings?

Mundlak's specification adds x_bar (the unit mean of the time-varying regressor) to an RE regression. The coefficient on x_bar is a direct test of correlation between the unit effect and the regressor — the very assumption RE relies on. Crucially, Mundlak's standard error does not contain the V_FE − V_RE term that makes Hausman noisy in thin-within-variation panels.

In the post's data, Hausman gives p = 0.18 (fail to reject); Mundlak's t-stat on union_bar gives p = 0.072 (borderline reject). Same data, same direction of effect — different precision. For a practitioner, CRE/Mundlak is usually the right specification to lead with: you get the FE coefficient on time-varying treatments, the RE structure for keeping schooling and gender, and a built-in spec test that beats Hausman.
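The Mundlak device can be illustrated on simulated data. As a simplification, this sketch fits the augmented regression by plain pooled OLS rather than RE GLS, and the data-generating process is an assumption; it is meant to show the mechanics, not to reproduce the post's p-values.

```python
import numpy as np

rng = np.random.default_rng(2)
n, t = 1000, 2
alpha = rng.normal(size=(n, 1))                       # unit effect
x = 0.7 * alpha + rng.normal(size=(n, t))             # x correlated with alpha
y = 0.21 * x + alpha + rng.normal(scale=0.5, size=(n, t))

# Mundlak term: each unit's mean of x, repeated for every period.
x_bar = np.repeat(x.mean(axis=1, keepdims=True), t, axis=1)

# Augmented regression: y on a constant, x, and x_bar.
X = np.column_stack([np.ones(n * t), x.ravel(), x_bar.ravel()])
coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
beta_x, gamma_bar = coef[1], coef[2]

# FE estimate on the same data, for comparison.
xd = x - x_bar
yd = y - y.mean(axis=1, keepdims=True)
beta_fe = (xd * yd).sum() / (xd ** 2).sum()

print(f"coefficient on x:     {beta_x:.6f}")
print(f"FE coefficient:       {beta_fe:.6f}")
print(f"coefficient on x_bar: {gamma_bar:.3f}")
```

The coefficient on x matches FE because, once x_bar is controlled for, only within-unit variation in x is left. A coefficient on x_bar that is clearly away from zero is the signal that the unit effect is correlated with the regressor.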

Connecting back to Tab 2

The Tab 2 simulator lets you drive the FE-vs-POLS bias through the selection ρ slider. The Hausman test on that simulated data would reject RE if ρ is large enough — but only if you also have enough within-variation (high T) to make SE_FE small. Cranking ρ up while keeping T = 2 reproduces exactly the post's "Hausman fails to reject but the FE-RE gap is huge" situation.
