What does panel data buy you?
Pooled OLS on a worker wage panel says joining a union raises wages by roughly 7%. The within (fixed-effects) estimator on the very same data says about 21% — almost three times larger. That factor-of-three gap is the empirical signature of selection on unobservables, and panel methods exist to peel it apart.
This app turns the post's three central ideas into knobs you can move. Watch the within transformation demean three workers in real time. Sweep panel sample size and unit heterogeneity to see how pooled OLS drifts from the truth while fixed effects stays close. Toggle the seven canonical estimators in the forest plot. And drive the Hausman statistic yourself by changing the FE/RE coefficient gap.
The within transformation, animated
Three workers, two periods each. Alice (steel) earns high wages and never joins the union. Bob (teal) switches into the union between periods. Carla (orange) earns low wages and is always in the union. The dashed grey line is the pooled OLS fit, which is shallow because between-worker differences dominate. Wait a few seconds: the points slide as each worker's two-period mean is subtracted. After demeaning, only Bob still varies in union status, and the FE slope (orange) is much steeper.
The animation cycles between raw and demeaned coordinates on a 10-second loop. Demeaning removes two of the three workers from the estimate: Alice and Carla collapse to the origin because their union status never changes. Only switchers identify the FE coefficient, and that is the post's central asymmetry.
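The demeaning step in the animation can be sketched in a few lines of NumPy. The wage figures below are hypothetical, chosen only so that the pooled fit is shallow and Bob's switch is worth 0.21 log points; they are not the post's data, and the helper names are illustrative.

```python
import numpy as np

# Hypothetical log wages and union status for three workers, two periods each.
# Illustrative numbers only; they are not taken from the post's data.
workers = np.array(["Alice", "Alice", "Bob", "Bob", "Carla", "Carla"])
union = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
log_wage = np.array([2.62, 2.62, 2.45, 2.66, 2.54, 2.54])


def demean_by_unit(values, units):
    """Subtract each unit's own mean from its observations."""
    out = np.empty_like(values, dtype=float)
    for u in np.unique(units):
        mask = units == u
        out[mask] = values[mask] - values[mask].mean()
    return out


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / (xc @ xc))


union_w = demean_by_unit(union, workers)    # zero for Alice and Carla
wage_w = demean_by_unit(log_wage, workers)

pooled_slope = slope(union, log_wage)       # uses all variation
fe_slope = slope(union_w, wage_w)           # uses within-worker variation only

print(f"pooled OLS slope: {pooled_slope:.3f}")
print(f"within (FE) slope: {fe_slope:.3f}")
```

In this toy panel the FE slope is simply Bob's wage change, because he is the only worker whose demeaned union status is nonzero.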
Between vs within: how much variation does FE have to work with?
For each variable in the post's wage panel (N = 2,199 workers; T = 2), we decompose total variance into a between part (variation across workers) and a within part (variation over time inside a worker). FE only uses the within slice.
Schooling has zero within-variation in this two-period sample — no worker's education changes between 2010 and 2012, so FE mechanically drops schooling. Union has only 6% within-variation: that thin slice is what FE uses to identify the union premium. Less data, different (and arguably cleaner) parameter.
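A between/within split of this kind can be computed from sums of squares. The sketch below assumes the decomposition is done on squared deviations around the grand mean and the unit means; the function name and the four-worker panel are hypothetical, not the post's code or data.

```python
import numpy as np


def between_within_shares(values, units):
    """Split total variation into between-unit and within-unit shares.

    The total sum of squares around the grand mean equals the between part
    (unit means around the grand mean) plus the within part (observations
    around their own unit mean).
    """
    values = np.asarray(values, dtype=float)
    units = np.asarray(units)
    grand_mean = values.mean()
    unit_means = np.empty_like(values)
    for u in np.unique(units):
        mask = units == u
        unit_means[mask] = values[mask].mean()
    ss_between = float(((unit_means - grand_mean) ** 2).sum())
    ss_within = float(((values - unit_means) ** 2).sum())
    total = ss_between + ss_within
    return ss_between / total, ss_within / total


# Hypothetical four-worker, two-period panel (not the post's data).
ids = np.array([1, 1, 2, 2, 3, 3, 4, 4])
schooling = np.array([12, 12, 16, 16, 10, 10, 14, 14])  # never changes
union = np.array([0, 0, 0, 1, 1, 1, 0, 0])              # one switcher

school_between, school_within = between_within_shares(schooling, ids)
union_between, union_within = between_within_shares(union, ids)

print(f"schooling within share: {school_within:.2f}")
print(f"union within share: {union_within:.2f}")
```

A variable with a within share of zero, like schooling here, leaves FE nothing to estimate from.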
Panel DGP Simulator
Simulate a panel with unit fixed effects. Slide unit heterogeneity up and watch POLS drift away from the truth while FE stays put. Run 100 simulations for a bias-variance picture.
Seven-Method Forest Plot
The post's headline figure, made interactive. Toggle outcomes and methods. POLS/Between/RE land at 6–11%; FDFE/FE/TWFE/CRE land at 21%. Three-fold gap, one dataset.
Hausman vs Mundlak
Drive the Hausman chi-square yourself by changing the FE/RE coefficient gap and the standard errors. See why the test has low power when within-variation is thin.
Glossary (open a card if a term is unfamiliar)
Within transformation
Between estimator
Fixed effects (FE)
First differences (FDFE)
Two-way FE (TWFE)
Random effects (RE)
Mundlak / CRE
Hausman test
Panel DGP Simulator — POLS vs FE on simulated data
Simulate a panel with N workers observed over T periods. Each worker has their own intercept (the unit fixed effect alpha_i) and a treatment x_it that is correlated with that intercept — so POLS will be biased. The true treatment coefficient is beta = 0.21 (matching the post's FE estimate). Slide unit heterogeneity up and watch POLS drift away from 0.21 while FE stays close.
Pooled OLS
Ignores panel structure. Each row is treated as independent.
Fixed Effects
Unit demeaning. Within-only variation.
What to look for
- Slide unit heterogeneity to 2.0. POLS drifts above 0.30 because high-FE workers cluster at high x; the bias is the post's omitted-variable bias made visible.
- Slide selection ρ to 0. POLS becomes unbiased — the two methods agree. This is the RE assumption: no correlation between alpha_i and x.
- Slide T to 8. FE's within R-squared rises sharply and the FE estimate's confidence interval shrinks. With T = 2 (the post's choice), FE is consistent but high-variance.
- True coefficient 0.21 is the dashed grey line. POLS scatters above the truth; FE clusters near it.
Bias vs variance across 100 simulations
One draw is noisy. Run the panel simulation 100 times with the current sliders to see whether POLS bias is systematic and how variance compares to FE.
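A repeated-draw experiment of this kind might look like the sketch below. The data-generating process is an assumption made for illustration (x_it = rho * alpha_i + noise, with `sigma_alpha` standing in for the unit-heterogeneity slider); the app's actual parameterization may differ, and the function names are hypothetical.

```python
import numpy as np

TRUE_BETA = 0.21  # the simulator's stated true coefficient


def simulate_once(rng, n_units=500, n_periods=2, rho=0.5, sigma_alpha=1.0):
    """Draw one panel and return (POLS slope, FE slope).

    Assumed DGP: x_it = rho * alpha_i + noise,
                 y_it = alpha_i + beta * x_it + eps_it.
    """
    alpha = rng.normal(0.0, sigma_alpha, size=(n_units, 1))
    x = rho * alpha + rng.normal(size=(n_units, n_periods))
    y = alpha + TRUE_BETA * x + rng.normal(scale=0.5, size=(n_units, n_periods))

    # Pooled OLS: slope of y on x with one common intercept.
    xc, yc = x - x.mean(), y - y.mean()
    pols = float((xc * yc).sum() / (xc * xc).sum())

    # Fixed effects: demean within each unit (row), then regress.
    xw = x - x.mean(axis=1, keepdims=True)
    yw = y - y.mean(axis=1, keepdims=True)
    fe = float((xw * yw).sum() / (xw * xw).sum())
    return pols, fe


def monte_carlo(n_sims=100, seed=0, **kwargs):
    """Repeat the simulation and return arrays of POLS and FE estimates."""
    rng = np.random.default_rng(seed)
    draws = np.array([simulate_once(rng, **kwargs) for _ in range(n_sims)])
    return draws[:, 0], draws[:, 1]


pols_draws, fe_draws = monte_carlo(rho=0.5, sigma_alpha=1.0)
pols_bias = pols_draws.mean() - TRUE_BETA
fe_bias = fe_draws.mean() - TRUE_BETA

print(f"POLS bias: {pols_bias:+.3f}, spread: {pols_draws.std():.3f}")
print(f"FE bias:   {fe_bias:+.3f}, spread: {fe_draws.std():.3f}")
```

Calling `monte_carlo(rho=0.0)` switches off the correlation between the unit effect and the regressor, which is the case where POLS and FE should agree on average.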
The post's seven-method forest plot — interactively
These numbers come straight from basic_models_comparison.csv
and extended_models_comparison.csv in the post's folder.
Six methods on the basic model (union as the lone regressor) and four
methods with controls (age, schooling, female, year dummies). Toggle
outcomes and methods to compare. Hover any point for SE, 95% CI, and
sample size.
What to look for
- Two clear camps on "union (basic)". POLS/Between/RE land at 6–11%; FDFE/FE/CRE land at 21%. That factor-of-three gap is the empirical signature of selection on unobservables.
- CRE equals FE exactly (0.2103 to four decimals). Mundlak's algebra is exact: add the unit means and RE recovers the within coefficient on the original variable.
- Schooling is absorbed under TWFE. No row appears for TWFE on schooling because schooling has zero within-variation. The other three methods recover 0.111.
- The age coefficient flips sign under TWFE (+0.021 in POLS/RE, −0.058 in TWFE). With T = 2 and every worker aging by 2 years, age within an individual is collinear with the year dummy — a methodological artifact, not a real effect.
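The CRE-equals-FE result can be checked numerically. The sketch below uses simulated data and, for simplicity, pooled OLS on the Mundlak regression rather than the RE estimator the post uses; on a balanced panel both give the same coefficient on the original variable. All names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
n_units, n_periods = 300, 2

# Hypothetical balanced panel in which x is correlated with the unit effect.
alpha = rng.normal(size=(n_units, 1))
x = 0.8 * alpha + rng.normal(size=(n_units, n_periods))
y = alpha + 0.21 * x + rng.normal(scale=0.5, size=(n_units, n_periods))

# Within (FE) slope: demean each unit's rows, then regress.
xw = x - x.mean(axis=1, keepdims=True)
yw = y - y.mean(axis=1, keepdims=True)
beta_fe = float((xw * yw).sum() / (xw * xw).sum())

# Mundlak regression: y on a constant, x, and the unit mean of x.
x_bar = np.repeat(x.mean(axis=1, keepdims=True), n_periods, axis=1)
design = np.column_stack([np.ones(x.size), x.ravel(), x_bar.ravel()])
coefs, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
beta_cre = float(coefs[1])  # coefficient on the original x

print(f"FE:  {beta_fe:.6f}")
print(f"CRE: {beta_cre:.6f}")
```

The two numbers match to floating-point precision because, once the unit mean is in the regression, the only variation left in x is its within-unit deviation.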
Outcomes
Methods
Why does the union premium triple?
POLS asks: "how do union and non-union workers compare?" Its answer (7%) is biased if union members are systematically different from non-members on unobservables (ability, motivation, industry).
FE asks a sharper question: "what happens when the same worker switches union status?" Only the workers who actually changed union status between 2010 and 2012 contribute. The answer (21%) is what we would report if we trusted that nothing else changes for a worker over those two years.
The tripling is consistent with negative selection into unions: higher-ability workers are less likely to be in unions in this sample, so cross-sectional comparisons understate the within-worker payoff to joining a union.
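The direction of the bias follows from the textbook omitted-variable expression for a single regressor. The notation is an assumption for illustration: x_it is union status, alpha_i is the worker's unobserved effect, and the idiosyncratic error is taken to be uncorrelated with x_it.

```latex
\operatorname{plim}\,\hat{\beta}_{\mathrm{POLS}}
  = \beta + \frac{\operatorname{Cov}(x_{it}, \alpha_i)}{\operatorname{Var}(x_{it})}
```

A negative covariance pulls POLS below beta. If the FE estimate (0.21) is taken as beta, the POLS estimate (0.07) implies a bias term of roughly 0.07 − 0.21 = −0.14.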
Hausman vs Mundlak — when should you trust RE?
The Hausman statistic compares the FE and RE coefficient estimates,
weighted by the difference in their variances:
H = (beta_FE − beta_RE)² / (V_FE − V_RE).
Large H ⇒ FE and RE disagree ⇒ reject RE. The catch: when
within-variation is thin, V_FE is large and the test becomes mechanically
underpowered. The Mundlak alternative escapes this trap.
What to look for
- Snap to post values (β̂_FE = 0.2103, β̂_RE = 0.1092, SE_FE = 0.0812, SE_RE = 0.0299). H = 1.79, p = 0.18. The test fails to reject RE — but only because SE_FE is huge.
- Halve SE_FE to 0.04. Same coefficient gap, but now H jumps to about 14.5 and p falls to roughly 0.0001. The "fail to reject" verdict was a power problem, not a signal that RE is correct.
- Set β̂_FE = β̂_RE. H drops to 0 and p = 1. When the two estimators agree, there is no evidence against RE — which is exactly the test's logic.
- The Mundlak alternative (post §13) reaches p = 0.072 on the same data, telling a more nuanced story than Hausman's 0.18 because it does not pay the same noise penalty.
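The scalar formula above is simple enough to evaluate by hand or in a few lines of Python. This is a sketch of the single-coefficient case only; the function name is hypothetical, and returning `None` when V_FE − V_RE is not positive is an assumption about how to handle that case, not the app's documented behavior.

```python
import math


def hausman_scalar(beta_fe, beta_re, se_fe, se_re):
    """Scalar Hausman statistic and its chi-square(1) p-value.

    Returns (None, None) when V_FE - V_RE is not positive, since the
    statistic is undefined in that case.
    """
    var_gap = se_fe ** 2 - se_re ** 2
    if var_gap <= 0:
        return None, None
    h = (beta_fe - beta_re) ** 2 / var_gap
    # For one degree of freedom, P(chi2 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# The post's values, a tighter SE_FE, and the case where FE and RE agree.
h_post, p_post = hausman_scalar(0.2103, 0.1092, 0.0812, 0.0299)
h_tight, p_tight = hausman_scalar(0.2103, 0.1092, 0.04, 0.0299)
h_equal, p_equal = hausman_scalar(0.15, 0.15, 0.0812, 0.0299)

print(f"post values: H = {h_post:.2f}, p = {p_post:.2f}")
print(f"SE_FE = 0.04: H = {h_tight:.2f}, p = {p_tight:.5f}")
print(f"equal betas: H = {h_equal:.2f}, p = {p_equal:.2f}")
```

With the post's values this reproduces H of about 1.79 and p of about 0.18; the coefficient gap is identical in the first two calls, so any change in the verdict comes entirely from the denominator.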
Why does Mundlak beat Hausman in low-power settings?
Mundlak's specification adds x_bar (the unit mean of the
time-varying regressor) to an RE regression. The coefficient on
x_bar is a direct test of correlation between the unit
effect and the regressor — the very assumption RE relies on. Crucially,
Mundlak's standard error does not contain the V_FE − V_RE
term that makes Hausman noisy in thin-within-variation panels.
In the post's data, Hausman gives p = 0.18 (fail to reject); Mundlak's
t-stat on union_bar gives p = 0.072 (rejects at the 10% level, not at 5%).
Same data, same direction of effect, different precision.
For a practitioner, CRE/Mundlak is usually the right
specification to lead with: you get the FE coefficient on
time-varying treatments, the RE structure for keeping schooling and
gender, and a built-in spec test that beats Hausman.
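A Mundlak-style test can be sketched with NumPy alone. The version below is an illustration under stated assumptions: it runs pooled OLS on a constant, x, and x_bar (not the RE estimator the post uses), clusters standard errors by unit with a plain sandwich formula and no small-sample correction, and uses simulated data. The function name and parameter values are hypothetical.

```python
import numpy as np


def mundlak_test(x, y):
    """Regress y on [1, x, x_bar]; return the x_bar coefficient and t-stat.

    x and y are (units, periods) arrays. Standard errors are clustered by
    unit with a plain sandwich estimator (no small-sample correction).
    """
    n_units, n_periods = x.shape
    x_bar = np.repeat(x.mean(axis=1, keepdims=True), n_periods, axis=1)
    design = np.column_stack([np.ones(x.size), x.ravel(), x_bar.ravel()])
    target = y.ravel()

    bread = np.linalg.inv(design.T @ design)
    coefs = bread @ design.T @ target
    resid = target - design @ coefs

    # Sum the score contributions within each unit, then form the "meat".
    scores = (design * resid[:, None]).reshape(n_units, n_periods, -1).sum(axis=1)
    cov = bread @ (scores.T @ scores) @ bread

    gamma = float(coefs[2])
    return gamma, gamma / float(np.sqrt(cov[2, 2]))


rng = np.random.default_rng(7)
alpha = rng.normal(size=(1000, 1))
noise = rng.normal(scale=0.5, size=(1000, 2))
x_corr = 0.5 * alpha + rng.normal(size=(1000, 2))   # violates the RE assumption
x_indep = rng.normal(size=(1000, 2))                # satisfies it

gamma_corr, t_corr = mundlak_test(x_corr, alpha + 0.21 * x_corr + noise)
gamma_indep, t_indep = mundlak_test(x_indep, alpha + 0.21 * x_indep + noise)

print(f"correlated x:  gamma = {gamma_corr:.3f}, t = {t_corr:.1f}")
print(f"independent x: gamma = {gamma_indep:.3f}, t = {t_indep:.1f}")
```

A large t-statistic on x_bar signals that the unit effect is correlated with the regressor, which is the condition under which RE is inconsistent.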
Connecting back to Tab 2
The Tab 2 simulator lets you drive the FE-vs-POLS bias through the selection ρ slider. The Hausman test on that simulated data would reject RE if ρ is large enough, but only if you also have enough within-variation (high T) to make SE_FE small. Cranking ρ up while keeping T = 2 reproduces the post's "Hausman fails to reject but the FE-RE gap is huge" situation.