PyFixest — Interactive Lab

A pedagogical companion to High-Dimensional Fixed Effects Regression: An Introduction in Python

How fixed effects rewrite the data — and the union premium

Pooled OLS on the Vella–Verbeek wage panel says joining a union raises wages by 18.3%. One-way fixed effects on the same data says 7.8%. Over half of the apparent union premium reflects worker selection, not the within-worker effect of being in a union. PyFixest computes this correction with a one-line formula (lwage ~ union | nr) — but the mechanics deserve a closer look.
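A small aside on units, as a sketch rather than anything taken from the post: since the outcome in the formula is lwage, the 0.183 and 0.078 figures are log-point coefficients, and reading them as percentages is an approximation. The helper name below is hypothetical; the conversion itself is the standard exp(b) − 1.

```python
import math

def log_points_to_percent(coef: float) -> float:
    """Exact percentage change implied by a coefficient in a log-outcome model."""
    return (math.exp(coef) - 1.0) * 100.0

pooled_union = 0.183  # pooled OLS coefficient quoted in the text
fe_union = 0.078      # one-way FE coefficient quoted in the text

pooled_pct = log_points_to_percent(pooled_union)
fe_pct = log_points_to_percent(fe_union)

print(f"pooled OLS: {pooled_union * 100:.1f} log points = {pooled_pct:.1f}% exact")
print(f"one-way FE: {fe_union * 100:.1f} log points = {fe_pct:.1f}% exact")
```

The gap between the approximate and exact readings is larger for the pooled coefficient, but the qualitative story is unchanged.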

This app turns the post's central ideas into knobs you can move. Watch the within transformation demean three workers in real time. Simulate a panel with unit fixed effects and slide the selection knob to see pooled OLS drift away from the truth while FE stays close. Toggle the five Mincer-equation estimators in the forest plot. And drive a cluster-robust SE multiplier yourself to see how t-statistics shrink when you cluster honestly.

The within transformation, animated

Three workers, two periods each. Alice (steel) earns low wages and never joins a union. Bob (teal) switches from non-union to union between periods. Carla (orange) earns high wages and is always in a union. The dashed grey line is the pooled OLS fit: steep, because high-wage workers happen to be in unions and low-wage workers are not (positive selection). Wait a few seconds and the points slide as each worker's two-period mean is subtracted. After demeaning, only Bob retains any variation, and the FE slope (orange) is much flatter than pooled OLS suggested.

The animation cycles between raw and demeaned coordinates on a 10-second loop. Demeaning kills two of three workers (Alice and Carla collapse to the origin) because they never switched union status. Only switchers identify the FE coefficient — that is why the pyfixest post calls out the 484 / 545 workers who change occupation and the 24% of person-years that are union-covered: those are the "movers" the within estimator actually uses.
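The same three-worker story can be reproduced with a few lines of arithmetic. This is a minimal sketch: the log-wage numbers are made up for illustration and are not the values the animation uses.

```python
# Hypothetical two-period data for the three workers (illustrative values only).
workers = {
    "Alice": {"union": [0, 0], "lwage": [1.0, 1.0]},  # never in a union
    "Bob":   {"union": [0, 1], "lwage": [1.5, 1.6]},  # the only switcher
    "Carla": {"union": [1, 1], "lwage": [2.0, 2.0]},  # always in a union
}

def mean(values):
    return sum(values) / len(values)

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

raw_x, raw_y, dm_x, dm_y = [], [], [], []
for w in workers.values():
    mx, my = mean(w["union"]), mean(w["lwage"])
    raw_x += w["union"]
    raw_y += w["lwage"]
    # Within transformation: subtract each worker's own two-period mean.
    dm_x += [x - mx for x in w["union"]]
    dm_y += [y - my for y in w["lwage"]]

pooled_slope = slope(raw_x, raw_y)  # uses level differences between workers
fe_slope = slope(dm_x, dm_y)        # uses only Bob's within-worker change

print(f"pooled OLS slope: {pooled_slope:.2f}")
print(f"FE (within) slope: {fe_slope:.2f}")
```

Alice and Carla contribute only zeros after demeaning, so the within slope is determined entirely by Bob's change in wage when he joins the union.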

Between vs within: how much variation does FE have to work with?

For each variable in the Mincer wage panel (N = 545 workers; T = 8 years), we decompose the standard deviation into a between part (across workers) and a within part (across years inside a worker). One-way FE only uses the within slice.

Education has zero within-variation in this sample: every worker's education is constant across the eight years, so one-way FE mechanically drops it. Union and married have substantial within shares (45% and 48%), so the within estimator has plenty of "switcher" variation to identify the union and marriage premia. Log wage's within share is 48%; the rest of its variation is between workers, and absorbing that between-worker part with worker fixed effects is what drives the R² jump from 0.175 (pooled OLS) to 0.605 (one-way FE) the post reports in §11.4.
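A between/within split can be sketched on a toy balanced panel as follows. The function name and the toy values are hypothetical, and the share here is defined on variances, which is one common convention; the post may compute its shares differently (for example from standard deviations).

```python
import numpy as np

def between_within(values: np.ndarray) -> dict:
    """Split a balanced (N workers x T years) variable into between and within parts."""
    unit_means = values.mean(axis=1, keepdims=True)
    between_var = float(np.mean((unit_means - values.mean()) ** 2))
    within_var = float(np.mean((values - unit_means) ** 2))
    total_var = between_var + within_var
    return {
        "between_sd": between_var ** 0.5,
        "within_sd": within_var ** 0.5,
        "within_share": within_var / total_var if total_var > 0 else 0.0,
    }

# Toy panel: 3 workers, 4 years (illustrative values, not the Mincer data).
educ = np.array([[12.0] * 4, [16.0] * 4, [10.0] * 4])             # constant per worker
union = np.array([[0, 0, 0, 0], [0, 0, 1, 1], [1, 1, 1, 1]], float)  # one switcher

shares = {"educ": between_within(educ), "union": between_within(union)}
for name, parts in shares.items():
    print(name, {k: round(v, 3) for k, v in parts.items()})
```

A variable that never changes inside a worker, like educ here, has a within SD of exactly zero, which is why one-way FE cannot estimate its coefficient.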

Tab 2

Panel FE Simulator

Simulate a panel with unit fixed effects and a treatment correlated with the unobserved worker effect. Slide the selection ρ and watch pooled OLS drift away from the truth while FE stays put. Run 100 simulations for a bias-variance picture.

Tab 3

Mincer Forest Plot

The post's headline figure, interactively. Toggle variables and methods. POLS says union = +18%; FE, TWFE and CRE all put it between +7% and +8%. CRE recovers education (+9.4%), the variable one-way FE silently dropped.

Tab 4

Clustered SE Showdown

Watch t-statistics shrink as you switch from iid to HC1 to CRV1 (cluster-robust). For the synthetic dataset, the SE on X1 grows by roughly 40% under CRV1 and by 45 to 50% under CRV3; for weaker effects this would flip significance.

Glossary (open a card if a term is unfamiliar)

Fixed effects (FE)
A unit-specific intercept (one per worker / firm / country) absorbed via demeaning. Identifies β from within-unit changes only. Drops time-invariant regressors like education or race.
Within transformation
Subtract each unit's time-average from every observation: ÿᵢₜ = yᵢₜ − ȳᵢ. Algebraically equivalent to including N unit dummies, but vastly faster.
Two-way FE (TWFE)
Absorb both unit effects (worker) and time effects (year). The post's R² climbs from 0.175 (pooled) → 0.605 (one-way) → 0.631 (TWFE) on the Mincer panel.
Three-way FE
Add a third absorbed dimension (e.g., occupation). PyFixest handles this with | nr + year + C(occupation). R² barely moves (0.631 → 0.632) once worker FE is in.
Time-invariant variables
Variables that never change within a unit (education, race). Their within-transformed values are exactly zero, so they are perfectly collinear with the entity dummies and dropped.
Cluster-robust SE (CRV1)
Standard errors that allow within-cluster correlation. Default in panel work because errors travel together within a worker / state / firm. Often 30–50% larger than iid SEs.
CRE / Mundlak
Augment a pooled regression with each unit's mean of the time-varying regressors. The time-varying coefficients equal FE; the time-invariant coefficients (education, race) become identifiable.
Event study
Include a dummy for each relative-time period around a treatment date, omitting period −1 as the reference. Pre-treatment coefficients diagnose parallel trends; post-treatment coefficients trace the dynamic ATT.
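The glossary's claim that the within transformation is algebraically equivalent to including N unit dummies can be checked numerically. The sketch below uses plain NumPy on simulated data (all names and parameter values are illustrative assumptions); it does not show how PyFixest itself performs the demeaning.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 4
unit = np.repeat(np.arange(n_units), n_periods)   # unit id for every row

alpha = rng.normal(size=n_units)[unit]            # unit-specific intercept
x = 0.5 * alpha + rng.normal(size=unit.size)      # regressor correlated with alpha
y = alpha + 0.10 * x + rng.normal(scale=0.5, size=unit.size)

def demean(v):
    """Subtract each unit's own mean (the within transformation)."""
    unit_means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - unit_means[unit]

# Route 1: regress demeaned y on demeaned x.
x_dm, y_dm = demean(x), demean(y)
beta_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Route 2: least squares with one dummy column per unit (LSDV).
dummies = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
design = np.column_stack([x, dummies])
beta_lsdv = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"within: {beta_within:.6f}  LSDV: {beta_lsdv:.6f}")
```

The two slopes agree up to floating-point error, but the dummy route builds a design matrix with one column per unit, which is what becomes expensive when N is large.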

Panel FE Simulator — pooled OLS vs FE on simulated data

Simulate a panel with N workers observed over T periods. Each worker has their own intercept αᵢ (unobserved ability) and a treatment xᵢₜ that is correlated with that intercept, so pooled OLS will be biased. The true treatment coefficient is β = 0.10 (a 10% wage effect, in the spirit of the Mincer panel's union premium). Slide selection ρ up and watch pooled OLS drift while FE stays close to 0.10.

More workers ⇒ tighter estimates. Post sample: 545.
Post uses T = 8. More T ⇒ richer within-worker variation.
Spread of worker intercepts. 0 = identical workers (pooled OLS unbiased), 2 = wildly heterogeneous (pooled OLS very biased).
Correlation between unit FE and treatment x. The omitted-variable bias driver — the reason the union premium drops from 18% to 8% in the post.
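A data-generating process with these four knobs might look like the sketch below. The app's actual DGP is not shown on this page, so the functional form, the noise_sd parameter, and all function names are assumptions made for illustration.

```python
import numpy as np

def simulate_panel(n_workers=545, n_periods=8, sigma_alpha=1.0, rho=0.5,
                   beta=0.10, noise_sd=0.5, seed=0):
    """Draw one balanced panel; corr(alpha_i, x_it) equals rho in population."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(n_workers, 1))        # standardized worker effect
    alpha = sigma_alpha * z                    # worker intercept
    x = rho * z + np.sqrt(1.0 - rho**2) * rng.normal(size=(n_workers, n_periods))
    y = alpha + beta * x + noise_sd * rng.normal(size=(n_workers, n_periods))
    return x, y

def pooled_ols(x, y):
    """Slope from regressing y on x over all rows, ignoring the panel structure."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / (xc ** 2).sum())

def one_way_fe(x, y):
    """Slope after subtracting each worker's own mean from x and y."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return float((xd * yd).sum() / (xd ** 2).sum())

results = {}
for rho in (0.0, 0.5, 0.9):
    x, y = simulate_panel(rho=rho)
    results[rho] = (pooled_ols(x, y), one_way_fe(x, y))
    print(f"rho={rho:.1f}  pooled={results[rho][0]:.3f}  FE={results[rho][1]:.3f}")
```

Under this particular DGP the pooled OLS bias is approximately rho × sigma_alpha, so both the selection slider and the spread slider scale it, while the FE estimate does not depend on either.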

Pooled OLS

Ignores panel structure. Each row treated as independent.

β̂
bias (vs true 0.10)

One-way FE

Worker demeaning. Within-only variation.

β̂
bias (vs true 0.10)
within R²

What to look for

  • Slide selection ρ to 0. Pooled OLS becomes unbiased, so pooled OLS and FE agree. This is the RE assumption: no correlation between αᵢ and xᵢₜ.
  • Slide selection ρ to 0.9. Pooled OLS shoots above 0.30 because high-α workers also have high x. The bias is precisely the §11.4 union-selection story made visible: workers who join unions more often are systematically different.
  • Slide T from 2 to 8. FE's within R² rises; the FE estimate's variance shrinks. With T = 8 (the post's choice), FE has plenty of "switchers" to identify β.
  • True coefficient 0.10 is the dashed grey line. Pooled OLS scatters above the truth; FE clusters near it. This is the canonical fingerprint of positive selection on unobservables.

Bias vs variance across 100 simulations

One draw is noisy. Run the panel simulation 100 times with the current sliders to see whether pooled OLS bias is systematic and how variance compares to FE.
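A repeated-draw experiment of this kind can be sketched as follows. The DGP and its parameter values are illustrative assumptions (the app's own settings come from the sliders), and the panel is kept small so the loop runs quickly.

```python
import numpy as np

TRUE_BETA = 0.10

def one_draw(rng, n_workers=200, n_periods=8, sigma_alpha=1.0, rho=0.5):
    """Simulate one panel and return (pooled OLS slope, one-way FE slope)."""
    z = rng.normal(size=(n_workers, 1))
    x = rho * z + np.sqrt(1.0 - rho**2) * rng.normal(size=(n_workers, n_periods))
    y = sigma_alpha * z + TRUE_BETA * x + 0.5 * rng.normal(size=(n_workers, n_periods))
    xc, yc = x - x.mean(), y - y.mean()                      # pooled: grand-mean centering
    xd = x - x.mean(axis=1, keepdims=True)                   # FE: worker-mean centering
    yd = y - y.mean(axis=1, keepdims=True)
    return (xc * yc).sum() / (xc ** 2).sum(), (xd * yd).sum() / (xd ** 2).sum()

rng = np.random.default_rng(42)
draws = np.array([one_draw(rng) for _ in range(100)])

summary = {
    name: {"bias": float(est.mean() - TRUE_BETA), "sd": float(est.std(ddof=1))}
    for name, est in (("pooled", draws[:, 0]), ("fe", draws[:, 1]))
}
for name, stats in summary.items():
    print(f"{name:>6}: bias={stats['bias']:+.3f}  sd={stats['sd']:.3f}")
```

Averaging over draws separates the two problems: the pooled OLS error does not shrink toward zero as draws accumulate, while the FE error does.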

The Mincer wage panel — every estimator on one plot

These numbers come straight from §11 of the post — the Vella–Verbeek wage panel of 545 young men observed for 8 years (1980–1987). Five estimators on each of the seven Mincer regressors (union, married, expersq, hours, educ, black, hisp). Toggle to compare.

What to look for

  • The union premium drops by more than half. Pooled OLS says 0.183. One-way FE (clustered) says 0.078. TWFE: 0.073. Three-way FE: 0.075. The §11.4 selection story is visible at a glance.
  • One-way FE silently drops education, black, and hisp. No bars appear for the FE methods on those rows, because the within transformation gives identically-zero columns for any worker-constant variable.
  • CRE recovers what one-way FE lost. The teal CRE bar gives education = 0.094 (a 9.4% return per year), black = −0.140 (14% Black wage gap), and the time-varying coefficients still match one-way FE. This is the §11.7 punchline.
  • Marriage premium is more robust to FE than union. Pooled 0.141 → FE 0.115 is only an 18% drop, versus a 57% drop for the union premium. Marriage is less correlated with unobserved ability.
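The percentage drops quoted in the list above follow directly from the reported coefficients; a short check (variable names are illustrative):

```python
# Coefficients as quoted on this page (pooled OLS vs one-way FE).
estimates = {
    "union":   {"pooled": 0.183, "fe": 0.078},
    "married": {"pooled": 0.141, "fe": 0.115},
}

# Share of the pooled coefficient that disappears once worker FE are absorbed.
drops = {
    name: (est["pooled"] - est["fe"]) / est["pooled"]
    for name, est in estimates.items()
}

for name, drop in drops.items():
    print(f"{name}: {drop:.0%} of the pooled coefficient is removed by FE")
```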

Variables

Methods

Why does the union premium drop, but the marriage premium barely move?

The drop in a coefficient when you go from pooled OLS to one-way FE measures how correlated the regressor is with unobserved worker heterogeneity (ability, motivation, industry tenure). For union the drop is large (0.183 → 0.078) — union members are systematically higher-paid independent of being in a union, perhaps because the kind of person who joins a union also tends to be employed in higher-wage industries. For married the drop is small (0.141 → 0.115) — the marriage premium is mostly a within-worker effect.

The CRE row makes this concrete. The Mundlak correction term on union_mean is +0.179 (highly significant, §11.7) — that is the cross-sectional selection effect that pooled OLS folds into the within effect, producing the inflated 0.183. The CRE married_mean coefficient is small and insignificant, so the marriage premium is barely affected by selection.
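The Mundlak device can be illustrated on simulated data with plain NumPy. Everything below (variable names, coefficient values, the way union status is generated) is a hypothetical setup chosen to mimic the story, not the post's data or its PyFixest code.

```python
import numpy as np

rng = np.random.default_rng(1)
n_workers, n_periods = 300, 8
ones_t = np.ones((1, n_periods))

ability = rng.normal(size=(n_workers, 1))                          # unobserved
educ = np.round(12 + 2 * rng.normal(size=(n_workers, 1))) * ones_t  # time-invariant
# Union status is more likely for high-ability workers (selection).
union = (0.8 * ability + rng.normal(size=(n_workers, n_periods)) > 0.5).astype(float)
lwage = (0.09 * educ + 0.08 * union + 0.3 * ability
         + 0.3 * rng.normal(size=(n_workers, n_periods)))

# One-way FE: demean within worker, which wipes out educ entirely.
union_mean = union.mean(axis=1, keepdims=True) * ones_t
u_dm = union - union_mean
y_dm = lwage - lwage.mean(axis=1, keepdims=True)
beta_fe = (u_dm * y_dm).sum() / (u_dm ** 2).sum()

# CRE / Mundlak: pooled regression that adds the worker mean of union.
X = np.column_stack([np.ones(lwage.size), union.ravel(),
                     union_mean.ravel(), educ.ravel()])
const, beta_cre, gamma_union_mean, beta_educ = np.linalg.lstsq(
    X, lwage.ravel(), rcond=None)[0]

print(f"FE union:  {beta_fe:.4f}")
print(f"CRE union: {beta_cre:.4f}   union_mean: {gamma_union_mean:.4f}   educ: {beta_educ:.4f}")
```

The CRE coefficient on union reproduces the FE estimate, the union_mean term soaks up the selection, and educ gets a coefficient because nothing has been demeaned away.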

Clustered SE Showdown — same coefficient, different t-statistic

Point estimates do not care about the SE method; inference does. Five SE assumptions on the same model (§8 of the post). When you cluster by group, the SE inflates because within-group errors travel together — repeated observations of the same worker are not independent observations. The t-statistic on X1 falls from 12.2 (HC1) to 8.2 (CRV3) — a 33% reduction, with no change in the point estimate.

PyFixest reports (from §8)

Hover any row to see the SE inflation factor relative to iid.

What to look for

  • iid → HC1 barely moves the SE (0.0858 → 0.0833). Heteroskedasticity is not a major concern in the synthetic dataset; the robust SE can even come out slightly below the iid SE, as it does here.
  • CRV1 (group_id) jumps the SE 40% (0.0833 → 0.1172). The big shift comes from accounting for within-group correlation — observations in the same group share unobserved shocks.
  • Multi-way CRV1 (group_id + f2) inflates further (0.1207). The two-way cluster correction adds protection against correlation in both dimensions simultaneously.
  • CRV3 is the most conservative (0.1247). Its finite-sample bias correction is preferred when the number of clusters is small.
Post estimate: −1.019. Vary to see how t-statistics behave for stronger/weaker effects.
Baseline (assumes independence). Post: 0.0858.
Multiplier applied to iid SE to get clustered SE. Post CRV1: about 1.37 relative to iid (= 0.1172 / 0.0858), or 1.41 relative to HC1 (= 0.1172 / 0.0833).
iid t-stat: |β̂| / iid SE
Clustered t-stat: |β̂| / (mult × iid SE)
iid significant at 5%? |t| > 1.96
Clustered significant at 5%? |t| > 1.96
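The readouts above are simple ratios, so the numbers quoted on this page can be recomputed by hand. The dictionary keys are just labels for this sketch; the point estimate and standard errors are the ones reported in the text.

```python
beta_hat = -1.019  # point estimate on X1 quoted above

# Standard errors quoted in the bullet list, by variance estimator.
reported_se = {
    "iid": 0.0858,
    "HC1": 0.0833,
    "CRV1 (group_id)": 0.1172,
    "CRV1 (group_id + f2)": 0.1207,
    "CRV3": 0.1247,
}

rows = {}
for name, se in reported_se.items():
    t_stat = abs(beta_hat) / se
    rows[name] = {
        "t": t_stat,
        "inflation_vs_iid": se / reported_se["iid"],
        "significant_5pct": t_stat > 1.96,
    }
    print(f"{name:<22} SE={se:.4f}  t={t_stat:5.2f}  x{rows[name]['inflation_vs_iid']:.2f} vs iid")
```

With an effect this strong every estimator still rejects at 5%; rerunning the loop with a smaller beta_hat shows where the clustered rows stop being significant first.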

Why clustering inflates SEs

If a worker's residuals are correlated across years (motivation persists), then ten years of one worker is not ten independent observations — it is closer to one or two effective observations. The iid formula counts ten observations; CRV1 corrects this by treating each cluster (worker) as the sampling unit. Cameron and Miller (2015, §2) gives the closed-form derivation; the post applies it in §8 and §11.4 with worker-level clustering (vcov={"CRV1": "nr"}).
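To make the mechanics concrete, here is a hand-rolled sketch of the iid and CRV1 variance formulas for a single-regressor OLS on simulated data. The simulated design and the small-sample factor G/(G−1) × (N−1)/(N−K) are assumptions of this sketch (the latter is a common convention); PyFixest's own defaults may differ in detail.

```python
import numpy as np

def ols_iid_and_crv1(x, y, cluster):
    """OLS of y on [1, x]; returns coefficients, iid SEs and CRV1 SEs."""
    n = y.size
    X = np.column_stack([np.ones(n), x])
    k = X.shape[1]
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta

    # iid: one common error variance, every row treated as independent.
    se_iid = np.sqrt(np.diag(resid @ resid / (n - k) * xtx_inv))

    # CRV1: sum the score X_g' u_g within each cluster before squaring.
    groups = np.unique(cluster)
    meat = np.zeros((k, k))
    for g in groups:
        rows = cluster == g
        score = X[rows].T @ resid[rows]
        meat += np.outer(score, score)
    adj = groups.size / (groups.size - 1) * (n - 1) / (n - k)
    se_crv1 = np.sqrt(np.diag(adj * xtx_inv @ meat @ xtx_inv))
    return beta, se_iid, se_crv1

rng = np.random.default_rng(7)
n_workers, n_years = 200, 8
worker = np.repeat(np.arange(n_workers), n_years)
# Both the regressor and the error carry a persistent worker-level component.
x = rng.normal(size=n_workers)[worker] + 0.5 * rng.normal(size=worker.size)
u = rng.normal(size=n_workers)[worker] + rng.normal(size=worker.size)
y = 1.0 + 0.10 * x + u

beta, se_iid, se_crv1 = ols_iid_and_crv1(x, y, worker)
print(f"slope={beta[1]:.3f}  iid SE={se_iid[1]:.4f}  CRV1 SE={se_crv1[1]:.4f}")
```

Because residuals are summed inside each worker before they are squared, positively correlated residuals reinforce each other and the clustered variance comes out larger than the iid one.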

The practical rule is to cluster at the level where treatment is assigned. For the Mincer panel that is the worker; for a DiD with state-level adoption it would be the state. Failing to cluster almost always understates SEs and falsely declares effects significant — a major source of replication failures in applied economics.

Connecting back to Tab 2

The Tab 2 simulator estimates panel FE coefficients without computing SEs, but clustered SEs there would behave much like the slider here. If you ran Tab 2 with T = 8 and clustered SEs by worker, the clustered SE would typically come out larger than the iid SE whenever residuals are correlated within a worker; the post's synthetic example shows a ratio of roughly 1.4×. The cluster correction does not depend on the point estimate: it depends on how strongly the residuals (and the regressor) are correlated within clusters. As a rough approximation, inflating the iid SE by the factor implied by the cluster-effective sample size gets you close to CRV1.
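One textbook way to quantify this shrinkage is the Moulton approximation for equal-sized clusters, where the variance inflation is 1 + (T − 1) × ρx × ρu, with ρx and ρu the intra-cluster correlations of the regressor and the residual. Whether the app's slider is built on this formula is not stated, so treat the sketch below (and its function names) as an assumption-laden illustration.

```python
import math

def moulton_se_multiplier(cluster_size: int, icc_x: float, icc_resid: float) -> float:
    """Approximate ratio of clustered SE to iid SE for equal-sized clusters."""
    return math.sqrt(1.0 + (cluster_size - 1) * icc_x * icc_resid)

def implied_icc_product(cluster_size: int, multiplier: float) -> float:
    """Invert the approximation: which icc_x * icc_resid yields this SE multiplier?"""
    return (multiplier ** 2 - 1.0) / (cluster_size - 1)

T = 8  # years per worker, as in the post's panel
needed = implied_icc_product(T, 1.4)
print(f"A 1.4x multiplier with T={T} needs icc_x * icc_resid of about {needed:.3f}")
print(f"Round trip: {moulton_se_multiplier(T, 1.0, needed):.2f}x")
print(f"No within-cluster correlation: {moulton_se_multiplier(T, 0.0, 0.5):.2f}x")
```

Even a modest intra-cluster correlation is enough to produce a 1.4× multiplier when each worker contributes eight rows.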