High-Dimensional Fixed Effects (PyFixest)

How fixed effects rewrite the data — and the union premium

Pooled OLS on the Vella–Verbeek wage panel says joining a union raises wages by 18.3%. One-way fixed effects on the same data says 7.8%. Over half of the apparent union premium reflects worker selection, not the within-worker effect of being in a union. PyFixest computes this correction with a one-line formula (lwage ~ union | nr) — but the mechanics deserve a closer look.

This app turns the post's central ideas into knobs you can move. Watch the within transformation demean three workers in real time. Simulate a panel with unit fixed effects and slide the selection knob to see pooled OLS drift away from the truth while FE stays close. Toggle the five Mincer-equation estimators in the forest plot. And drive a cluster-robust SE multiplier yourself to see how t-statistics shrink when you cluster honestly.

The within transformation, animated

Three workers, two periods each. Alice (steel) earns low wages and never joins a union. Bob (teal) switches from non-union to union between periods. Carla (orange) earns high wages and is always in a union. The dashed grey line is the pooled OLS fit: steep, because high-wage workers happen to be in unions and low-wage workers are not (positive selection). Wait a few seconds: the points slide as each worker's two-period mean is subtracted. After demeaning, only Bob still shows variation in union status, and the FE slope (orange) is much flatter than pooled OLS suggested.

The animation cycles between raw and demeaned coordinates on a 10-second loop. Demeaning removes all usable variation for two of the three workers (Alice and Carla collapse to the origin) because they never switched union status. Only switchers identify the FE coefficient. That is why the PyFixest post calls out the 484 / 545 workers who change occupation and the 24% of person-years that are union-covered: those are the "movers" the within estimator actually uses.
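The three-worker picture can be reproduced in a few lines of plain Python. This is a minimal sketch: the wage numbers are hypothetical, chosen only to mimic the animation (they are not from the post), and `demean` and `slope` are illustrative helpers rather than PyFixest functions.

```python
# Hypothetical two-period data for the three workers (not from the post).
workers = {
    "Alice": {"union": [0, 0], "lwage": [1.00, 1.00]},  # never union, low wage
    "Bob":   {"union": [0, 1], "lwage": [1.45, 1.55]},  # the only switcher
    "Carla": {"union": [1, 1], "lwage": [2.00, 2.00]},  # always union, high wage
}


def demean(values):
    """Subtract the worker's own time-average from each observation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


raw_x = [u for w in workers.values() for u in w["union"]]
raw_y = [v for w in workers.values() for v in w["lwage"]]
within_x = [u for w in workers.values() for u in demean(w["union"])]
within_y = [v for w in workers.values() for v in demean(w["lwage"])]

pooled_slope = slope(raw_x, raw_y)    # mixes selection with the union effect
fe_slope = slope(within_x, within_y)  # identified by Bob alone

print(f"pooled OLS slope:  {pooled_slope:.2f}")
print(f"FE (within) slope: {fe_slope:.2f}")
```

With these made-up numbers the pooled slope is 0.70 while the within slope is 0.10: Alice and Carla contribute nothing once their own means are removed.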

Between vs within: how much variation does FE have to work with?

For each variable in the Mincer wage panel (N = 545 workers; T = 8 years), we decompose the standard deviation into a between part (across workers) and a within part (across years inside a worker). One-way FE only uses the within slice.

Education has zero within-variation in this sample: every worker's education is constant across the eight years, so one-way FE mechanically drops it. Union and married have substantial within shares (45% and 48%), so the within estimator has plenty of "switcher" variation to identify the union and marriage premia. Log wage's within share is 48%; the remaining between-worker variation is what the worker fixed effects absorb, which is why R² jumps from 0.175 (pooled OLS) to 0.605 (one-way FE) in §11.4 of the post.
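The decomposition itself is simple to sketch. The example below assumes a balanced panel and defines the within share on variances (between variance plus within variance equals total variance); the post may define its shares slightly differently, and the toy panels are hypothetical.

```python
from statistics import fmean, pstdev


def between_within(panel):
    """Return (between SD, within SD, within share of variance).

    `panel` maps a unit id to that unit's values over time (balanced panel).
    """
    unit_means = [fmean(values) for values in panel.values()]
    deviations = [v - fmean(values) for values in panel.values() for v in values]
    between_sd = pstdev(unit_means)  # spread of the unit averages
    within_sd = pstdev(deviations)   # spread around each unit's own average
    total_var = between_sd ** 2 + within_sd ** 2
    within_share = within_sd ** 2 / total_var if total_var > 0 else 0.0
    return between_sd, within_sd, within_share


# Hypothetical toy panels: three workers, three years each.
education = {1: [12, 12, 12], 2: [16, 16, 16], 3: [10, 10, 10]}
union = {1: [0, 0, 0], 2: [0, 1, 1], 3: [1, 1, 1]}

for name, panel in [("education", education), ("union", union)]:
    b, w, share = between_within(panel)
    print(f"{name}: between SD={b:.3f}, within SD={w:.3f}, within share={share:.0%}")
```

A variable like education, which never changes inside a worker, comes out with a within SD of exactly zero, which is the mechanical reason one-way FE cannot estimate its coefficient.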

Tab 2

Panel FE Simulator

Simulate a panel with unit fixed effects and a treatment correlated with the unobserved worker effect. Slide the selection ρ and watch pooled OLS drift away from the truth while FE stays put. Run 100 simulations for a bias-variance picture.

Tab 3

Mincer Forest Plot

The post's headline figure, interactively. Toggle variables and methods. POLS says union = +18%; FE, TWFE, and CRE all land between roughly +7.5% and +7.8%. CRE recovers education (+9.4%), the variable one-way FE silently dropped.

Tab 4

Clustered SE Showdown

Watch t-statistics shrink as you switch from iid to HC1 to CRV1 (cluster-robust). For the synthetic dataset, the SE on X1 grows by about 40% from HC1 to CRV1 and by about 50% from HC1 to CRV3; for weaker effects this would flip significance.

Glossary (open a card if a term is unfamiliar)

Fixed effects (FE)

A unit-specific intercept (one per worker / firm / country) absorbed via demeaning. Identifies β from within-unit changes only. Drops time-invariant regressors like education or race.

Within transformation

Subtract each unit's time-average from every observation: ÿ_it = y_it − ȳ_i. Algebraically equivalent to including N unit dummies, but vastly faster.
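The equivalence between demeaning and unit dummies can be checked numerically. The sketch below uses NumPy on simulated data; the data-generating process and the helper `demean_by` are assumptions for illustration, not part of the post or of PyFixest.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 4
unit = np.repeat(np.arange(n_units), n_periods)    # unit id for every row

alpha = rng.normal(size=n_units)[unit]             # unit fixed effect
x = 0.5 * alpha + rng.normal(size=unit.size)       # regressor correlated with alpha
y = 0.10 * x + alpha + rng.normal(scale=0.3, size=unit.size)


def demean_by(values, groups):
    """Subtract each group's mean from its own observations."""
    means = np.bincount(groups, weights=values) / np.bincount(groups)
    return values - means[groups]


# Route 1: within transformation, then a one-regressor OLS.
x_w, y_w = demean_by(x, unit), demean_by(y, unit)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Route 2: least squares on x plus one dummy column per unit (LSDV).
dummies = (unit[:, None] == np.arange(n_units)[None, :]).astype(float)
design = np.column_stack([x, dummies])
beta_lsdv = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(f"within: {beta_within:.6f}   LSDV: {beta_lsdv:.6f}")
```

Both routes return the same coefficient up to floating-point error, but the dummy route builds a design matrix with one column per unit, which is what becomes expensive when N is large.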

Two-way FE (TWFE)

Absorb both unit effects (worker) and time effects (year). The post's R² climbs from 0.175 (pooled) → 0.605 (one-way) → 0.631 (TWFE) on the Mincer panel.

Three-way FE

Add a third absorbed dimension (e.g., occupation). PyFixest handles this with | nr + year + C(occupation). R² barely moves (0.631 → 0.632) once worker FE is in.

Time-invariant variables

Variables that never change within a unit (education, race). Their within-transformed values are exactly zero, so they are perfectly collinear with the entity dummies and dropped.

Cluster-robust SE (CRV1)

Standard errors that allow within-cluster correlation. Default in panel work because errors travel together within a worker / state / firm. Often 30–50% larger than iid SEs.

CRE / Mundlak

Augment a pooled regression with each unit's mean of the time-varying regressors. The time-varying coefficients equal FE; the time-invariant coefficients (education, race) become identifiable.
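A small simulation illustrates the Mundlak device. This is a sketch under an assumed data-generating process (one time-varying regressor `x`, one time-invariant regressor `educ`, both hypothetical); it uses NumPy least squares rather than PyFixest.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 200, 5
unit = np.repeat(np.arange(n_units), n_periods)

educ = rng.integers(8, 17, size=n_units).astype(float)[unit]  # time-invariant
alpha = rng.normal(size=n_units)[unit]                        # unobserved effect
x = 0.6 * alpha + rng.normal(size=unit.size)                  # time-varying
y = 0.10 * x + 0.09 * educ + alpha + rng.normal(scale=0.3, size=unit.size)

counts = np.bincount(unit)
x_bar = (np.bincount(unit, weights=x) / counts)[unit]  # unit mean of x
y_bar = (np.bincount(unit, weights=y) / counts)[unit]

# One-way FE via the within transformation (educ drops out entirely).
x_w, y_w = x - x_bar, y - y_bar
beta_fe = (x_w @ y_w) / (x_w @ x_w)

# CRE / Mundlak: pooled regression on x, the unit mean of x, and educ.
design = np.column_stack([np.ones_like(x), x, x_bar, educ])
coef = np.linalg.lstsq(design, y, rcond=None)[0]
beta_cre, beta_educ = coef[1], coef[3]

print(f"FE beta:  {beta_fe:.6f}")
print(f"CRE beta: {beta_cre:.6f}")
print(f"CRE education coefficient: {beta_educ:.3f}")
```

The coefficient on `x` matches the FE estimate exactly, while the pooled structure keeps `educ` in the model. The education coefficient is only credible if the unit mean of `x` captures the part of the worker effect that is correlated with the regressors, which is the assumption CRE adds.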

Event study

Include a dummy for each relative-time period around the treatment date, with period −1 as the omitted reference. Pre-treatment coefficients diagnose parallel trends; post-treatment coefficients trace the dynamic ATT.

Panel FE Simulator — pooled OLS vs FE on simulated data

Simulate a panel with N workers observed over T periods. Each worker has their own intercept α_i (unobserved ability) and a treatment x_it that is correlated with that intercept — so pooled OLS will be biased. The true treatment coefficient is β = 0.10 (a 10% wage effect, in the spirit of the Mincer panel's union premium). Slide selection ρ up and watch pooled OLS drift while FE stays close to 0.10.

Workers N 200

More workers ⇒ tighter estimates. Post sample: 545.

Periods T 4

Post uses T = 8. More T ⇒ richer within-worker variation.

Unit heterogeneity 1.00

Spread of worker intercepts. 0 = identical workers (pooled OLS unbiased), 2 = wildly heterogeneous (pooled OLS very biased).

Selection ρ 0.70

Correlation between unit FE and treatment x. The omitted-variable bias driver — the reason the union premium drops from 18% to 8% in the post.
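The simulator's logic can be sketched in plain Python. The app's internal data-generating process is not shown in the text, so the one below is an assumption: a standardised worker effect scaled by the heterogeneity slider, a treatment whose correlation with that effect equals ρ, and a hypothetical `noise_sd` parameter for the idiosyncratic error.

```python
import random


def simulate_panel(n_workers=200, n_periods=4, heterogeneity=1.0,
                   rho=0.7, beta=0.10, noise_sd=0.5, seed=0):
    """Return a list of (worker, x, y) rows from the assumed DGP."""
    rng = random.Random(seed)
    rows = []
    for worker in range(n_workers):
        ability = rng.gauss(0, 1)            # standardised worker effect
        alpha = heterogeneity * ability      # worker intercept
        for _ in range(n_periods):
            # corr(x, ability) equals rho by construction
            x = rho * ability + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
            y = beta * x + alpha + rng.gauss(0, noise_sd)
            rows.append((worker, x, y))
    return rows


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def pooled_and_fe(rows):
    """Pooled OLS slope and one-way FE (within) slope."""
    sums = {}
    for worker, xi, yi in rows:
        sx, sy, n = sums.get(worker, (0.0, 0.0, 0))
        sums[worker] = (sx + xi, sy + yi, n + 1)
    x = [xi for _, xi, _ in rows]
    y = [yi for _, _, yi in rows]
    x_within = [xi - sums[w][0] / sums[w][2] for w, xi, _ in rows]
    y_within = [yi - sums[w][1] / sums[w][2] for w, _, yi in rows]
    return slope(x, y), slope(x_within, y_within)


pooled, fe = pooled_and_fe(simulate_panel())
print(f"pooled OLS: {pooled:.3f}   one-way FE: {fe:.3f}   truth: 0.100")
```

Under this assumed process the pooled estimate converges to β + ρ × heterogeneity, so the size of the bias will not match the app's numbers; the direction and the contrast with FE are the point. Setting `rho` to exactly 1 removes all within-worker variation in x, and the FE slope is then undefined.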

Pooled OLS

Ignores panel structure. Each row treated as independent.

Reports: β̂, bias (vs true 0.10), R².

One-way FE

Worker demeaning. Within-only variation.

Reports: β̂, bias (vs true 0.10), within R².

What to look for

Slide selection ρ to 0. Pooled OLS becomes unbiased — pooled OLS and FE agree. This is the RE assumption: no correlation between α_i and x_it.
Slide selection ρ to 0.9. Pooled OLS shoots above 0.30 because high-α workers also have high x. The bias is precisely the §11.4 union-selection story made visible: workers who join unions more often are systematically different.
Slide T from 2 to 8. FE's within R² rises; the FE estimate's variance shrinks. With T = 8 (the post's choice), FE has plenty of "switchers" to identify β.
True coefficient 0.10 is the dashed grey line. Pooled OLS scatters above the truth; FE clusters near it. This is the canonical fingerprint of positive selection on unobservables.

Bias vs variance across 100 simulations

One draw is noisy. Run the panel simulation 100 times with the current sliders to see whether pooled OLS bias is systematic and how variance compares to FE.
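A repeated-draw experiment of this kind might look like the sketch below. It assumes a simple data-generating process (the app's own is not documented here), with hypothetical parameter names that mirror the sliders, and summarises each estimator by its mean, bias, and standard deviation across 100 seeds.

```python
import random
from statistics import fmean, stdev


def slope(x, y):
    """OLS slope of y on x with an intercept."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def one_draw(seed, n_workers=200, n_periods=4, heterogeneity=1.0,
             rho=0.7, beta=0.10, noise_sd=0.5):
    """Simulate one panel and return (pooled OLS slope, FE slope)."""
    rng = random.Random(seed)
    x_all, y_all, x_within, y_within = [], [], [], []
    for _ in range(n_workers):
        ability = rng.gauss(0, 1)
        x_i = [rho * ability + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
               for _ in range(n_periods)]
        y_i = [beta * x + heterogeneity * ability + rng.gauss(0, noise_sd)
               for x in x_i]
        mx, my = fmean(x_i), fmean(y_i)
        x_all += x_i
        y_all += y_i
        x_within += [x - mx for x in x_i]   # worker-demeaned regressor
        y_within += [y - my for y in y_i]   # worker-demeaned outcome
    return slope(x_all, y_all), slope(x_within, y_within)


draws = [one_draw(seed) for seed in range(100)]
pooled = [d[0] for d in draws]
fe = [d[1] for d in draws]

for name, est in [("pooled OLS", pooled), ("one-way FE", fe)]:
    print(f"{name}: mean={fmean(est):.3f}  "
          f"bias={fmean(est) - 0.10:+.3f}  sd={stdev(est):.3f}")
```

If the pooled bias were just noise, its average across seeds would shrink toward zero. Under positive selection it does not, while the FE draws centre on the true 0.10.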

Clustered SE Showdown — same coefficient, different t-statistic

Point estimates do not care about the SE method; inference does. The post (§8) reports five SE assumptions on the same model. When you cluster by group, the SE inflates because within-group errors travel together: repeated observations of the same worker are not independent observations. The t-statistic on X1 falls from 12.2 (HC1) to 8.2 (CRV3), a 33% reduction, with no change in the point estimate.

PyFixest reports (from §8)

Hover any row to see the SE inflation factor relative to iid.

What to look for

iid → HC1 barely moves the SE (0.0858 → 0.0833). Heteroskedasticity is not a major concern in the synthetic dataset; HC1 occasionally shrinks the SE.
CRV1 (group_id) raises the SE by about 40% (0.0833 → 0.1172). The big shift comes from accounting for within-group correlation: observations in the same group share unobserved shocks.
Multi-way CRV1 (group_id + f2) inflates further (0.1207). The two-way cluster correction adds protection against correlation in both dimensions simultaneously.
CRV3 is the most conservative (0.1247). Its finite-sample bias correction is preferred when the number of clusters is small.
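The arithmetic behind these comparisons is easy to reproduce. The snippet below takes the coefficient and the five standard errors quoted above and derives each t-statistic and its inflation factor relative to iid; the dictionary labels are shorthand chosen here, not PyFixest output.

```python
beta_hat = -1.019  # coefficient on X1 quoted from the post

# Standard errors quoted above, keyed by a shorthand label.
reported_se = {
    "iid": 0.0858,
    "HC1": 0.0833,
    "CRV1 (group_id)": 0.1172,
    "CRV1 (group_id + f2)": 0.1207,
    "CRV3": 0.1247,
}

t_stats = {name: abs(beta_hat) / se for name, se in reported_se.items()}
inflation = {name: se / reported_se["iid"] for name, se in reported_se.items()}

for name in reported_se:
    print(f"{name:<22} SE={reported_se[name]:.4f}  "
          f"t={t_stats[name]:5.2f}  vs iid={inflation[name]:.2f}x")
```

Note that the 40% jump is measured from HC1; measured from the iid SE, the CRV1 inflation factor is closer to 1.37.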

Coefficient β̂ −1.019

Post estimate: −1.019. Vary to see how t-statistics behave for stronger/weaker effects.

iid SE 0.0858

Baseline (assumes independence). Post: 0.0858.

Cluster inflation factor 1.40

Multiplier applied to iid SE to get clustered SE. Post CRV1: 1.40 relative to HC1 (= 0.1172 / 0.0833); relative to the iid SE of 0.0858 the ratio is about 1.37.

iid t-stat: |β̂| / iid SE

Clustered t-stat: |β̂| / (mult × iid SE)

iid significant at 5%? Yes if |t| > 1.96

Clustered significant at 5%? Yes if |t| > 1.96

Why clustering inflates SEs

If a worker's residuals are correlated across years (motivation persists), then ten years of one worker is not ten independent observations; it is closer to one or two effective observations. The iid formula counts ten observations; CRV1 corrects this by treating each cluster (worker) as the sampling unit. Cameron and Miller (2015, §2) give the closed-form derivation; the post applies it in §8 and §11.4 with worker-level clustering (vcov={"CRV1": "nr"}).
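To make the mechanism concrete, the sketch below computes an iid SE and a CRV1-style SE by hand for a one-regressor model with an intercept. The data-generating process is an assumption (both the regressor and the error carry a persistent cluster component), and the small-sample factor G/(G−1) × (N−1)/(N−k) is the common Stata-style choice; PyFixest's defaults may differ in detail.

```python
import random


def simulate(n_clusters=200, n_periods=8, beta=0.10, seed=0):
    """Clustered data: x and the error each share a persistent cluster part."""
    rng = random.Random(seed)
    cluster, x, y = [], [], []
    for g in range(n_clusters):
        x_g = rng.gauss(0, 1)  # persistent part of the regressor
        e_g = rng.gauss(0, 1)  # persistent part of the error ("motivation")
        for _ in range(n_periods):
            xi = x_g + rng.gauss(0, 1)
            cluster.append(g)
            x.append(xi)
            y.append(beta * xi + e_g + rng.gauss(0, 1))
    return cluster, x, y


def ols_with_ses(cluster, x, y):
    """Slope of y on x (with intercept) plus iid and CRV1 standard errors."""
    n, k = len(x), 2
    mx, my = sum(x) / n, sum(y) / n
    xd = [v - mx for v in x]                    # intercept partialled out
    sxx = sum(v * v for v in xd)
    beta = sum(a * (b - my) for a, b in zip(xd, y)) / sxx
    resid = [(b - my) - beta * a for a, b in zip(xd, y)]

    # iid: every row counts as an independent observation.
    sigma2 = sum(r * r for r in resid) / (n - k)
    se_iid = (sigma2 / sxx) ** 0.5

    # CRV1: sum the scores x*e within each cluster before squaring.
    scores = {}
    for g, a, r in zip(cluster, xd, resid):
        scores[g] = scores.get(g, 0.0) + a * r
    n_g = len(scores)
    meat = sum(s * s for s in scores.values())
    correction = n_g / (n_g - 1) * (n - 1) / (n - k)
    se_crv1 = (correction * meat) ** 0.5 / sxx
    return beta, se_iid, se_crv1


beta, se_iid, se_crv1 = ols_with_ses(*simulate())
print(f"beta={beta:.3f}  iid SE={se_iid:.4f}  CRV1 SE={se_crv1:.4f}  "
      f"ratio={se_crv1 / se_iid:.2f}")
```

The only difference between the two formulas is whether the products of regressor and residual are squared row by row or summed within a worker first. When residuals and regressors are both persistent within a worker, those within-cluster sums are large and the clustered SE grows.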

The practical rule is to cluster at the level where treatment is assigned. For the Mincer panel that is the worker; for a DiD with state-level adoption it would be the state. Failing to cluster almost always understates SEs and falsely declares effects significant — a major source of replication failures in applied economics.

Connecting back to Tab 2

The Tab 2 simulator estimates panel FE coefficients without computing SEs, but they would respond to clustering much like the slider here. If you ran Tab 2 with T = 8 and clustered SEs by worker, you would typically find the clustered SE larger than the iid SE, on the order of the 1.4× multiplier used above. The cluster correction is mechanical: it depends on the intra-cluster correlation of the residuals and regressors, not on the point estimate. Inflating the iid SE by the square root of the effective-sample-size shrinkage factor gives a close approximation to CRV1.
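One common way to put a number on that shrinkage factor is the Moulton approximation, sketched below. It is not taken from the post: it assumes equal-sized clusters and a pooled regression, and the intra-cluster correlations in the example are hypothetical values picked to land near the 1.4 multiplier.

```python
def moulton_se_factor(cluster_size, rho_x, rho_resid):
    """Approximate ratio of clustered SE to iid SE for equal-sized clusters.

    rho_x and rho_resid are the intra-cluster correlations of the regressor
    and of the residuals; the variance inflates by 1 + (m - 1) * rho_x * rho_resid.
    """
    return (1 + (cluster_size - 1) * rho_x * rho_resid) ** 0.5


# Hypothetical correlations with T = 8 observations per worker.
factor = moulton_se_factor(cluster_size=8, rho_x=0.4, rho_resid=0.35)
print(f"approximate SE inflation: {factor:.2f}x")

# No within-cluster correlation in the regressor means no inflation.
print(moulton_se_factor(cluster_size=8, rho_x=0.0, rho_resid=0.9))
```

The factor grows with cluster size and with the product of the two correlations, which is why long panels with persistent regressors and persistent shocks are where iid SEs mislead most.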
