Interactive data dictionary

Evaluating the Economic Impact of the Aceh Tsunami

Companion data for a causal-inference replication of Heger & Neumayer (2019) on fully synthetic, calibrated panels.

2 datasets · 49 variables · 125 districts / 276 sub-districts · years 1999–2012

Downloads

Each dataset is available as a labeled Stata .dta and its source file.

Download all data (ZIP) · stata_codebook.do

| Dataset | Grain | Rows × columns | Stata | Source |
|---|---|---|---|---|
| aceh_tsunami_district_panel | district-year | 1,750 × 30 | aceh_tsunami_district_panel.dta | aceh_tsunami_district_panel.csv |
| aceh_tsunami_subdistrict_panel | kecamatan-year | 3,864 × 19 | aceh_tsunami_subdistrict_panel.dta | aceh_tsunami_subdistrict_panel.csv |

Run stata_codebook.do in Stata once to attach long-form per-variable notes to the .dta files.

Load directly in code

Every file loads straight from GitHub (raw URLs). Swap the file name to load any dataset.

Stata

* Stata 14+ : `use` reads an https URL directly
global BASE "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_did_sc_tsunami/data/"
use "${BASE}aceh_tsunami_district_panel.dta", clear
describe
notes

Python

!pip install -q pyreadstat
import pandas as pd
BASE = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_did_sc_tsunami/data/"
df = pd.read_stata(BASE + "aceh_tsunami_district_panel.dta")

# load every dataset at once
files = ["aceh_tsunami_district_panel", "aceh_tsunami_subdistrict_panel"]
data = {f: pd.read_stata(BASE + f + ".dta") for f in files}

# pyreadstat (richest metadata) reads LOCAL files -> download first
import pyreadstat, urllib.request
urllib.request.urlretrieve(BASE + "aceh_tsunami_district_panel.dta", "aceh_tsunami_district_panel.dta")
df, meta = pyreadstat.read_dta("aceh_tsunami_district_panel.dta")

To try it without a local setup, paste this snippet into a new Google Colab notebook: https://colab.research.google.com/notebooks/empty.ipynb

R

# R : haven::read_dta auto-downloads an https URL
library(haven)
BASE <- "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_did_sc_tsunami/data/"
df <- read_dta(paste0(BASE, "aceh_tsunami_district_panel.dta"))

Overview & sources

Companion data for a hands-on Python tutorial that evaluates the long-run economic impact of the 2004 Indian Ocean tsunami on the Indonesian province of Aceh, replicating Heger & Neumayer (2019) on fully synthetic, calibrated panels. The post treats coastal inundation as a quasi-natural experiment and estimates a dynamic four-period difference-in-differences with pyfixest, an event study with diff-diff, a night-lights dose-response, a synthetic control with mlsynth, and Conley spatial-HAC standard errors validated by Moran’s I. Flooded districts lost about 7.9% of output in 2005 but grew 6.3 percentage points per year faster during 2006–08, and the synthetic control places flooded Aceh +18.3% above its no-tsunami counterfactual by 2012. The data-generating process is calibrated so that re-running the paper’s analyses reproduces its findings (signs, significance, approximate magnitudes); it is for teaching the methods, not for drawing new conclusions about Aceh.

Two files. aceh_tsunami_district_panel is an annual district panel (one row per district × year) of 125 Sumatran districts over 1999–2012 carrying district GDP growth, covariates, treatment indicators, and centroids for spatial inference. aceh_tsunami_subdistrict_panel is a finer annual panel of 276 Aceh sub-districts (kecamatans) over the same years, carrying satellite night-lights and continuous flood intensity for the dose-response analysis.

Data sources

| Source | Provides | Reference / URL |
|---|---|---|
| Heger & Neumayer (2019) | Replicated study; calibration targets (coefficient signs, significance, approximate magnitudes) and the empirical design | Heger, M. P., & Neumayer, E. (2019). The impact of the Indian Ocean tsunami on Aceh's long-term economic growth. Journal of Development Economics, 141, 102365. https://doi.org/10.1016/j.jdeveco.2019.06.008 |
| Synthetic (this study) | All values: simulated via a calibrated data-generating process (open & reproducible) | Mendez, C. (2026). See the post's Python script script.py and reference/generate_synthetic_data.py for the full DGP. |
| Real-world analogues (proxied, not used directly) | The constructs the synthetic series imitate: district GDP (INDO-DAPOER / SUSENAS), night-lights (DMSP-OLS), inundation maps (DLR/ZKI, Dartmouth Flood Observatory) | World Bank INDO-DAPOER (https://datacatalog.worldbank.org/); NOAA DMSP-OLS Nighttime Lights (https://www.ncei.noaa.gov/); DLR/ZKI & Dartmouth Flood Observatory inundation maps. |
| Method references | Estimators and concepts | Abadie, Diamond & Hainmueller (2010, synthetic control); Conley (1999, spatial-HAC standard errors). |

Cite this data

Please cite this dataset as follows.

APA

Mendez, C. (2026). Bouncing Back Better? Evaluating the Economic Impact of the Aceh Tsunami [Data set]. https://carlos-mendez.org/post/python_did_sc_tsunami/

Heger, M. P., & Neumayer, E. (2019). The impact of the Indian Ocean tsunami on Aceh's long-term economic growth. Journal of Development Economics, 141, 102365. https://doi.org/10.1016/j.jdeveco.2019.06.008

BibTeX

@misc{mendez2026pythondidsctsunami,
  author       = {Mendez, Carlos},
  title        = {Bouncing Back Better? Evaluating the Economic Impact of the Aceh Tsunami},
  year         = {2026},
  howpublished = {\url{https://carlos-mendez.org/post/python_did_sc_tsunami/}},
  note         = {Data set}
}

@article{heger2019impact,
  author  = {Heger, Martin Philipp and Neumayer, Eric},
  title   = {The impact of the Indian Ocean tsunami on Aceh's long-term economic growth},
  journal = {Journal of Development Economics},
  volume  = {141}, pages = {102365}, year = {2019},
  doi     = {10.1016/j.jdeveco.2019.06.008}
}

Variable explorer

All 41 distinct variables across the two files (eight variables appear in both, which is why the per-file column counts sum to 49). The Distribution column summarizes each variable's values.

| Variable | Type | Distribution | Label | Definition | Units | In files | Source |
|---|---|---|---|---|---|---|---|
| area_km2 | continuous | min 31.5; median 173; max 967 | Area (km^2) | Approximate land area of the kecamatan; sets the number of night-light pixels. | km^2 | aceh_tsunami_subdistrict_panel | Simulation |
| avg_luminosity | continuous | min 0; median 0.844; max 56.3 | Average luminosity (DN 0-63) | Mean Digital Number (brightness) across the kecamatan's pixels; flooded 2004 mean ~5.79, non-flooded ~2.36. | DN (0-63) | aceh_tsunami_subdistrict_panel | Simulation |
| capital_formation_pc_usd | continuous | min 2; median 72.6; max 234 | Capital formation per capita (current USD) | Gross capital formation per capita; reproduces the post-tsunami investment bonanza. | current USD per capita | aceh_tsunami_district_panel | Simulation |
| coastal | dummy | share coded 1 = 0.696 | Coastal dummy (1=coastal) | 1 if the district lies on the coast, 0 if inland. | 0/1 | aceh_tsunami_district_panel | Assigned (this study) |
| district_id | identifier | | District ID | Unique identifier for the district (Kabupaten/Kota); the panel key. | string | aceh_tsunami_district_panel | Assigned (this study) |
| district_name | identifier | | District name | Name of the district. Real names for Aceh's 23 districts; systematic placeholders elsewhere. | string | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Assigned (this study) |
| district_type | identifier | | District type | Kota (urban city district) vs Kabupaten (rural regency). | {Kota, Kabupaten} | aceh_tsunami_district_panel | Assigned (this study) |
| doctors_per_1000 | continuous | min 0.152; median 0.318; max 0.55 | Doctors per 1,000 | Physicians per 1,000 people; Aceh rises faster after 2005 (synthetic-control predictor). | per 1,000 | aceh_tsunami_district_panel | Simulation |
| electricity_access_pct | continuous | min 74.3; median 87.5; max 99.9 | Electricity access (%) | % of households with electricity; Aceh 84% (2004) -> 97% (2012). | % | aceh_tsunami_district_panel | Simulation |
| flood_intensity_quintile | identifier | | Flood-intensity quintile (1-5; 0=non-flooded) | Quintile of the flooding-intensity distribution among flooded units; only the top quintile shows a significant effect. | 0-5 | aceh_tsunami_subdistrict_panel | Derived |
| flood_treatment_group | identifier | | Treatment-group label | Readable label combining treatment status and region (convenience for selecting control pools). | category | aceh_tsunami_district_panel | Derived |
| flooded | dummy | share coded 1 = 0.096 | Flooded / treated dummy (1=flooded) | Treatment indicator: 1 if the unit was flooded by the 2004 tsunami (the DiD 'D' variable). | 0/1 | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Assigned (this study) |
| gdp_const_usd_m | continuous | min 33.1; median 476; max 3.75e+03 | Real GDP (million constant 2004 USD) | District real GDP excluding oil & gas, constant 2004 USD, millions. | million constant 2004 USD | aceh_tsunami_district_panel | Simulation |
| gdp_growth | continuous | min -0.168; median 0.0521; max 0.292 | GDP growth rate (log difference) | Annual growth rate of real district GDP (main dependent variable). | proportion/yr | aceh_tsunami_district_panel | Simulation |
| gdp_pc_growth | continuous | min -0.204; median 0.0375; max 0.296 | GDP per-capita growth rate | Annual growth rate of real GDP per capita; no significant 2005 loss, significant 2006-08 gain. | proportion/yr | aceh_tsunami_district_panel | Derived |
| gdp_pc_usd | continuous | min 389; median 1.23e+03; max 5.89e+03 | GDP per capita (constant 2004 USD) | Real GDP per capita. | constant 2004 USD | aceh_tsunami_district_panel | Derived |
| hdi | continuous | min 64.6; median 69; max 74.2 | Human Development Index (0-100) | Human Development Index; Aceh ~69 -> ~73 over the period. | index 0-100 | aceh_tsunami_district_panel | Simulation |
| kecamatan_id | identifier | | Sub-district (kecamatan) ID | Unique identifier for the sub-district; the sub-district panel key. | string | aceh_tsunami_subdistrict_panel | Assigned (this study) |
| kecamatan_name | identifier | | Sub-district name | Readable name linking the kecamatan to its parent district. | string | aceh_tsunami_subdistrict_panel | Assigned (this study) |
| latitude | continuous | min -5.36; median 0.268; max 5.82 | Latitude (decimal degrees, +N) | Unit-centroid latitude; time-invariant. Enables Conley spatial standard errors. | degrees | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Assigned (this study) |
| longitude | continuous | min 95.3; median 101; max 108 | Longitude (decimal degrees, +E) | Unit-centroid longitude; time-invariant. Used with latitude for haversine distances. | degrees | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Assigned (this study) |
| n_pixels | continuous | min 37; median 201; max 1.12e+03 | Pixel count | Number of ~0.86 km^2 night-light grid cells in the kecamatan. | count | aceh_tsunami_subdistrict_panel | Derived |
| neighbour_of_flooded | dummy | share coded 1 = 0.064 | Neighbour-of-flooded dummy (1=neighbour) | 1 if a non-flooded district borders a flooded one (placebo-treated in the robustness test). | 0/1 | aceh_tsunami_district_panel | Assigned (this study) |
| nl_growth | continuous | min -0.0706; median 0.032; max 0.157 | Night-lights growth rate (log difference) | Annual log-difference growth rate of night-lights (main sub-district dependent variable). | proportion/yr | aceh_tsunami_subdistrict_panel | Simulation |
| nl_log | continuous | min -6.81; median 4.95; max 9.59 | Log luminosity (log DN-sum) | log(DN-sum + 0.001); the transformed regression variable matching the paper's log night-lights. | log DN-sum | aceh_tsunami_subdistrict_panel | Simulation |
| nl_sum | continuous | min 0.0001; median 142; max 1.46e+04 | Summed luminosity (DN-sum) | Sum of Digital Numbers over all pixels in the kecamatan (the unit-level activity measure). | DN-sum | aceh_tsunami_subdistrict_panel | Derived |
| period | identifier | | DiD event-time period | Event-time period for the staggered DiD dummies; baseline 2000-02 is the omitted reference. | category | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Derived |
| pop_growth | continuous | min -0.102; median 0.0158; max 0.0833 | Population growth rate | Annual population growth rate (carries the 2005 death/displacement shock). | proportion/yr | aceh_tsunami_district_panel | Simulation |
| population | continuous | min 5.91e+04; median 4.08e+05; max 3.61e+06 | Population (persons) | District population; drives the per-capita denominator. | persons | aceh_tsunami_district_panel | Simulation |
| post | dummy | share coded 1 = 0.571 | Post-tsunami dummy (1=2005+) | 1 for years 2005 and later (simple pre/post split). | 0/1 | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Derived |
| poverty_rate | continuous | min 9.09; median 18; max 32.2 | Poverty rate (%) | Share of population below the poverty line; Aceh improves after 2005 (synthetic-control predictor). | % | aceh_tsunami_district_panel | Simulation |
| province | identifier | | Province | Indonesian province the unit belongs to (district panel: 10 Sumatra provinces; sub-district panel: always Aceh). | string | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Assigned (this study) |
| region_group | identifier | | Region group | Coarse grouping used to build estimation samples. | {Aceh, North Sumatra, Rest of Sumatra} | aceh_tsunami_district_panel | Derived |
| sanitation_access_pct | continuous | min 42.7; median 58.7; max 79.2 | Sanitation access (%) | % of households with sanitation access; Aceh boosted after 2005 (synthetic-control predictor). | % | aceh_tsunami_district_panel | Simulation |
| share_area_flooded | continuous | min 0; median 0; max 0.00541 | Share of area flooded | Share of the kecamatan's physical area flooded; tiny mean (~1.2%) gives a large coefficient. | 0-1 | aceh_tsunami_subdistrict_panel | Simulation |
| share_pop_flooded | continuous | min 0; median 0; max 0.558 | Share of population flooded | Share of the kecamatan's population in flooded area; the headline continuous dose. | 0-1 | aceh_tsunami_subdistrict_panel | Simulation |
| va_agri_share | continuous | min 8.64; median 44.1; max 62.7 | Agriculture VA share (% of GDP) | Agriculture value added as % of GDP; Aceh falls 44->32% after 2004. | % of GDP | aceh_tsunami_district_panel | Simulation |
| va_manu_share | continuous | min 1.28; median 5.68; max 9.82 | Manufacturing VA share (% of GDP) | Manufacturing value added as % of GDP; Aceh falls ~6->3.5% after 2004. | % of GDP | aceh_tsunami_district_panel | Simulation |
| va_serv_share | continuous | min 21.6; median 40.8; max 77.5 | Services VA share (% of GDP) | Services / tertiary value added as % of GDP; Aceh rises ~40->55% after 2004. | % of GDP | aceh_tsunami_district_panel | Simulation |
| water_access_pct | continuous | min 52.8; median 67.2; max 87.6 | Water access (%) | % of households with clean-water access; Aceh boosted after 2005 (synthetic-control predictor). | % | aceh_tsunami_district_panel | Simulation |
| year | year | | Calendar year | Calendar year of the observation. Panel spans 1999-2012 (levels); growth rates defined 2000-2012. | year | aceh_tsunami_district_panel, aceh_tsunami_subdistrict_panel | Simulation |

Cross-file variable index

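Eight variables appear in both files: district_name, province, flooded, latitude, longitude, year, post, and period. Every other variable appears in one file only, as listed in the "In files" column of the variable explorer.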

Construction & formulas

The disaster is treated as a quasi-natural experiment: coastal geography (not economic prospects) decided which districts the wave flooded. Treatment is split into event-time windows measured against the omitted 2000–02 baseline: pre (2003–04), tsunami (2005), recovery (2006–08), and postrec (2009–12).
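The event-time windows above can be written as a small lookup. This is a minimal sketch, not the author's script: the window boundaries come from this page (the dictionary's period variable also labels 1999 as a base year), while the function names and the exact label strings are assumptions made for illustration.

```python
def period_of(year: int) -> str:
    """Map a calendar year to its event-time window (label strings are assumed)."""
    if year == 1999:
        return "(base year)"   # levels only; growth rates start in 2000
    if 2000 <= year <= 2002:
        return "baseline"      # omitted reference window
    if 2003 <= year <= 2004:
        return "pre"
    if year == 2005:
        return "tsunami"
    if 2006 <= year <= 2008:
        return "recovery"
    if 2009 <= year <= 2012:
        return "postrec"
    raise ValueError(f"year {year} is outside the 1999-2012 panel")


def post_of(year: int) -> int:
    """Simple pre/post split: 1 for 2005 and later, else 0."""
    return int(year >= 2005)


# One label per panel year, 1999 through 2012
periods = {y: period_of(y) for y in range(1999, 2013)}
```

Eight of the fourteen panel years fall in 2005 or later, which is where the 0.571 share reported for the post dummy comes from.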

Synthetic data-generating process. GDP and night-lights levels are cumulated from a base year using simulated growth series built as district/kecamatan FE + year FE + a treated×period increment + spatial & serial shocks + Gaussian noise, tuned so a fixed-effects DiD recovers the paper’s coefficients column by column (within about 0.005 on the headline cells). Centroids and continuous flood doses are injected so the spatial standard errors and the dose-response behave like the paper’s without moving the point estimates.
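A stripped-down version of this data-generating process may help make the description concrete. The sketch below is an illustration under stated assumptions, not the author's generate_synthetic_data.py: it keeps only unit effects, year effects, a treated×period increment, and Gaussian noise (no spatial or serial shocks), the increments and noise scales are hypothetical round numbers rather than the calibrated values, and a difference-of-means DiD against the 2000–02 baseline stands in for the fixed-effects regression.

```python
import math
import random

random.seed(0)

YEARS = list(range(1999, 2013))
N_UNITS, N_TREATED = 125, 12   # panel size and flooded count reported on this page

# Hypothetical treated-by-period increments (NOT the calibrated values)
INCREMENT = {"tsunami": -0.08, "recovery": 0.06}


def window(year):
    """Return the treatment window a year belongs to (only two carry an effect here)."""
    if year == 2005:
        return "tsunami"
    if 2006 <= year <= 2008:
        return "recovery"
    return "other"


unit_fe = [random.gauss(0.05, 0.01) for _ in range(N_UNITS)]
year_fe = {y: random.gauss(0.0, 0.01) for y in YEARS[1:]}

growth, level = {}, {}          # keyed by (unit, year)
for i in range(N_UNITS):
    treated = i < N_TREATED
    log_level = math.log(500.0)  # hypothetical 1999 base level
    level[(i, 1999)] = math.exp(log_level)
    for y in YEARS[1:]:
        g = unit_fe[i] + year_fe[y] + random.gauss(0.0, 0.02)
        if treated:
            g += INCREMENT.get(window(y), 0.0)
        growth[(i, y)] = g
        log_level += g           # levels are cumulated log differences
        level[(i, y)] = math.exp(log_level)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def did(window_years, base_years=(2000, 2001, 2002)):
    """Difference-in-differences of mean growth: window vs baseline, treated vs control."""
    def avg(is_treated, years):
        units = range(N_TREATED) if is_treated else range(N_TREATED, N_UNITS)
        return mean(growth[(i, y)] for i in units for y in years)
    return (avg(True, window_years) - avg(True, base_years)) - (
        avg(False, window_years) - avg(False, base_years))


tsunami_effect = did([2005])
recovery_effect = did([2006, 2007, 2008])
print(round(tsunami_effect, 3), round(recovery_effect, 3))
```

Because the panel is balanced, unit and year effects cancel in the double difference, so the two estimates land close to the injected increments; the remaining gap is sampling noise from the twelve treated units.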

The datasets

Each dataset below has a full variable dictionary and a statistics table with distribution summaries and data coverage.

aceh_tsunami_district_panel · district-year · 1,750 × 30 · 1999-2012 · 125 Sumatran districts (10 flooded Aceh districts treated)

Panel key: district_id × year. Used for the dynamic DiD on district GDP growth, the synthetic control, and Conley spatial inference.

Variable dictionary

| Variable | Type | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|---|
| district_id | identifier | District ID | Unique identifier for the district (Kabupaten/Kota); the panel key. | Assigned as <PROVINCE-ABBREV>_D<nn>; Aceh districts carry real names in district_name. | string | Assigned (this study) | 125 districts |
| district_name | identifier | District name | Name of the district. Real names for Aceh's 23 districts; systematic placeholders elsewhere. | Real Aceh district names hand-coded from the paper's maps; other provinces use 'Province District k'. | string | Assigned (this study) | 125 districts / 276 kecamatans |
| province | identifier | Province | Indonesian province the unit belongs to (district panel: 10 Sumatra provinces; sub-district panel: always Aceh). | Hand-coded; 10 Sumatra provinces in the district panel, constant 'Aceh' in the sub-district panel. | string | Assigned (this study) | |
| region_group | identifier | Region group | Coarse grouping used to build estimation samples. | Derived from province: Aceh, North Sumatra, or Rest of Sumatra. | {Aceh, North Sumatra, Rest of Sumatra} | Derived | |
| district_type | identifier | District type | Kota (urban city district) vs Kabupaten (rural regency). | Hand-coded to reproduce the paper's city/rural sample sizes (Aceh: 5 Kota, 18 Kabupaten). | {Kota, Kabupaten} | Assigned (this study) | |
| coastal | dummy | Coastal dummy (1=coastal) | 1 if the district lies on the coast, 0 if inland. | Hand-coded; all 10 treated districts are coastal. Used to drop inland controls. | 0/1 | Assigned (this study) | |
| flooded | dummy | Flooded / treated dummy (1=flooded) | Treatment indicator: 1 if the unit was flooded by the 2004 tsunami (the DiD 'D' variable). | Hand-coded from the inundation maps; district panel: 10 Aceh + 2 North Sumatra island districts; sub-district panel: 68 of 276 kecamatans. | 0/1 | Assigned (this study) | |
| neighbour_of_flooded | dummy | Neighbour-of-flooded dummy (1=neighbour) | 1 if a non-flooded district borders a flooded one (placebo-treated in the robustness test). | Hand-coded adjacency for the placebo test; flooded districts are dropped in that test. | 0/1 | Assigned (this study) | |
| flood_treatment_group | identifier | Treatment-group label | Readable label combining treatment status and region (convenience for selecting control pools). | Derived from flooded + region_group. | category | Derived | |
| latitude | continuous | Latitude (decimal degrees, +N) | Unit-centroid latitude; time-invariant. Enables Conley spatial standard errors. | Real approximate centroids for Aceh's 23 districts; synthetic non-Aceh districts drawn within the province bounding box. Sub-districts: parent centroid + ~20 km jitter. | degrees | Assigned (this study) | |
| longitude | continuous | Longitude (decimal degrees, +E) | Unit-centroid longitude; time-invariant. Used with latitude for haversine distances. | See latitude. Used for the Conley spatial kernel (≤100 km). | degrees | Assigned (this study) | |
| year | year | Calendar year | Calendar year of the observation. Panel spans 1999-2012 (levels); growth rates defined 2000-2012. | Annual index. | year | Simulation | |
| post | dummy | Post-tsunami dummy (1=2005+) | 1 for years 2005 and later (simple pre/post split). | 1 if year >= 2005. | 0/1 | Derived | |
| period | identifier | DiD event-time period | Event-time period for the staggered DiD dummies; baseline 2000-02 is the omitted reference. | Mapped from year: '(base year)' 1999, baseline 2000-02, pre 2003-04, tsunami 2005, recovery 2006-08, postrec 2009-12. | category | Derived | |
| gdp_const_usd_m | continuous | Real GDP (million constant 2004 USD) | District real GDP excluding oil & gas, constant 2004 USD, millions. | Cumulated from the 1999 base level using the simulated gdp_growth series. | million constant 2004 USD | Simulation | district panel |
| gdp_growth | continuous | GDP growth rate (log difference) | Annual growth rate of real district GDP (main dependent variable). | district FE + year FE + treated increment (city/rural) + Aceh-control spillover + spatial & serial shocks + N(0,0.04) noise. | proportion/yr | Simulation | 1,621 rows (NaN in 1999; Subulussalam 2003-06) |
| population | continuous | Population (persons) | District population; drives the per-capita denominator. | Cumulated from the 1999 base using pop_growth (flooded districts lose ~9.6% in 2005). | persons | Simulation | district panel |
| pop_growth | continuous | Population growth rate | Annual population growth rate (carries the 2005 death/displacement shock). | district FE + year FE + flooded × population increment (death shock 2005) + noise. | proportion/yr | Simulation | NaN in 1999; Subulussalam 2003-06 |
| gdp_pc_usd | continuous | GDP per capita (constant 2004 USD) | Real GDP per capita. | gdp_const_usd_m * 1e6 / population. | constant 2004 USD | Derived | district panel |
| gdp_pc_growth | continuous | GDP per-capita growth rate | Annual growth rate of real GDP per capita; no significant 2005 loss, significant 2006-08 gain. | gdp_growth - pop_growth (reproduces the paper's per-capita table by construction). | proportion/yr | Derived | NaN where growth missing |
| va_agri_share | continuous | Agriculture VA share (% of GDP) | Agriculture value added as % of GDP; Aceh falls 44->32% after 2004. | Province trajectory + district/type offset. | % of GDP | Simulation | |
| va_manu_share | continuous | Manufacturing VA share (% of GDP) | Manufacturing value added as % of GDP; Aceh falls ~6->3.5% after 2004. | Province trajectory + noise. | % of GDP | Simulation | |
| va_serv_share | continuous | Services VA share (% of GDP) | Services / tertiary value added as % of GDP; Aceh rises ~40->55% after 2004. | Province trajectory + offset. | % of GDP | Simulation | |
| capital_formation_pc_usd | continuous | Capital formation per capita (current USD) | Gross capital formation per capita; reproduces the post-tsunami investment bonanza. | Smooth path + reconstruction spike (peak 2006) for Aceh; smooth for donors. | current USD per capita | Simulation | |
| poverty_rate | continuous | Poverty rate (%) | Share of population below the poverty line; Aceh improves after 2005 (synthetic-control predictor). | District base + downward trend. | % | Simulation | |
| doctors_per_1000 | continuous | Doctors per 1,000 | Physicians per 1,000 people; Aceh rises faster after 2005 (synthetic-control predictor). | District base + upward trend. | per 1,000 | Simulation | |
| water_access_pct | continuous | Water access (%) | % of households with clean-water access; Aceh boosted after 2005 (synthetic-control predictor). | District base + upward trend. | % | Simulation | |
| sanitation_access_pct | continuous | Sanitation access (%) | % of households with sanitation access; Aceh boosted after 2005 (synthetic-control predictor). | District base + upward trend. | % | Simulation | |
| electricity_access_pct | continuous | Electricity access (%) | % of households with electricity; Aceh 84% (2004) -> 97% (2012). | Aceh boosted path; others smooth upward trend. | % | Simulation | |
| hdi | continuous | Human Development Index (0-100) | Human Development Index; Aceh ~69 -> ~73 over the period. | Aceh boosted path; others smooth upward trend. | index 0-100 | Simulation | |
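The two derived per-capita variables in the dictionary follow directly from their construction rules. The sketch below uses made-up toy numbers for a single district, not values from the panel, and assumes that pop_growth is a log difference like gdp_growth (only the GDP growth label states this explicitly); under that assumption, gdp_growth - pop_growth equals the log growth of GDP per capita exactly.

```python
import math


def gdp_pc_usd(gdp_const_usd_m: float, population: float) -> float:
    """GDP per capita in USD from GDP in millions of USD and a head count."""
    return gdp_const_usd_m * 1e6 / population


def log_growth(current: float, previous: float) -> float:
    """Growth rate as a log difference."""
    return math.log(current / previous)


# Toy values for one hypothetical flooded district (illustrative only)
gdp_2004, gdp_2005 = 476.0, 440.0          # million constant 2004 USD
pop_2004, pop_2005 = 408_000, 370_000      # persons

gdp_growth = log_growth(gdp_2005, gdp_2004)
pop_growth = log_growth(pop_2005, pop_2004)

# Dictionary rule: per-capita growth = GDP growth minus population growth
gdp_pc_growth = gdp_growth - pop_growth

# Same quantity computed directly from the per-capita levels
direct = log_growth(gdp_pc_usd(gdp_2005, pop_2005), gdp_pc_usd(gdp_2004, pop_2004))

print(round(gdp_growth, 4), round(pop_growth, 4), round(gdp_pc_growth, 4))
```

In this toy case output falls while population falls by more, so per-capita growth is positive even in the shock year, the same mechanism the dictionary cites for the absence of a significant 2005 per-capita loss.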

Distribution & statistics

| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
| district_id | | 100% | 1,750 | 125 | | | | | |
| district_name | | 100% | 1,750 | 125 | | | | | |
| province | | 100% | 1,750 | 10 | | | | | |
| region_group | | 100% | 1,750 | 3 | | | | | |
| district_type | | 100% | 1,750 | 2 | | | | | |
| coastal | share coded 1 = 0.696 | 100% | 1,750 | 2 | 0 | 0.696 | 1.00 | 1.00 | 0.460 |
| flooded | share coded 1 = 0.096 | 100% | 1,750 | 2 | 0 | 0.096 | 0 | 1.00 | 0.295 |
| neighbour_of_flooded | share coded 1 = 0.064 | 100% | 1,750 | 2 | 0 | 0.064 | 0 | 1.00 | 0.245 |
| flood_treatment_group | | 100% | 1,750 | 4 | | | | | |
| latitude | min -5.36; median 0.268; max 5.82 | 100% | 1,750 | 125 | -5.36 | 0.067 | 0.268 | 5.82 | 3.23 |
| longitude | min 95.3; median 101; max 108 | 100% | 1,750 | 121 | 95.32 | 100.9 | 100.6 | 108.3 | 3.11 |
| year | | 100% | 1,750 | 14 | 1999 | 2005.5 | 2005 | 2012 | 4.03 |
| post | share coded 1 = 0.571 | 100% | 1,750 | 2 | 0 | 0.571 | 1.00 | 1.00 | 0.495 |
| period | | 100% | 1,750 | 6 | | | | | |
| gdp_const_usd_m | min 33.1; median 476; max 3.75e+03 | 100% | 1,750 | 1,749 | 33.09 | 671.2 | 476.3 | 3,748.4 | 594.0 |
| gdp_growth | min -0.168; median 0.0521; max 0.292 | 93% | 1,621 | 1,561 | -0.168 | 0.052 | 0.052 | 0.292 | 0.066 |
| population | min 5.91e+04; median 4.08e+05; max 3.61e+06 | 100% | 1,750 | 1,748 | 59,071 | 484,305 | 407,631 | 3,613,412 | 400,842 |
| pop_growth | min -0.102; median 0.0158; max 0.0833 | 93% | 1,621 | 1,453 | -0.102 | 0.016 | 0.016 | 0.083 | 0.022 |
| gdp_pc_usd | min 389; median 1.23e+03; max 5.89e+03 | 100% | 1,750 | 1,740 | 389.4 | 1,421.0 | 1,226.2 | 5,885.3 | 791.1 |
| gdp_pc_growth | min -0.204; median 0.0375; max 0.296 | 93% | 1,621 | 1,554 | -0.204 | 0.036 | 0.037 | 0.296 | 0.070 |
| va_agri_share | min 8.64; median 44.1; max 62.7 | 100% | 1,750 | 1,696 | 8.64 | 40.82 | 44.08 | 62.74 | 11.66 |
| va_manu_share | min 1.28; median 5.68; max 9.82 | 100% | 1,750 | 1,540 | 1.28 | 5.39 | 5.68 | 9.82 | 1.87 |
| va_serv_share | min 21.6; median 40.8; max 77.5 | 100% | 1,750 | 1,718 | 21.63 | 44.21 | 40.83 | 77.51 | 11.80 |
| capital_formation_pc_usd | min 2; median 72.6; max 234 | 100% | 1,750 | 1,649 | 2.00 | 77.65 | 72.60 | 233.7 | 51.44 |
| poverty_rate | min 9.09; median 18; max 32.2 | 100% | 1,750 | 1,038 | 9.09 | 18.58 | 18.02 | 32.19 | 4.29 |
| doctors_per_1000 | min 0.152; median 0.318; max 0.55 | 100% | 1,750 | 337 | 0.152 | 0.321 | 0.318 | 0.550 | 0.081 |
| water_access_pct | min 52.8; median 67.2; max 87.6 | 100% | 1,750 | 1,267 | 52.75 | 67.55 | 67.23 | 87.56 | 6.49 |
| sanitation_access_pct | min 42.7; median 58.7; max 79.2 | 100% | 1,750 | 1,271 | 42.66 | 58.87 | 58.72 | 79.25 | 7.34 |
| electricity_access_pct | min 74.3; median 87.5; max 99.9 | 100% | 1,750 | 1,224 | 74.31 | 87.81 | 87.46 | 99.92 | 5.96 |
| hdi | min 64.6; median 69; max 74.2 | 100% | 1,750 | 698 | 64.56 | 69.07 | 69.01 | 74.22 | 1.94 |
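The latitude and longitude centroids summarized above exist so that pairwise distances can feed the Conley spatial standard errors, which the dictionary describes as using haversine distances with a 100 km kernel. The sketch below shows that distance calculation in plain Python. The kernel shape is not stated on this page, so the uniform (inside/outside the cutoff) weight is an assumption, and the three centroids are hypothetical coordinates chosen within the reported ranges.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
CUTOFF_KM = 100.0          # cutoff stated in the dictionary


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def uniform_weight(distance_km, cutoff_km=CUTOFF_KM):
    """Assumed uniform kernel: 1 inside the cutoff, 0 outside."""
    return 1.0 if distance_km <= cutoff_km else 0.0


# Hypothetical centroids (lat, lon); not taken from the data files
a = (5.55, 95.32)
b = (5.20, 96.00)
c = (0.27, 101.00)

d_ab = haversine_km(*a, *b)
d_ac = haversine_km(*a, *c)
print(round(d_ab, 1), uniform_weight(d_ab))   # nearby pair: inside the cutoff
print(round(d_ac, 1), uniform_weight(d_ac))   # distant pair: outside the cutoff
```

Only pairs of units whose weight is non-zero are allowed to have correlated errors in the variance estimate; a tapered kernel would down-weight pairs as they approach the cutoff instead of treating them equally.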

aceh_tsunami_subdistrict_panel · kecamatan-year · 3,864 × 19 · 1999-2012 · 276 Aceh sub-districts (kecamatans); 68 flooded

Panel key: kecamatan_id × year. Used for the night-lights dose-response: continuous flood intensity on sub-district luminosity growth.

Variable dictionary

| Variable | Type | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|---|
| kecamatan_id | identifier | Sub-district (kecamatan) ID | Unique identifier for the sub-district; the sub-district panel key. | Assigned KEC_001 .. KEC_276. | string | Assigned (this study) | 276 kecamatans |
| kecamatan_name | identifier | Sub-district name | Readable name linking the kecamatan to its parent district. | Built as <ParentDistrict>_Kec_<nn>. | string | Assigned (this study) | |
| district_name | identifier | District name | Name of the district. Real names for Aceh's 23 districts; systematic placeholders elsewhere. | Real Aceh district names hand-coded from the paper's maps; other provinces use 'Province District k'. | string | Assigned (this study) | 125 districts / 276 kecamatans |
| province | identifier | Province | Indonesian province the unit belongs to (district panel: 10 Sumatra provinces; sub-district panel: always Aceh). | Hand-coded; 10 Sumatra provinces in the district panel, constant 'Aceh' in the sub-district panel. | string | Assigned (this study) | |
| flooded | dummy | Flooded / treated dummy (1=flooded) | Treatment indicator: 1 if the unit was flooded by the 2004 tsunami (the DiD 'D' variable). | Hand-coded from the inundation maps; district panel: 10 Aceh + 2 North Sumatra island districts; sub-district panel: 68 of 276 kecamatans. | 0/1 | Assigned (this study) | |
| share_pop_flooded | continuous | Share of population flooded | Share of the kecamatan's population in flooded area; the headline continuous dose. | latent dose * 0.62 + noise (flooded only); GRUMP-population analogue. | 0-1 | Simulation | flooded kecamatans only (0 elsewhere) |
| share_area_flooded | continuous | Share of area flooded | Share of the kecamatan's physical area flooded; tiny mean (~1.2%) gives a large coefficient. | latent dose * 0.00566 + noise (flooded only). | 0-1 | Simulation | flooded kecamatans only (0 elsewhere) |
| flood_intensity_quintile | identifier | Flood-intensity quintile (1-5; 0=non-flooded) | Quintile of the flooding-intensity distribution among flooded units; only the top quintile shows a significant effect. | qcut of the latent dose into 5 groups among flooded kecamatans; 0 for non-flooded. | 0-5 | Derived | |
| area_km2 | continuous | Area (km^2) | Approximate land area of the kecamatan; sets the number of night-light pixels. | Drawn lognormal (30-1500 km^2). | km^2 | Simulation | |
| n_pixels | continuous | Pixel count | Number of ~0.86 km^2 night-light grid cells in the kecamatan. | round(area_km2 / 0.86); 30x30 arc-second pixels ~0.86 km^2 at the equator. | count | Derived | |
| latitude | continuous | Latitude (decimal degrees, +N) | Unit-centroid latitude; time-invariant. Enables Conley spatial standard errors. | Real approximate centroids for Aceh's 23 districts; synthetic non-Aceh districts drawn within the province bounding box. Sub-districts: parent centroid + ~20 km jitter. | degrees | Assigned (this study) | |
| longitude | continuous | Longitude (decimal degrees, +E) | Unit-centroid longitude; time-invariant. Used with latitude for haversine distances. | See latitude. Used for the Conley spatial kernel (≤100 km). | degrees | Assigned (this study) | |
| year | year | Calendar year | Calendar year of the observation. Panel spans 1999-2012 (levels); growth rates defined 2000-2012. | Annual index. | year | Simulation | |
| post | dummy | Post-tsunami dummy (1=2005+) | 1 for years 2005 and later (simple pre/post split). | 1 if year >= 2005. | 0/1 | Derived | |
| period | identifier | DiD event-time period | Event-time period for the staggered DiD dummies; baseline 2000-02 is the omitted reference. | Mapped from year: '(base year)' 1999, baseline 2000-02, pre 2003-04, tsunami 2005, recovery 2006-08, postrec 2009-12. | category | Derived | |
| avg_luminosity | continuous | Average luminosity (DN 0-63) | Mean Digital Number (brightness) across the kecamatan's pixels; flooded 2004 mean ~5.79, non-flooded ~2.36. | nl_sum / n_pixels, top-coded at 63 (DMSP saturation). | DN (0-63) | Simulation | |
| nl_sum | continuous | Summed luminosity (DN-sum) | Sum of Digital Numbers over all pixels in the kecamatan (the unit-level activity measure). | exp(nl_log) - 0.001. | DN-sum | Derived | |
| nl_log | continuous | Log luminosity (log DN-sum) | log(DN-sum + 0.001); the transformed regression variable matching the paper's log night-lights. | Cumulated from the 2004 anchor using the nl_growth series. | log DN-sum | Simulation | |
| nl_growth | continuous | Night-lights growth rate (log difference) | Annual log-difference growth rate of night-lights (main sub-district dependent variable). | kecamatan FE + year FE + theta(period)*dose^2 (flooded) + N(0,0.005) noise. | proportion/yr | Simulation | NaN in 1999 |
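The night-lights variables are linked by the construction rules in the dictionary: the pixel count comes from area, nl_sum is recovered from nl_log by inverting a log with a 0.001 offset, and average luminosity is the per-pixel mean capped at the sensor's saturation value. The sketch below restates those rules as plain functions; the function names are illustrative, and it is a reading of the dictionary rather than the author's generating script.

```python
import math

PIXEL_AREA_KM2 = 0.86   # approximate area of a 30 arc-second pixel near the equator
DN_MAX = 63.0           # DMSP-OLS saturation value
OFFSET = 0.001          # offset added before taking logs


def n_pixels(area_km2: float) -> int:
    """Number of night-light grid cells implied by the land area."""
    return round(area_km2 / PIXEL_AREA_KM2)


def nl_log_from_sum(nl_sum: float) -> float:
    """Log luminosity from the summed Digital Numbers."""
    return math.log(nl_sum + OFFSET)


def nl_sum_from_log(nl_log: float) -> float:
    """Inverse transform, as given in the dictionary: exp(nl_log) - 0.001."""
    return math.exp(nl_log) - OFFSET


def avg_luminosity(nl_sum: float, pixels: int) -> float:
    """Mean Digital Number per pixel, top-coded at the saturation value."""
    return min(nl_sum / pixels, DN_MAX)


# Check against the extremes reported in the statistics for this panel
print(n_pixels(31.5), n_pixels(966.8))        # smallest and largest kecamatan areas
print(round(nl_log_from_sum(0.0001), 2))      # smallest reported nl_sum
```

Feeding in the reported minimum and maximum areas and the minimum nl_sum gives back the reported extremes for n_pixels (37 and 1,124) and nl_log (-6.81), which suggests the offset is added once to the summed value rather than pixel by pixel.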

Distribution & statistics

| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
| kecamatan_id | | 100% | 3,864 | 276 | | | | | |
| kecamatan_name | | 100% | 3,864 | 276 | | | | | |
| district_name | | 100% | 3,864 | 23 | | | | | |
| province | | 100% | 3,864 | 1 | | | | | |
| flooded | share coded 1 = 0.246 | 100% | 3,864 | 2 | 0 | 0.246 | 0 | 1.00 | 0.431 |
| share_pop_flooded | min 0; median 0; max 0.558 | 100% | 3,864 | 68 | 0 | 0.050 | 0 | 0.558 | 0.107 |
| share_area_flooded | min 0; median 0; max 0.00541 | 100% | 3,864 | 59 | 0 | 4.51e-04 | 0 | 0.005 | 9.96e-04 |
| flood_intensity_quintile | | 100% | 3,864 | 6 | | | | | |
| area_km2 | min 31.5; median 173; max 967 | 100% | 3,864 | 266 | 31.50 | 217.0 | 172.9 | 966.8 | 146.9 |
| n_pixels | min 37; median 201; max 1.12e+03 | 100% | 3,864 | 202 | 37.00 | 252.3 | 201.0 | 1,124.0 | 170.8 |
| latitude | min 2.06; median 4.61; max 5.94 | 100% | 3,864 | 274 | 2.06 | 4.38 | 4.61 | 5.94 | 0.967 |
| longitude | min 95.1; median 96.9; max 98.4 | 100% | 3,864 | 276 | 95.06 | 96.76 | 96.89 | 98.36 | 0.895 |
| year | | 100% | 3,864 | 14 | 1999 | 2005.5 | 2005 | 2012 | 4.03 |
| post | share coded 1 = 0.571 | 100% | 3,864 | 2 | 0 | 0.571 | 1.00 | 1.00 | 0.495 |
| period | | 100% | 3,864 | 6 | | | | | |
| avg_luminosity | min 0; median 0.844; max 56.3 | 100% | 3,864 | 2,364 | 0 | 3.39 | 0.844 | 56.32 | 6.26 |
| nl_sum | min 0.0001; median 142; max 1.46e+04 | 100% | 3,864 | 2,427 | 1.00e-04 | 775.4 | 141.8 | 14,553 | 1,674.2 |
| nl_log | min -6.81; median 4.95; max 9.59 | 100% | 3,864 | 3,725 | -6.81 | 1.49 | 4.95 | 9.59 | 6.13 |
| nl_growth | min -0.0706; median 0.032; max 0.157 | 89% | 3,444 | 3,056 | -0.071 | 0.032 | 0.032 | 0.157 | 0.035 |

Known limitations & caveats