Downloads
All data are free to download. Each dataset comes in two forms with the same rows and columns: Stata .dta (with embedded variable and value labels) and plain .csv.
Download all data (ZIP) · stata_codebook.do
| Dataset | Grain | Rows × columns | Stata | CSV |
|---|---|---|---|---|
| Prediction_Data | region-year | 5,258 × 30 | Prediction_Data.dta | Prediction_Data.csv |
| Table_2_data | region-year | 5,258 × 8 | Table_2_data.dta | Table_2_data.csv |
| Table_3_data | country-year | 3,675 × 9 | Table_3_data.dta | Table_3_data.csv |
| Table_4_data | country-year | 3,675 × 17 | Table_4_data.dta | Table_4_data.csv |
| Table_B4_data | region-year | 5,258 × 14 | Table_B4_data.dta | Table_B4_data.csv |
| Figure_5_data | country-year | 3,675 × 5 | Figure_5_data.dta | Figure_5_data.csv |
Run stata_codebook.do in Stata once to attach long-form per-variable notes to the .dta files.
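After downloading, a quick shape check can confirm that a CSV matches the row and column counts listed in the table above. This is a minimal sketch using only the Python standard library. The helper `csv_shape` and the dictionary `EXPECTED_SHAPES` are illustrative names, not part of the published files, and the check assumes each CSV has exactly one header row.

```python
import csv
import io

# Expected (rows, columns) per dataset, copied from the downloads table.
EXPECTED_SHAPES = {
    "Prediction_Data": (5258, 30),
    "Table_2_data": (5258, 8),
    "Table_3_data": (3675, 9),
    "Table_4_data": (3675, 17),
    "Table_B4_data": (5258, 14),
    "Figure_5_data": (3675, 5),
}


def csv_shape(handle):
    """Return (data rows, columns) for an open CSV with one header row."""
    reader = csv.reader(handle)
    header = next(reader)               # first line holds the column names
    n_rows = sum(1 for _ in reader)     # remaining lines are data rows
    return n_rows, len(header)


# Tiny in-memory stand-in (made-up values) so the sketch runs offline.
sample = io.StringIO("Country_ISO,year,Giniall\nCHE,2010,33.0\nCHE,2011,32.5\n")
print(csv_shape(sample))
```

For a real file, open it with `open("Table_3_data.csv", newline="")` and compare the result with `EXPECTED_SHAPES["Table_3_data"]`.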
Load directly in code
Every file loads straight from its raw GitHub URL, so no manual download is needed. The one exception is pyreadstat, which reads only local files. Swap the file name to load any of the six datasets.
Stata
* Stata 14+ : `use` reads an https URL directly
global BASE "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_kuznets_dmsp/data/"
use "${BASE}Table_3_data.dta", clear
describe // variable + value labels
notes // long-form documentation (after running stata_codebook.do)
Python
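# Colab/Jupyter only: the leading ! runs a shell command
!pip install -q pyreadstat
# Python : pandas reads a .dta URL directly (values; value labels become categoricals).
# Variable labels are not stored in the DataFrame; get them from a StataReader,
# e.g. pd.read_stata(url, iterator=True).variable_labels()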
import pandas as pd
BASE = "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_kuznets_dmsp/data/"
df = pd.read_stata(BASE + "Table_3_data.dta")
# load all six datasets at once
files = ["Prediction_Data", "Table_2_data", "Table_3_data",
"Table_4_data", "Table_B4_data", "Figure_5_data"]
data = {f: pd.read_stata(BASE + f + ".dta") for f in files}
# pyreadstat exposes the richest metadata but reads LOCAL files -> download first
import pyreadstat, urllib.request
urllib.request.urlretrieve(BASE + "Table_3_data.dta", "Table_3_data.dta")
df, meta = pyreadstat.read_dta("Table_3_data.dta")
Copy and paste this snippet into Google Colab: https://colab.research.google.com/notebooks/empty.ipynb
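Because only the file name changes between datasets, the URL construction can be wrapped in a small helper that rejects misspelled names before any request is made. This is a sketch under stated assumptions: `dataset_url` and `DATASETS` are hypothetical names introduced here, and the base URL is the one used in the snippet above.

```python
BASE = ("https://raw.githubusercontent.com/cmg777/starter-academic-v501/"
        "master/content/post/python_kuznets_dmsp/data/")

# The six dataset names listed in the downloads table.
DATASETS = ("Prediction_Data", "Table_2_data", "Table_3_data",
            "Table_4_data", "Table_B4_data", "Figure_5_data")


def dataset_url(name, fmt="dta"):
    """Build the raw URL for one dataset; fmt is 'dta' or 'csv'."""
    if name not in DATASETS:
        raise ValueError(f"unknown dataset: {name!r}")
    if fmt not in ("dta", "csv"):
        raise ValueError(f"unknown format: {fmt!r}")
    return f"{BASE}{name}.{fmt}"


print(dataset_url("Figure_5_data", "csv"))
```

The returned string can be passed to `pd.read_stata` or `pd.read_csv` in place of `BASE + "Table_3_data.dta"`.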
R
# R : haven::read_dta auto-downloads an https URL
library(haven)
BASE <- "https://raw.githubusercontent.com/cmg777/starter-academic-v501/master/content/post/python_kuznets_dmsp/data/"
df <- read_dta(paste0(BASE, "Table_3_data.dta")) # labels via attr(df$var, "label")
Overview & sources
Companion data for the post Regional Inequality from Outer Space, a Python replication of Lessmann & Seidel (2017): predict regional GDP per capita from DMSP-OLS nighttime lights, build five population-weighted inequality indices, and estimate the spatial (regional) Kuznets curve and its determinants across up to 180 countries, 1992–2012.
Country files: Country_ISO × year over 180 countries (1992–2012). The country-file inequality indices are built from the predicted regional incomes in the region files.
Data sources
| Source | Provides | Reference / URL |
|---|---|---|
| NOAA/NGDC DMSP-OLS stable lights | Nighttime lights (DMSP-OLS stable lights v4, DN 0-63) | NOAA National Geophysical Data Center. https://www.ngdc.noaa.gov/eog/dmsp.html |
| Gennaioli et al. (2014) | Observed regional GDP per capita (training target) | Gennaioli, La Porta, Lopez-de-Silanes & Shleifer (2014), J. Economic Growth 19(3). |
| World Bank WDI | National accounts, determinants (GDP, trade, FDI, rents, etc.) | World Bank, World Development Indicators. https://databank.worldbank.org/source/world-development-indicators |
| GADM (Global Administrative Areas) | Administrative boundaries, region names, areas, centroids | GADM database of Global Administrative Areas. https://gadm.org |
| GPW v3 (CIESIN) | Gridded population (region and country totals) | CIESIN, Gridded Population of the World v3. https://sedac.ciesin.columbia.edu |
| Polity IV (Center for Systemic Peace) | Democracy-autocracy score (Polity2) | Center for Systemic Peace, Polity IV project. https://www.systemicpeace.org/inscrdata.html |
| GREG (Weidmann et al. 2010) + NOAA/NGDC | Ethnic homelands for the ethnic-inequality light Gini | Weidmann, Rod & Cederman (2010), J. Peace Research 47(4). |
| Lessmann & Seidel (2017) | Original study replicated here; interpersonal Gini (Giniall) | Lessmann & Seidel (2017), 'Regional inequality, convergence, and its determinants - A view from outer space', European Economic Review 92: 110-132. |
Cite this data
Please cite both this dataset/replication and the original study.
APA
Mendez, C. (2026). Regional inequality from outer space: Predicting GDP from nighttime lights and building inequality indices in Python [Data set]. https://carlos-mendez.org/post/python_kuznets_dmsp/
Lessmann, C., & Seidel, A. (2017). Regional inequality, convergence, and its determinants – A view from outer space. European Economic Review, 92, 110–132.
BibTeX
@misc{mendez2026kuznetsdmsp,
author = {Mendez, Carlos},
title = {Regional Inequality from Outer Space: Predicting GDP from Nighttime Lights and Building Inequality Indices in Python},
year = {2026},
howpublished = {\url{https://carlos-mendez.org/post/python_kuznets_dmsp/}},
note = {Data set; replication of Lessmann and Seidel (2017)}
}
@article{lessmann2017regional,
author = {Lessmann, Christian and Seidel, Andr\'{e}},
title = {Regional inequality, convergence, and its determinants---A view from outer space},
journal = {European Economic Review},
volume = {92},
pages = {110--132},
year = {2017}
}
Variable explorer (53 variables)
| Variable | Type | Label | Definition | Units | In files | Source |
|---|---|---|---|---|---|---|
| Aid | continuous | Net official development assistance (2011 US$) | Net official development assistance received | US$ (2011) | Table_4_data | World Bank WDI |
| Arable_land | continuous | Arable land (share of land area) | Arable land as a share of land area (FAO definition) | share | Table_4_data | World Bank WDI |
| COVW_pred_GDP_pc | continuous | Pop-weighted coefficient of variation (pred income) | Population-weighted coefficient of variation of predicted regional income | >=0 | Table_3_data | This study (derived) |
| Country_ISO | identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | string | Prediction_Data, Table_2_data, Table_3_data, Table_4_data, Table_B4_data, Figure_5_data | GADM (Global Administrative Areas) |
| Country_NAME | identifier | Country name | Country name (English) | string | Prediction_Data, Table_3_data, Table_4_data, Figure_5_data | GADM (Global Administrative Areas) |
| FDI_share_of_GDP | continuous | FDI openness: net FDI inflows / GDP | Net foreign direct investment inflows as a share of GDP | ratio | Table_4_data | World Bank WDI |
| GDP_pc_Country | continuous | National GDP per capita (2005 PPP US$) | National GDP per capita | US$ (2005 PPP) | Table_3_data, Table_4_data | World Bank WDI |
| GDP_pc_Region | continuous | Observed regional GDP per capita (2005 PPP US$) | Observed regional GDP per capita (training target) | US$ (2005 PPP) | Prediction_Data, Table_2_data | Gennaioli et al. (2014) |
| GE_0W_pred_GDP_pc | continuous | Pop-weighted mean log deviation GE(alpha=0) | Population-weighted mean log deviation of predicted regional income | >=0 | Table_3_data | This study (derived) |
| GE_1W_pred_GDP_pc | continuous | Pop-weighted Theil index GE(alpha=1) | Population-weighted Theil index of predicted regional income | >=0 | Table_3_data | This study (derived) |
| GE_m1W_pred_GDP_pc | continuous | Pop-weighted generalized entropy GE(alpha=-1) | Population-weighted GE(-1) of predicted regional income | >=0 | Table_3_data | This study (derived) |
| GINIW_Eth_light | continuous | Ethnic inequality: pop-weighted light Gini | Population-weighted light-Gini computed across ethnic homelands | 0-1 | Table_4_data | GREG (Weidmann et al. 2010) + NOAA/NGDC |
| GINIW_pred_GDP_pc | continuous | Pop-weighted regional Gini (predicted income) | Population-weighted Gini of predicted regional income within a country-year | 0-1 | Table_3_data, Table_4_data, Figure_5_data | This study (derived) |
| Giniall | continuous | National interpersonal income Gini (0-100) | Household-survey interpersonal income Gini | 0-100 | Figure_5_data | Lessmann & Seidel (2017) |
| Latitude | continuous | Region centroid latitude (degrees) | Latitude of the region polygon centroid | degrees | Table_B4_data | GADM (Global Administrative Areas) |
| Light_Country | continuous | Country total nighttime lights (summed DN) | Sum of pixel digital numbers over the whole country | summed DN | Table_2_data | NOAA/NGDC DMSP-OLS stable lights |
| Light_Region | continuous | Regional total nighttime lights (summed DN) | Sum of pixel digital numbers over the region | summed DN | Table_2_data | NOAA/NGDC DMSP-OLS stable lights |
| Longitude | continuous | Region centroid longitude (degrees) | Longitude of the region polygon centroid | degrees | Table_B4_data | GADM (Global Administrative Areas) |
| Polity2 | continuous | Polity IV democracy-autocracy score (-1..+1) | Rescaled Polity IV combined democracy-autocracy score | -1..+1 | Table_4_data | Polity IV (Center for Systemic Peace) |
| Pop_Country | continuous | Country total population (persons) | Total population of the country | persons | Prediction_Data, Table_2_data, Table_4_data | GPW v3 (CIESIN) |
| Pop_Region | continuous | Regional total population (persons) | Total population of the region | persons | Prediction_Data, Table_2_data | GPW v3 (CIESIN) |
| Region_NAME | identifier | First-level administrative region name | Name of the first-level admin unit (state/province/canton) | string | Prediction_Data | GADM (Global Administrative Areas) |
| Resources_rents_share_of_GDP | continuous | Natural-resource rents (% of GDP) | Total natural-resource rents as a share of GDP | % GDP | Table_4_data | World Bank WDI |
| School_enrollment_secondary | continuous | Gross secondary-school enrolment (% gross) | Gross secondary-school enrolment ratio (>100% with over-age pupils) | % gross | Table_4_data | World Bank WDI |
| Trade_GDP_share | continuous | Trade openness (exports+imports)/GDP | Trade as a share of GDP | ratio | Table_4_data | World Bank WDI |
| area | continuous | Country land area (km^2) | Total land area excluding inland water | km^2 | Table_4_data | World Bank WDI |
| code_Coutry_Region | identifier | Numeric region key (orig. spelling 'Coutry' kept) | Numeric identifier for a region (unique within country) | integer | Prediction_Data, Table_B4_data | Authors' replication archive |
| eap | dummy | World Bank region dummy: East Asia & Pacific | 1 if the country is in East Asia & Pacific (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| eca | dummy | World Bank region dummy: Europe & Central Asia | 1 if the country is in Europe & Central Asia (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| fedelupd2 | dummy | Federal-state dummy (1=federal) | 1 if the country is federally organised | 0/1 | Table_4_data | Authors' replication archive |
| id_t_j | identifier | Country-year key (year+ISO, e.g. 2010CHE) | Concatenated year and ISO code | string | Prediction_Data | Authors' replication archive |
| lac | dummy | World Bank region dummy: Latin America & Caribbean | 1 if the country is in Latin America & Caribbean (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| log_GDP_pc_Country | continuous | Log national GDP per capita | Natural log of national GDP per capita | log US$ | Prediction_Data | World Bank WDI |
| log_GDP_pc_Region | continuous | Log observed regional GDP per capita | Natural log of GDP_pc_Region | log US$ | Prediction_Data, Table_B4_data | Gennaioli et al. (2014) |
| log_Light_ppix_Region | continuous | Log avg nighttime light per pixel (region) | Natural log of the region mean DMSP-OLS stable-lights digital number | log DN | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| log_N_pix_low_cod_1_ppix | continuous | Log count of low-coded pixels (DN=0) | Log number of dark (low-coded) pixels in the region | log count | Prediction_Data | NOAA/NGDC DMSP-OLS stable lights |
| log_N_pix_top_cod_1_ppix | continuous | Log count of top-coded pixels (DN=63) | Log number of saturated (top-coded) pixels in the region | log count | Prediction_Data | NOAA/NGDC DMSP-OLS stable lights |
| log_area | continuous | Log region area (km^2) | Natural log of the region polygon area | log km^2 | Prediction_Data | GADM (Global Administrative Areas) |
| log_region | continuous | Log number of regions in the country | Log count of first-level regions per country | log count | Prediction_Data | GADM (Global Administrative Areas) |
| log_region_X_log_area | continuous | Interaction: log_region x log_area | Product of log_region and log_area | - | Prediction_Data | This study (derived) |
| mena | dummy | World Bank region dummy: Middle East & North Africa | 1 if the country is in Middle East & North Africa (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| pred_GDP_pc_Region | continuous | Predicted regional GDP per capita (2005 PPP US$) | Model-predicted regional GDP per capita | US$ (2005 PPP) | Table_2_data | This study (derived) |
| price_gasoline | continuous | Gasoline pump price (2005 PPP US$/litre) | Pump price for gasoline | US$/litre | Table_4_data | World Bank WDI |
| sa | dummy | World Bank region dummy: South Asia | 1 if the country is in South Asia (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| satyear_1 | dummy | Satellite/sensor-era dummy 1 (of 7) | 1 for DMSP satellite/sensor configuration era 1 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_2 | dummy | Satellite/sensor-era dummy 2 (of 7) | 1 for DMSP satellite/sensor configuration era 2 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_3 | dummy | Satellite/sensor-era dummy 3 (of 7) | 1 for DMSP satellite/sensor configuration era 3 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_4 | dummy | Satellite/sensor-era dummy 4 (of 7) | 1 for DMSP satellite/sensor configuration era 4 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_5 | dummy | Satellite/sensor-era dummy 5 (of 7) | 1 for DMSP satellite/sensor configuration era 5 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_6 | dummy | Satellite/sensor-era dummy 6 (of 7) | 1 for DMSP satellite/sensor configuration era 6 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| satyear_7 | dummy | Satellite/sensor-era dummy 7 (of 7) | 1 for DMSP satellite/sensor configuration era 7 | 0/1 | Prediction_Data, Table_B4_data | NOAA/NGDC DMSP-OLS stable lights |
| ssa | dummy | World Bank region dummy: Sub-Saharan Africa | 1 if the country is in Sub-Saharan Africa (North America = reference) | 0/1 | Prediction_Data | World Bank WDI |
| year | year | Calendar year | Year of observation | year | Prediction_Data, Table_2_data, Table_3_data, Table_4_data, Table_B4_data, Figure_5_data | - |
Cross-file variable index
Which file each variable appears in (● = present).
| Variable | Prediction_Data | Table_2_data | Table_3_data | Table_4_data | Table_B4_data | Figure_5_data |
|---|---|---|---|---|---|---|
| Aid | | | | ● | | |
| Arable_land | | | | ● | | |
| COVW_pred_GDP_pc | | | ● | | | |
| Country_ISO | ● | ● | ● | ● | ● | ● |
| Country_NAME | ● | | ● | ● | | ● |
| FDI_share_of_GDP | | | | ● | | |
| GDP_pc_Country | | | ● | ● | | |
| GDP_pc_Region | ● | ● | | | | |
| GE_0W_pred_GDP_pc | | | ● | | | |
| GE_1W_pred_GDP_pc | | | ● | | | |
| GE_m1W_pred_GDP_pc | | | ● | | | |
| GINIW_Eth_light | | | | ● | | |
| GINIW_pred_GDP_pc | | | ● | ● | | ● |
| Giniall | | | | | | ● |
| Latitude | | | | | ● | |
| Light_Country | | ● | | | | |
| Light_Region | | ● | | | | |
| Longitude | | | | | ● | |
| Polity2 | | | | ● | | |
| Pop_Country | ● | ● | | ● | | |
| Pop_Region | ● | ● | | | | |
| Region_NAME | ● | | | | | |
| Resources_rents_share_of_GDP | | | | ● | | |
| School_enrollment_secondary | | | | ● | | |
| Trade_GDP_share | | | | ● | | |
| area | | | | ● | | |
| code_Coutry_Region | ● | | | | ● | |
| eap | ● | | | | | |
| eca | ● | | | | | |
| fedelupd2 | | | | ● | | |
| id_t_j | ● | | | | | |
| lac | ● | | | | | |
| log_GDP_pc_Country | ● | | | | | |
| log_GDP_pc_Region | ● | | | | ● | |
| log_Light_ppix_Region | ● | | | | ● | |
| log_N_pix_low_cod_1_ppix | ● | | | | | |
| log_N_pix_top_cod_1_ppix | ● | | | | | |
| log_area | ● | | | | | |
| log_region | ● | | | | | |
| log_region_X_log_area | ● | | | | | |
| mena | ● | | | | | |
| pred_GDP_pc_Region | | ● | | | | |
| price_gasoline | | | | ● | | |
| sa | ● | | | | | |
| satyear_1 | ● | | | | ● | |
| satyear_2 | ● | | | | ● | |
| satyear_3 | ● | | | | ● | |
| satyear_4 | ● | | | | ● | |
| satyear_5 | ● | | | | ● | |
| satyear_6 | ● | | | | ● | |
| satyear_7 | ● | | | | ● | |
| ssa | ● | | | | | |
| year | ● | ● | ● | ● | ● | ● |
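The index above also shows which columns two files share, which is what a merge between them would key on. This is a minimal sketch, assuming the column lists implied by the "In files" entries of the variable explorer. `FILE_COLUMNS` and `shared_columns` are illustrative names, not part of the published files.

```python
SAT = [f"satyear_{k}" for k in range(1, 8)]  # seven satellite-era dummies

# Column names per file, transcribed from the variable index.
FILE_COLUMNS = {
    "Prediction_Data": [
        "Region_NAME", "Country_NAME", "Country_ISO", "id_t_j",
        "code_Coutry_Region", "year", "Pop_Region", "Pop_Country",
        "GDP_pc_Region", "log_GDP_pc_Region", "log_Light_ppix_Region",
        "log_GDP_pc_Country", "log_N_pix_top_cod_1_ppix",
        "log_N_pix_low_cod_1_ppix", "log_area", "log_region",
        "log_region_X_log_area", "eap", "eca", "lac", "mena", "sa", "ssa",
    ] + SAT,
    "Table_2_data": [
        "Country_ISO", "year", "pred_GDP_pc_Region", "GDP_pc_Region",
        "Light_Region", "Light_Country", "Pop_Region", "Pop_Country",
    ],
    "Table_3_data": [
        "Country_NAME", "Country_ISO", "year", "GDP_pc_Country",
        "GINIW_pred_GDP_pc", "COVW_pred_GDP_pc", "GE_1W_pred_GDP_pc",
        "GE_0W_pred_GDP_pc", "GE_m1W_pred_GDP_pc",
    ],
    "Table_4_data": [
        "Country_NAME", "Country_ISO", "year", "GDP_pc_Country",
        "GINIW_pred_GDP_pc", "GINIW_Eth_light", "Aid", "Arable_land",
        "FDI_share_of_GDP", "Polity2", "Pop_Country",
        "Resources_rents_share_of_GDP", "School_enrollment_secondary",
        "Trade_GDP_share", "area", "fedelupd2", "price_gasoline",
    ],
    "Table_B4_data": [
        "Country_ISO", "year", "code_Coutry_Region", "Latitude",
        "Longitude", "log_GDP_pc_Region", "log_Light_ppix_Region",
    ] + SAT,
    "Figure_5_data": [
        "Country_NAME", "Country_ISO", "year", "GINIW_pred_GDP_pc", "Giniall",
    ],
}


def shared_columns(file_a, file_b):
    """Return the sorted column names that appear in both files."""
    return sorted(set(FILE_COLUMNS[file_a]) & set(FILE_COLUMNS[file_b]))


print(shared_columns("Table_3_data", "Figure_5_data"))
```

Shared identifier columns such as `Country_ISO` and `year` are the natural merge keys. Shared measurement columns, such as `GINIW_pred_GDP_pc`, would be duplicated by a merge and are better dropped from one side first.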
Construction & formulas
All five inequality indices are computed within each country-year, across that
country's regions, on predicted regional income y = pred_GDP_pc_Region,
weighted by the regional population share p_i = Pop_Region_i / Pop_Country. Let
ybar = sum_i p_i * y_i be the population-weighted mean.
- Population-weighted Gini (`GINIW_pred_GDP_pc`, 0–1): `GINIW = ( sum_i sum_j p_i p_j |y_i - y_j| ) / (2 * ybar)`.
- Coefficient of variation (`COVW_pred_GDP_pc`): `COVW = sqrt( sum_i p_i (y_i - ybar)^2 ) / ybar`.
- Generalized entropy `GE(alpha)` (`GE_1W`, `GE_0W`, `GE_m1W`): for alpha not in {0,1}, `GE(alpha) = 1/(alpha(alpha-1)) * sum_i p_i [ (y_i/ybar)^alpha - 1 ]`; `GE(1)` = Theil = `sum_i p_i (y_i/ybar) ln(y_i/ybar)`; `GE(0)` = mean log deviation = `sum_i p_i ln(ybar/y_i)`; `GE_m1W` uses alpha = -1.
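The three formulas above translate directly into code. This is a minimal pure-Python sketch for a single country-year, not the replication code itself. The function names are illustrative, and it assumes the shares `p` sum to 1 and every income `y_i` is strictly positive.

```python
import math


def weighted_mean(y, p):
    """ybar = sum_i p_i * y_i."""
    return sum(pi * yi for yi, pi in zip(y, p))


def gini_w(y, p):
    """Population-weighted Gini: sum_i sum_j p_i p_j |y_i - y_j| / (2 ybar)."""
    ybar = weighted_mean(y, p)
    total = sum(pi * pj * abs(yi - yj)
                for yi, pi in zip(y, p)
                for yj, pj in zip(y, p))
    return total / (2 * ybar)


def cov_w(y, p):
    """Population-weighted coefficient of variation."""
    ybar = weighted_mean(y, p)
    var = sum(pi * (yi - ybar) ** 2 for yi, pi in zip(y, p))
    return math.sqrt(var) / ybar


def ge_w(y, p, alpha):
    """Population-weighted generalized entropy GE(alpha)."""
    ybar = weighted_mean(y, p)
    if alpha == 1:   # Theil index
        return sum(pi * (yi / ybar) * math.log(yi / ybar) for yi, pi in zip(y, p))
    if alpha == 0:   # mean log deviation
        return sum(pi * math.log(ybar / yi) for yi, pi in zip(y, p))
    return sum(pi * ((yi / ybar) ** alpha - 1)
               for yi, pi in zip(y, p)) / (alpha * (alpha - 1))


# Toy country with two regions (made-up numbers): incomes and populations.
income = [1.0, 3.0]
pop = [50.0, 50.0]
shares = [n / sum(pop) for n in pop]   # p_i = Pop_Region_i / Pop_Country

for name, value in [("GINIW", gini_w(income, shares)),
                    ("COVW", cov_w(income, shares)),
                    ("GE(1)", ge_w(income, shares, 1)),
                    ("GE(0)", ge_w(income, shares, 0)),
                    ("GE(-1)", ge_w(income, shares, -1))]:
    print(f"{name}: {value:.4f}")
```

When all regions have the same income, each index is zero, which is a convenient sanity check when applying these functions to `pred_GDP_pc_Region` grouped by `Country_ISO` and `year`.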
Other constructed variables:
- Log nighttime light (`log_Light_ppix_Region`): `ln(mean DN)`, with the region mean DN set to `0.01` when it is 0 so the log is defined (DMSP DN range 0–63).
- Prediction model (eq. 1): a random-effects regression of `log_GDP_pc_Region` on `log_Light_ppix_Region`, the top-/low-coded pixel counts, `log_area`, `log_region`, `log_region_X_log_area`, the World Bank region dummies, the satellite-era dummies, and `log_GDP_pc_Country`; `pred_GDP_pc_Region` is the back-transformed fitted value.
- Transport cost (paper's proxy, not stored): `area * price_gasoline`.
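The simpler constructed variables above can be sketched in a few lines. This is an illustration under stated assumptions rather than the replication code: the function names are hypothetical, and the floor is applied only when the mean DN is exactly 0, as described in the list.

```python
import math

DN_FLOOR = 0.01   # value substituted for a region mean DN of exactly 0
DN_MAX = 63       # DMSP-OLS digital numbers range from 0 to 63


def log_light(mean_dn):
    """ln(mean DN), with a mean of 0 replaced by DN_FLOOR so the log exists."""
    if not 0 <= mean_dn <= DN_MAX:
        raise ValueError("mean DN must lie in [0, 63]")
    return math.log(DN_FLOOR if mean_dn == 0 else mean_dn)


def log_region_x_log_area(log_region, log_area):
    """Interaction term: product of log_region and log_area."""
    return log_region * log_area


def transport_cost(area, price_gasoline):
    """The paper's transport-cost proxy (not stored): area * price_gasoline."""
    return area * price_gasoline


print(round(log_light(0), 2))    # fully dark region
print(round(log_light(63), 2))   # fully saturated region
```

The two printed values bracket the possible range of `log_Light_ppix_Region`, from ln(0.01) for a fully dark region to ln(63) for a fully saturated one.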
The six datasets
Each dataset is documented with its full variable dictionary, followed by a table of summary statistics and data coverage.
Variable dictionary (Prediction_Data)
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
Region_NAME identifier | First-level administrative region name | Name of the first-level admin unit (state/province/canton) | GADM admin-1 name | string | GADM (Global Administrative Areas) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Country_NAME identifier | Country name | Country name (English) | From GADM country attributes | string | GADM (Global Administrative Areas) | country/region files |
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
id_t_j identifier | Country-year key (year+ISO, e.g. 2010CHE) | Concatenated year and ISO code | year concatenated with Country_ISO | string | Authors' replication archive | region frame (Prediction) |
code_Coutry_Region identifier | Numeric region key (orig. spelling 'Coutry' kept) | Numeric identifier for a region (unique within country) | Region identifier carried verbatim from the authors' archive | integer | Authors' replication archive | 1992-2010 · 1,504 reg (81 ctry) · region frame |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
Pop_Region continuous | Regional total population (persons) | Total population of the region | Population density x region area, rounded up (min 1); 5-yr waves interpolated to annual | persons | GPW v3 (CIESIN) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Pop_Country continuous | Country total population (persons) | Total population of the country | Sum of regional populations | persons | GPW v3 (CIESIN) | region & country frames |
GDP_pc_Region continuous | Observed regional GDP per capita (2005 PPP US$) | Observed regional GDP per capita (training target) | Regional accounts, constant 2005 PPP US$ | US$ (2005 PPP) | Gennaioli et al. (2014) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_GDP_pc_Region continuous | Log observed regional GDP per capita | Natural log of GDP_pc_Region | ln(GDP_pc_Region) | log US$ | Gennaioli et al. (2014) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_Light_ppix_Region continuous | Log avg nighttime light per pixel (region) | Natural log of the region mean DMSP-OLS stable-lights digital number | ln(mean DN); mean set to 0.01 when 0 so the log is defined; DN ranges 0-63 | log DN | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_GDP_pc_Country continuous | Log national GDP per capita | Natural log of national GDP per capita | ln(national GDP per capita) | log US$ | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_N_pix_top_cod_1_ppix continuous | Log count of top-coded pixels (DN=63) | Log number of saturated (top-coded) pixels in the region | ln(count of DN=63 pixels) per region; controls for sensor saturation | log count | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_N_pix_low_cod_1_ppix continuous | Log count of low-coded pixels (DN=0) | Log number of dark (low-coded) pixels in the region | ln(count of DN=0 pixels) per region; controls for sparse/rural area | log count | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_area continuous | Log region area (km^2) | Natural log of the region polygon area | ln(region area in km^2) | log km^2 | GADM (Global Administrative Areas) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_region continuous | Log number of regions in the country | Log count of first-level regions per country | ln(number of regions in the country) | log count | GADM (Global Administrative Areas) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_region_X_log_area continuous | Interaction: log_region x log_area | Product of log_region and log_area | log_region * log_area | - | This study (derived) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
eap dummy | World Bank region dummy: East Asia & Pacific | 1 if the country is in East Asia & Pacific (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
eca dummy | World Bank region dummy: Europe & Central Asia | 1 if the country is in Europe & Central Asia (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
lac dummy | World Bank region dummy: Latin America & Caribbean | 1 if the country is in Latin America & Caribbean (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
mena dummy | World Bank region dummy: Middle East & North Africa | 1 if the country is in Middle East & North Africa (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
sa dummy | World Bank region dummy: South Asia | 1 if the country is in South Asia (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
ssa dummy | World Bank region dummy: Sub-Saharan Africa | 1 if the country is in Sub-Saharan Africa (North America = reference) | World Bank regional grouping indicator | 0/1 | World Bank WDI | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_1 dummy | Satellite/sensor-era dummy 1 (of 7) | 1 for DMSP satellite/sensor configuration era 1 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_2 dummy | Satellite/sensor-era dummy 2 (of 7) | 1 for DMSP satellite/sensor configuration era 2 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_3 dummy | Satellite/sensor-era dummy 3 (of 7) | 1 for DMSP satellite/sensor configuration era 3 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_4 dummy | Satellite/sensor-era dummy 4 (of 7) | 1 for DMSP satellite/sensor configuration era 4 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_5 dummy | Satellite/sensor-era dummy 5 (of 7) | 1 for DMSP satellite/sensor configuration era 5 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_6 dummy | Satellite/sensor-era dummy 6 (of 7) | 1 for DMSP satellite/sensor configuration era 6 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_7 dummy | Satellite/sensor-era dummy 7 (of 7) | 1 for DMSP satellite/sensor configuration era 7 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Distribution & statistics
| Variable | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|
| Region_NAME | 100% | 5,258 | 1,483 | - | - | - | - | - |
| Country_NAME | 100% | 5,258 | 82 | - | - | - | - | - |
| Country_ISO | 100% | 5,258 | 81 | - | - | - | - | - |
| id_t_j | 100% | 5,258 | 277 | - | - | - | - | - |
| code_Coutry_Region | 100% | 5,258 | 1,504 | - | - | - | - | - |
| year | 100% | 5,258 | 19 | 1992 | 2002.2 | 2000 | 2010 | 5.51 |
| Pop_Region | 100% | 5,258 | 5,258 | 928.4 | 3,705,986 | 964,556 | 199,528,672 | 11,695,457 |
| Pop_Country | 100% | 5,258 | 277 | 1,193,269 | 103,683,791 | 38,461,096 | 1,328,343,680 | 228,114,600 |
| GDP_pc_Region | 100% | 5,258 | 5,207 | 226.3 | 14,371 | 8,770.2 | 150,768 | 13,450 |
| log_GDP_pc_Region | 100% | 5,258 | 5,161 | 5.42 | 9.10 | 9.08 | 11.92 | 1.06 |
| log_Light_ppix_Region | 100% | 5,258 | 5,184 | -4.61 | 0.957 | 1.25 | 4.14 | 1.77 |
| log_GDP_pc_Country | 100% | 5,258 | 277 | 6.07 | 9.28 | 9.26 | 11.45 | 0.939 |
| log_N_pix_top_cod_1_ppix | 100% | 5,258 | 3,820 | -20.75 | -10.53 | -12.37 | 4.20e-05 | 4.31 |
| log_N_pix_low_cod_1_ppix | 100% | 5,258 | 5,135 | -15.16 | -1.55 | -0.523 | -8.34e-05 | 2.83 |
| log_area | 100% | 5,258 | 81 | 9.91 | 13.17 | 13.01 | 16.61 | 1.74 |
| log_region | 100% | 5,258 | 34 | 1.39 | 3.19 | 3.18 | 4.34 | 0.698 |
| log_region_X_log_area | 100% | 5,258 | 83 | 17.15 | 42.57 | 42.05 | 72.16 | 12.98 |
| eap | 100% | 5,258 | 2 | 0 | 0.201 | 0 | 1.00 | 0.401 |
| eca | 100% | 5,258 | 2 | 0 | 0.468 | 0 | 1.00 | 0.499 |
| lac | 100% | 5,258 | 2 | 0 | 0.165 | 0 | 1.00 | 0.372 |
| mena | 100% | 5,258 | 2 | 0 | 0.041 | 0 | 1.00 | 0.198 |
| sa | 100% | 5,258 | 2 | 0 | 0.044 | 0 | 1.00 | 0.204 |
| ssa | 100% | 5,258 | 2 | 0 | 0.034 | 0 | 1.00 | 0.181 |
| satyear_1 | 100% | 5,258 | 2 | 0 | 0.016 | 0 | 1.00 | 0.125 |
| satyear_2 | 100% | 5,258 | 2 | 0 | 0.004 | 0 | 1.00 | 0.062 |
| satyear_3 | 100% | 5,258 | 2 | 0 | 0.230 | 0 | 1.00 | 0.421 |
| satyear_4 | 100% | 5,258 | 2 | 0 | 0.022 | 0 | 1.00 | 0.146 |
| satyear_5 | 100% | 5,258 | 2 | 0 | 0.251 | 0 | 1.00 | 0.434 |
| satyear_6 | 100% | 5,258 | 2 | 0 | 0.246 | 0 | 1.00 | 0.431 |
| satyear_7 | 100% | 5,258 | 2 | 0 | 0.032 | 0 | 1.00 | 0.176 |
Variable dictionary (Table_2_data)
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
pred_GDP_pc_Region continuous | Predicted regional GDP per capita (2005 PPP US$) | Model-predicted regional GDP per capita | Back-transformed fitted values of the eq.-1 random-effects model | US$ (2005 PPP) | This study (derived) | region frame (Table_2) |
GDP_pc_Region continuous | Observed regional GDP per capita (2005 PPP US$) | Observed regional GDP per capita (training target) | Regional accounts, constant 2005 PPP US$ | US$ (2005 PPP) | Gennaioli et al. (2014) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Light_Region continuous | Regional total nighttime lights (summed DN) | Sum of pixel digital numbers over the region | Sum of DMSP-OLS stable-lights DN over the region's pixels | summed DN | NOAA/NGDC DMSP-OLS stable lights | region frame (Table_2) |
Light_Country continuous | Country total nighttime lights (summed DN) | Sum of pixel digital numbers over the whole country | Sum of DMSP-OLS stable-lights DN over all country pixels | summed DN | NOAA/NGDC DMSP-OLS stable lights | region frame (Table_2) |
Pop_Region continuous | Regional total population (persons) | Total population of the region | Population density x region area, rounded up (min 1); 5-yr waves interpolated to annual | persons | GPW v3 (CIESIN) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Pop_Country continuous | Country total population (persons) | Total population of the country | Sum of regional populations | persons | GPW v3 (CIESIN) | region & country frames |
Distribution & statistics
| Variable | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|
| Country_ISO | 100% | 5,258 | 81 | - | - | - | - | - |
| year | 100% | 5,258 | 19 | 1992 | 2002.2 | 2000 | 2010 | 5.51 |
| pred_GDP_pc_Region | 100% | 5,258 | 5,254 | 360.1 | 13,422 | 8,324.9 | 70,638 | 11,689 |
| GDP_pc_Region | 100% | 5,258 | 5,207 | 226.3 | 14,371 | 8,770.2 | 150,768 | 13,450 |
| Light_Region | 100% | 5,258 | 5,199 | 44.00 | 213,456 | 58,401 | 7,904,552 | 465,243 |
| Light_Country | 100% | 5,258 | 277 | 12,106 | 7,477,449 | 1,733,508 | 83,312,528 | 14,801,072 |
| Pop_Region | 100% | 5,258 | 5,258 | 928.4 | 3,705,986 | 964,556 | 199,528,672 | 11,695,457 |
| Pop_Country | 100% | 5,258 | 277 | 1,193,269 | 103,683,791 | 38,461,096 | 1,328,343,680 | 228,114,600 |
Variable dictionary (Table_3_data)
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
Country_NAME identifier | Country name | Country name (English) | From GADM country attributes | string | GADM (Global Administrative Areas) | country/region files |
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
GDP_pc_Country continuous | National GDP per capita (2005 PPP US$) | National GDP per capita | World Bank WDI, constant 2005 PPP US$ | US$ (2005 PPP) | World Bank WDI | 1992-2012 · 180 ctry · country frame |
GINIW_pred_GDP_pc continuous | Pop-weighted regional Gini (predicted income) | Population-weighted Gini of predicted regional income within a country-year | Gini of pred_GDP_pc_Region across regions, weighted by Pop_Region, per country-year | 0-1 | This study (derived) | 1992-2012 · 180 ctry · country frame |
COVW_pred_GDP_pc continuous | Pop-weighted coefficient of variation (pred income) | Population-weighted coefficient of variation of predicted regional income | pop-weighted SD / pop-weighted mean of pred_GDP_pc_Region, per country-year | >=0 | This study (derived) | 1992-2012 · 180 ctry · country frame |
GE_1W_pred_GDP_pc continuous | Pop-weighted Theil index GE(alpha=1) | Population-weighted Theil index of predicted regional income | Generalized entropy GE(alpha=1) of pred_GDP_pc_Region, pop-weighted, per country-year | >=0 | This study (derived) | 1992-2012 · 180 ctry · country frame |
GE_0W_pred_GDP_pc continuous | Pop-weighted mean log deviation GE(alpha=0) | Population-weighted mean log deviation of predicted regional income | Generalized entropy GE(alpha=0) of pred_GDP_pc_Region, pop-weighted, per country-year | >=0 | This study (derived) | 1992-2012 · 180 ctry · country frame |
GE_m1W_pred_GDP_pc continuous | Pop-weighted generalized entropy GE(alpha=-1) | Population-weighted GE(-1) of predicted regional income | Generalized entropy GE(alpha=-1) of pred_GDP_pc_Region, pop-weighted, per country-year | >=0 | This study (derived) | 1992-2012 · 180 ctry · country frame |
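The Construction column describes the five population-weighted indices only in words. The sketch below shows one way they could be computed for a single country-year, assuming the standard textbook definitions with population shares as weights. The function name `weighted_inequality` and the sample numbers are hypothetical, and the authors' own implementation may differ in details such as normalisation.

```python
import math

def weighted_inequality(income, pop):
    """Population-weighted inequality indices for one country-year.

    income: regional income per capita (all values must be > 0)
    pop:    regional population, used as weights
    Textbook formulas; not necessarily the authors' exact code.
    """
    total = sum(pop)
    w = [p / total for p in pop]                     # population shares
    mu = sum(wi * yi for wi, yi in zip(w, income))   # weighted mean income
    rel = [yi / mu for yi in income]                 # income relative to the mean

    var = sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, income))
    # Gini as half the weighted mean absolute difference, relative to the mean
    gini = sum(
        wi * wj * abs(yi - yj)
        for wi, yi in zip(w, income)
        for wj, yj in zip(w, income)
    ) / (2 * mu)

    return {
        "gini": gini,                                                # 0-1
        "cov": math.sqrt(var) / mu,                                  # weighted SD / weighted mean
        "ge_1": sum(wi * r * math.log(r) for wi, r in zip(w, rel)),  # Theil, GE(1)
        "ge_0": sum(-wi * math.log(r) for wi, r in zip(w, rel)),     # mean log deviation, GE(0)
        "ge_m1": 0.5 * (sum(wi / r for wi, r in zip(w, rel)) - 1),   # GE(-1)
    }

# Hypothetical country-year with three regions (made-up numbers)
example = weighted_inequality(income=[8000.0, 10000.0, 12000.0], pop=[1e6, 2e6, 1e6])

# Two regions with identical income: every index should be (numerically) zero
equal = weighted_inequality(income=[5000.0, 5000.0], pop=[3e5, 7e5])
```

Applied per country-year to pred_GDP_pc_Region and Pop_Region (for example inside a pandas groupby on Country_ISO and year), this should give values on the same scales as the columns above, although exact agreement with the published numbers is not guaranteed.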
Distribution & statistics
| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
Country_NAME | – | 100% | 3,675 | 180 | — | — | — | — | — |
Country_ISO | – | 100% | 3,675 | 180 | — | — | — | — | — |
year | – | 100% | 3,675 | 21 | 1992 | 2002.1 | 2002 | 2012 | 6.03 |
GDP_pc_Country | | 100% | 3,675 | 3,675 | 126.4 | 12,572 | 6,864.4 | 119,068 | 15,364 |
GINIW_pred_GDP_pc | | 100% | 3,675 | 3,674 | 0.001 | 0.064 | 0.061 | 0.163 | 0.033 |
COVW_pred_GDP_pc | | 100% | 3,675 | 3,674 | 0.003 | 0.127 | 0.116 | 0.365 | 0.069 |
GE_1W_pred_GDP_pc | | 100% | 3,675 | 3,675 | 4.13e-06 | 0.010 | 0.007 | 0.058 | 0.010 |
GE_0W_pred_GDP_pc | | 100% | 3,675 | 3,675 | 4.12e-06 | 0.010 | 0.007 | 0.051 | 0.009 |
GE_m1W_pred_GDP_pc | | 100% | 3,675 | 3,675 | 4.12e-06 | 0.010 | 0.007 | 0.047 | 0.009 |
Variable dictionary
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
Country_NAME identifier | Country name | Country name (English) | From GADM country attributes | string | GADM (Global Administrative Areas) | country/region files |
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
GINIW_pred_GDP_pc continuous | Pop-weighted regional Gini (predicted income) | Population-weighted Gini of predicted regional income within a country-year | Gini of pred_GDP_pc_Region across regions, weighted by Pop_Region, per country-year | 0-1 | This study (derived) | 1992-2012 · 180 ctry · country frame |
GDP_pc_Country continuous | National GDP per capita (2005 PPP US$) | National GDP per capita | World Bank WDI, constant 2005 PPP US$ | US$ (2005 PPP) | World Bank WDI | 1992-2012 · 180 ctry · country frame |
Pop_Country continuous | Country total population (persons) | Total population of the country | Sum of regional populations | persons | GPW v3 (CIESIN) | region & country frames |
Resources_rents_share_of_GDP continuous | Natural-resource rents (% of GDP) | Total natural-resource rents as a share of GDP | Oil + gas + coal + mineral + forest rents, % of GDP | % GDP | World Bank WDI | 177 ctry · N=3,620 |
Arable_land continuous | Arable land (share of land area) | Arable land as a share of land area (FAO definition) | Arable land / total land area | share | World Bank WDI | 178 ctry · N=3,603 |
Trade_GDP_share continuous | Trade openness (exports+imports)/GDP | Trade as a share of GDP | (Exports + imports) / GDP | ratio | World Bank WDI | 176 ctry · N=3,509 |
FDI_share_of_GDP continuous | FDI openness: net FDI inflows / GDP | Net foreign direct investment inflows as a share of GDP | Net FDI inflows / GDP | ratio | World Bank WDI | 174 ctry · N=3,477 |
area continuous | Country land area (km^2) | Total land area excluding inland water | World Bank WDI land area | km^2 | World Bank WDI | 1992-2012 · 180 ctry · country frame |
price_gasoline continuous | Gasoline pump price (2005 PPP US$/litre) | Pump price for gasoline | Pump price, PPP constant 2005 US$/litre; paper's transport cost = area x price_gasoline | US$/litre | World Bank WDI | 162 ctry · N=1,366 |
Aid continuous | Net official development assistance (2011 US$) | Net official development assistance received | Net ODA received, constant 2011 US$ | US$ (2011) | World Bank WDI | 155 ctry · N=2,964 |
School_enrollment_secondary continuous | Gross secondary-school enrolment (% gross) | Gross secondary-school enrolment ratio (>100% with over-age pupils) | Secondary enrolment / age-eligible population | % gross | World Bank WDI | 172 ctry · N=2,566 |
GINIW_Eth_light continuous | Ethnic inequality: pop-weighted light Gini | Population-weighted light-Gini computed across ethnic homelands | Light Gini across ethnic homelands (method of Alesina et al. 2016) | 0-1 | GREG (Weidmann et al. 2010) + NOAA/NGDC | 173 ctry · N=3,528 |
Polity2 continuous | Polity IV democracy-autocracy score (-1..+1) | Rescaled Polity IV combined democracy-autocracy score | Polity IV combined score rescaled -1 (autocracy) to +1 (democracy) | -1..+1 | Polity IV (Center for Systemic Peace) | 157 ctry · N=3,158 |
fedelupd2 dummy | Federal-state dummy (1=federal) | 1 if the country is federally organised | Federalism indicator from the authors' archive | 0/1 | Authors' replication archive | 1992-2009 · 154 ctry · N=2,724 |
Distribution & statistics
| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
Country_NAME | – | 100% | 3,675 | 180 | — | — | — | — | — |
Country_ISO | – | 100% | 3,675 | 180 | — | — | — | — | — |
year | – | 100% | 3,675 | 21 | 1992 | 2002.1 | 2002 | 2012 | 6.03 |
GINIW_pred_GDP_pc | | 100% | 3,675 | 3,674 | 0.001 | 0.064 | 0.061 | 0.163 | 0.033 |
GDP_pc_Country | | 100% | 3,675 | 3,675 | 126.4 | 12,572 | 6,864.4 | 119,068 | 15,364 |
Pop_Country | | 100% | 3,675 | 3,675 | 2,690.0 | 34,542,604 | 7,563,883 | 1,353,431,168 | 126,525,234 |
Resources_rents_share_of_GDP | | 99% | 3,620 | 3,454 | 0 | 9.93 | 3.48 | 100.4 | 15.00 |
Arable_land | | 98% | 3,603 | 2,492 | 4.31e-04 | 0.148 | 0.105 | 0.661 | 0.138 |
Trade_GDP_share | | 95% | 3,509 | 3,509 | 0.003 | 0.841 | 0.761 | 5.32 | 0.469 |
FDI_share_of_GDP | | 95% | 3,477 | 3,476 | -0.829 | 0.044 | 0.025 | 4.31 | 0.105 |
area | | 100% | 3,675 | 180 | 50.00 | 728,582 | 155,360 | 16,380,084 | 1,922,363 |
price_gasoline | | 37% | 1,366 | 834 | 0.019 | 0.882 | 0.844 | 2.35 | 0.417 |
Aid | | 81% | 2,964 | 2,914 | -1,218,120,000 | 557,102,709 | 265,405,000 | 25,985,650,000 | 977,685,466 |
School_enrollment_secondary | | 70% | 2,566 | 2,566 | 5.16 | 74.19 | 82.79 | 160.6 | 31.42 |
GINIW_Eth_light | | 96% | 3,528 | 3,178 | 0 | 0.273 | 0.200 | 0.830 | 0.256 |
Polity2 | | 86% | 3,158 | 21 | -1.00 | 0.341 | 0.600 | 1.00 | 0.656 |
fedelupd2 | | 74% | 2,724 | 2 | 0 | 0.138 | 0 | 1.00 | 0.345 |
Variable dictionary
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
code_Coutry_Region identifier | Numeric region key (orig. spelling 'Coutry' kept) | Numeric identifier for a region (unique within country) | Region identifier carried verbatim from the authors' archive | integer | Authors' replication archive | 1992-2010 · 1,504 reg (81 ctry) · region frame |
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
Latitude continuous | Region centroid latitude (degrees) | Latitude of the region polygon centroid | GADM polygon centroid | degrees | GADM (Global Administrative Areas) | region frame (Table_B4) |
Longitude continuous | Region centroid longitude (degrees) | Longitude of the region polygon centroid | GADM polygon centroid | degrees | GADM (Global Administrative Areas) | region frame (Table_B4) |
log_GDP_pc_Region continuous | Log observed regional GDP per capita | Natural log of GDP_pc_Region | ln(GDP_pc_Region) | log US$ | Gennaioli et al. (2014) | 1992-2010 · 1,504 reg (81 ctry) · region frame |
log_Light_ppix_Region continuous | Log avg nighttime light per pixel (region) | Natural log of the region mean DMSP-OLS stable-lights digital number | ln(mean DN); mean set to 0.01 when 0 so the log is defined; DN ranges 0-63 | log DN | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_1 dummy | Satellite/sensor-era dummy 1 (of 7) | 1 for DMSP satellite/sensor configuration era 1 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_2 dummy | Satellite/sensor-era dummy 2 (of 7) | 1 for DMSP satellite/sensor configuration era 2 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_3 dummy | Satellite/sensor-era dummy 3 (of 7) | 1 for DMSP satellite/sensor configuration era 3 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_4 dummy | Satellite/sensor-era dummy 4 (of 7) | 1 for DMSP satellite/sensor configuration era 4 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_5 dummy | Satellite/sensor-era dummy 5 (of 7) | 1 for DMSP satellite/sensor configuration era 5 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_6 dummy | Satellite/sensor-era dummy 6 (of 7) | 1 for DMSP satellite/sensor configuration era 6 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
satyear_7 dummy | Satellite/sensor-era dummy 7 (of 7) | 1 for DMSP satellite/sensor configuration era 7 | Sensor-era indicator; DMSP sensors change and age over 1992-2010 | 0/1 | NOAA/NGDC DMSP-OLS stable lights | 1992-2010 · 1,504 reg (81 ctry) · region frame |
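The construction of log_Light_ppix_Region (log of the regional mean digital number, with a zero mean replaced by 0.01) can be sketched as follows. This is a minimal illustration of the rule stated in the Construction column, not the authors' code; the function and constant names are hypothetical.

```python
import math

FLOOR_DN = 0.01  # value substituted for a zero mean, per the Construction column
MAX_DN = 63      # DMSP-OLS digital numbers range from 0 to 63

def log_light_per_pixel(mean_dn):
    """Return ln(mean DN), replacing a zero mean with FLOOR_DN so the log is defined."""
    if not 0 <= mean_dn <= MAX_DN:
        raise ValueError("mean DN must lie in [0, 63]")
    return math.log(FLOOR_DN if mean_dn == 0 else mean_dn)

dark = log_light_per_pixel(0.0)        # completely unlit region
saturated = log_light_per_pixel(63.0)  # every pixel top-coded
```

The two boundary cases evaluate to ln(0.01) ≈ -4.61 and ln(63) ≈ 4.14, which is consistent with the minimum and maximum reported for this variable in the statistics table.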
Distribution & statistics
| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
code_Coutry_Region | – | 100% | 5,258 | 1,504 | — | — | — | — | — |
Country_ISO | – | 100% | 5,258 | 81 | — | — | — | — | — |
year | – | 100% | 5,258 | 19 | 1992 | 2002.2 | 2000 | 2010 | 5.51 |
Latitude | | 100% | 5,258 | 1,504 | -54.33 | 29.81 | 38.45 | 69.95 | 24.33 |
Longitude | | 100% | 5,258 | 1,504 | -156.4 | 22.45 | 23.12 | 163.0 | 67.31 |
log_GDP_pc_Region | | 100% | 5,258 | 5,161 | 5.42 | 9.10 | 9.08 | 11.92 | 1.06 |
log_Light_ppix_Region | | 100% | 5,258 | 5,184 | -4.61 | 0.957 | 1.25 | 4.14 | 1.77 |
satyear_1 | | 100% | 5,258 | 2 | 0 | 0.016 | 0 | 1.00 | 0.125 |
satyear_2 | | 100% | 5,258 | 2 | 0 | 0.004 | 0 | 1.00 | 0.062 |
satyear_3 | | 100% | 5,258 | 2 | 0 | 0.230 | 0 | 1.00 | 0.421 |
satyear_4 | | 100% | 5,258 | 2 | 0 | 0.022 | 0 | 1.00 | 0.146 |
satyear_5 | | 100% | 5,258 | 2 | 0 | 0.251 | 0 | 1.00 | 0.434 |
satyear_6 | | 100% | 5,258 | 2 | 0 | 0.246 | 0 | 1.00 | 0.431 |
satyear_7 | | 100% | 5,258 | 2 | 0 | 0.032 | 0 | 1.00 | 0.176 |
Variable dictionary
| Variable | Label | Definition | Construction | Units | Source | Coverage |
|---|---|---|---|---|---|---|
Country_ISO identifier | Country code (ISO 3166-1 alpha-3) | Three-letter country identifier | Assigned per country | string | GADM (Global Administrative Areas) | all files |
Country_NAME identifier | Country name | Country name (English) | From GADM country attributes | string | GADM (Global Administrative Areas) | country/region files |
year year | Calendar year | Year of observation | - | year | - | per file (see summary) |
GINIW_pred_GDP_pc continuous | Pop-weighted regional Gini (predicted income) | Population-weighted Gini of predicted regional income within a country-year | Gini of pred_GDP_pc_Region across regions, weighted by Pop_Region, per country-year | 0-1 | This study (derived) | 1992-2012 · 180 ctry · country frame |
Giniall continuous | National interpersonal income Gini (0-100) | Household-survey interpersonal income Gini | Reported household income Gini on a 0-100 scale (note: regional indices are 0-1) | 0-100 | Lessmann & Seidel (2017) | 1992-2012 · 153 ctry · N=1,330 (Figure_5) |
Distribution & statistics
| Variable | Distribution | Coverage | N | Distinct | Min | Mean | Median | Max | SD |
|---|---|---|---|---|---|---|---|---|---|
Country_ISO | – | 100% | 3,675 | 180 | — | — | — | — | — |
Country_NAME | – | 100% | 3,675 | 180 | — | — | — | — | — |
year | – | 100% | 3,675 | 21 | 1992 | 2002.1 | 2002 | 2012 | 6.03 |
GINIW_pred_GDP_pc | | 100% | 3,675 | 3,674 | 0.001 | 0.064 | 0.061 | 0.163 | 0.033 |
Giniall | | 36% | 1,330 | 384 | 17.50 | 39.55 | 37.80 | 74.30 | 10.11 |
Known limitations & caveats
- DMSP top-coding / saturation. Digital numbers cap at 63, so the brightest city cores are censored; log_N_pix_top_cod_1_ppix controls for this.
- Sensor drift. At least six DMSP sensors span 1992–2010 with differing calibration and aging; the satyear_1 … satyear_7 dummies absorb era effects.
- Sparse determinants. Several Table 4 determinants are observed for far fewer country-years than the core panel (e.g. price_gasoline N=1,366, School_enrollment_secondary N=2,566), so those regressions run on shifting subsamples and should be compared descriptively only. The licensed ICRG "bureaucratic quality" index is not included.
- Scale of Giniall. The interpersonal income Gini is on a 0–100 scale, whereas the regional indices are on a 0–1 scale; do not mix them without rescaling.
- code_Coutry_Region spelling. The misspelling "Coutry" is kept so the key matches the authors' original replication archive.
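As an illustration of the scale caveat, the sketch below divides Giniall by 100 before pairing it with GINIW_pred_GDP_pc and drops country-years where the survey Gini is missing. It is a minimal example on made-up rows; the helper `rescale_giniall`, the new key `Giniall_01`, and the placeholder country codes are hypothetical and not part of the datasets.

```python
def rescale_giniall(value):
    """Convert a 0-100 Gini to the 0-1 scale; missing values (None) pass through."""
    return None if value is None else value / 100.0

# Hypothetical country-years (made-up values); None marks a missing survey Gini
rows = [
    {"Country_ISO": "AAA", "year": 2000, "GINIW_pred_GDP_pc": 0.061, "Giniall": 37.8},
    {"Country_ISO": "AAA", "year": 2001, "GINIW_pred_GDP_pc": 0.064, "Giniall": None},
    {"Country_ISO": "BBB", "year": 2000, "GINIW_pred_GDP_pc": 0.033, "Giniall": 74.3},
]

for row in rows:
    row["Giniall_01"] = rescale_giniall(row["Giniall"])

# Keep only country-years where both indices are observed on the same 0-1 scale
comparable = [r for r in rows if r["Giniall_01"] is not None]
```

With a pandas DataFrame loaded from Figure_5_data, the equivalent step would be dividing the Giniall column by 100 and dropping rows where it is missing.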