Predicting subnational GDP in Vietnam with remote sensing data: A machine learning approach
Hussein Suleiman, Minh-Thu Thi Nguyen, Carlos Mendez
March 2025
Abstract
Official subnational Gross Domestic Product (GDP) data in Vietnam have been available only since 2010, hindering the analysis of long-term dynamics of local development. Based on remote sensing data and machine learning methods, we construct a subnational GDP indicator for the 63 Vietnamese provinces from 1992 to 2009. Specifically, we rely on nighttime lights (NTL), agricultural land, and climate datasets and employ six machine learning algorithms to construct the GDP dataset. We compare the accuracy of these algorithms and then compare the subnational GDP predicted by the best-performing algorithm using two nighttime lights datasets. The predictions are consistent across both datasets, so we construct the subnational GDP dataset using the NTL data with the longer temporal coverage. This new dataset allows researchers and policymakers to analyze long-term economic trends at the subnational level in Vietnam, filling a critical gap in historical economic data.
Publication
Letters in Spatial and Resource Sciences
AI Podcast Summary
Introduction & Context
- Challenge: Limited subnational GDP data in Vietnam before 2010
- Need: Long-term data for economic development analysis
- Solution: Predict GDP using remote sensing & machine learning
Data Sources Used
- Official GDP data (2010-2020)
- Nighttime Lights (NTL): Harmonized DMSP & VIIRS-like datasets
- Agricultural land data (ESA)
- Climate data: Temperature & precipitation (CRU)
Machine Learning Approach
- Six algorithms compared:
  - Artificial Neural Networks (ANN)
  - Random Forest (RF)
  - Support Vector Machines (SVM)
  - K-Nearest Neighbors (KNN)
  - Ridge Regression
  - eXtreme Gradient Boosting (XGBoost)
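A comparison like this is typically scored by prediction error on held-out observations. The following is a minimal sketch of that idea, not the authors' pipeline: it uses made-up numbers, a single predictor, and hand-written versions of only two of the six families (ridge regression and k-nearest neighbors). The helper names `fit_ridge_1d`, `predict_knn`, and `rmse` are hypothetical.

```python
import math

def fit_ridge_1d(xs, ys, lam=1.0):
    """Fit y = a + b*x with an L2 penalty on the slope b (intercept unpenalized)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / (sxx + lam)   # the penalty shrinks the slope toward zero
    a = my - b * mx
    return a, b

def predict_knn(train_x, train_y, x, k=3):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic data (illustrative only): x stands in for a light-intensity index,
# y for log GDP. None of these numbers come from the paper.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0, 7.9, 9.2, 9.8, 11.1]

# Hold out three observations for testing; train on the rest.
test_idx = {2, 5, 8}
train_x = [v for i, v in enumerate(x) if i not in test_idx]
train_y = [v for i, v in enumerate(y) if i not in test_idx]
test_x = [v for i, v in enumerate(x) if i in test_idx]
test_y = [v for i, v in enumerate(y) if i in test_idx]

a, b = fit_ridge_1d(train_x, train_y, lam=0.1)
scores = {
    "ridge": rmse(test_y, [a + b * v for v in test_x]),
    "knn": rmse(test_y, [predict_knn(train_x, train_y, v, k=2) for v in test_x]),
}
best = min(scores, key=scores.get)  # model with the lowest held-out RMSE
print(scores, best)
```

The ranking on this toy data says nothing about the paper's results; it only shows the mechanics of selecting a model by held-out error.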
Key Findings
- Predictions consistent across different nighttime datasets
- Ridge Regression chosen for final model
- Feature importance: temperature and agricultural land were more influential than NTL
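One simple way to quantify how consistent two sets of predictions are is to compute their correlation and their average absolute gap. The sketch below assumes this choice of metrics (the summary does not state which measure the paper uses) and runs on invented values; the variable names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Invented predicted log GDP for five provinces, one list per NTL dataset.
pred_ntl_a = [10.2, 11.5, 9.8, 12.1, 10.9]
pred_ntl_b = [10.3, 11.4, 9.9, 12.0, 11.0]

r = pearson(pred_ntl_a, pred_ntl_b)
mean_abs_gap = sum(abs(p - q) for p, q in zip(pred_ntl_a, pred_ntl_b)) / len(pred_ntl_a)
print(f"correlation = {r:.3f}, mean absolute gap = {mean_abs_gap:.3f}")
```

A correlation near 1 together with a small gap would suggest that the choice of NTL dataset matters little for the predictions.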
Application & Significance
- Created GDP data from 1992-2009
- Enables detailed long-term analysis of regional economic trends
- Assists policymakers and researchers in addressing regional inequality and growth
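The core idea behind the 1992-2009 series can be sketched as: fit a model on the years with official GDP (2010-2020), then apply it to earlier years where only the remote sensing predictors exist. The example below is a simplified illustration under stated assumptions: one province, one predictor, a one-variable ridge fit, and invented records. The helper `fit_ridge_1d` is hypothetical.

```python
def fit_ridge_1d(xs, ys, lam=1.0):
    """Fit y = a + b*x with an L2 penalty on the slope b (intercept unpenalized)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / (sxx + lam)
    return my - b * mx, b

# Invented records for one province: (year, NTL index, official log GDP or None).
records = [
    (1992, 1.0, None), (2000, 2.0, None), (2009, 3.0, None),
    (2010, 3.2, 7.4), (2013, 3.8, 8.6), (2016, 4.5, 10.1), (2020, 5.0, 11.0),
]

# Train only on years with official GDP.
train = [(x, y) for year, x, y in records if 2010 <= year <= 2020]
a, b = fit_ridge_1d([x for x, _ in train], [y for _, y in train], lam=0.1)

# Predict the years without official GDP.
backcast = {year: a + b * x for year, x, _ in records if 1992 <= year <= 2009}
print(backcast)
```

Because the earlier years may lie outside the range of predictor values seen in training, predictions of this kind are extrapolations and inherit any calibration differences in the remote sensing inputs.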
Limitations
- Remote sensing measurement/calibration discrepancies
- Dependence on official GDP benchmarks
- Interpretability challenges of machine learning methods
Future Research Directions
- Explore additional remote sensing datasets
- Estimate broader socioeconomic indicators
- Improve models with larger datasets
Conclusion
- Machine learning combined with remote sensing can effectively address subnational data gaps
- New dataset supports informed economic policy decisions
- Potentially replicable method for other developing countries
