New directions (Ideas)
New datasets to consider
Using Internet speed as a proxy for a region’s development
- Use Internet speeds for a location to estimate the development in the area (Ookla provides open source datasets for the same)
Using Openstreetmap data and its attributes
- For a given location / area using Openstreet Map data it is possible to derive the following information, which can be used to estimate wealth
- No. of buildings
- No. of roads
- No. of primary roads
- No. of trunk roads
- count of market places
- Number of charging stations
- No. of post offices
- Supermarket counts
- Car repair counts
- Department stores
- Computers
- Playgrounds
- Monuments
New Modelling approaches to consider
GAM Models
- Using explainable models - How much a variable is influencing the predictions
- Explainability along with good prediction accuracy. Example - Explainable boosting machines (EBMs)
- We can build editable models using EBMs
Using Semi-supervised learning
- Use libraries like snorkel to generate labels
- Low confidence images can be routed to humans for labeling
Using Bayesian Updates
- There is correlation between wealth and spatial location
- A Gaussian process can be implemented on top of the model. Use the prior observations to get the posterior distribution from the priors (tried at stanford in 2016)
Time series Analysis
- Using time series analysis to forecast the wealth from past surveys