
download satellite imagery #5

Closed
luckystarufo opened this issue Jul 1, 2020 · 2 comments
Comments

@luckystarufo

Thanks for your nice work and the updates!

Two questions:

  1. The original repo (from Neal) seems to download the satellite imagery using the Google Maps Static API with the 'zoom' parameter set to 16. Is there a corresponding setting in the Planet API? (i.e. how do we ensure we are downloading the same image sets?)

  2. The reproduced R^2 is significantly lower than the values reported in the original work. This happens even if I change your code to the way they compute R^2 (i.e. metrics.r2_score --> scipy.stats.pearsonr()[0]**2). I realize you are using datasets from different years, but do you have any other ideas about why that's happening?

Thanks!

[A side note: from https://developers.google.com/maps/billing/gmp-billing, it looks like the Google Maps Static API is NOT free now, even for the first 100K images?]

@jmathur25
Owner

jmathur25 commented Jul 1, 2020

Hey again. To address your questions:

  1. We are actually downloading different images than the original script. This is the original image download script: https://github.com/nealjean/predicting-poverty/blob/master/scripts/get_image_download_locations.py#L13. The method I use generates the same bounding box but is more generalized and consistent in its image choice within that box. An older version of this repo used the original function, but I figured the new way was better. As for correspondence with the Planet API, I manually experimented with zoom levels and found that anything above zoom=14 was too low quality on Planet's imagery; zoom=16 with Google's images gives higher quality and resolution. But in my tests, switching to Planet did not make a huge difference.

  2. The original paper reports the R^2 values on the log expenditures. See the table in the README for the replication comparison. Beyond that, there are some small differences between the paper and this reproduction, summarized below:

  • metrics.r2_score --> stats.pearsonr()[0]**2 in the evaluate_fold and find_best_alpha functions inside utils/ridge_training.py. If I make that change, the Malawi R^2 goes from 0.26 to 0.29 and the Nigeria R^2 from 0.19 to 0.22 (these are the two countries shared with the replication; note this is on directly predicting the expenditures rather than the log). Not a huge change, but more than one might expect.

  • We use different years than the original paper. The data distribution does seem to differ across years for these two countries (judging by the graph in papers/jean_et_al.pdf vs my own), so that could be anywhere from a small to a huge factor.

  • Some details of the training procedure and nightlights filtering procedure are not available in code. They are somewhat described in papers/aaai16.pdf, but the results discussed in that more detailed paper are different, and I couldn't find further explanation.

  • The model I initialize is slightly different (they use a fully convolutional one).

  • Differences in preprocessing. The LSMS survey documentation for Malawi describes both "rexpagg" and "rexpaggpc" as per capita consumption, but as the names indicate, "rexpaggpc" is actually per capita while "rexpagg" is per household. To compute the consumption per capita in a cluster, you need to sum (not average) the per-household consumptions, then divide by the total number of people surveyed in the cluster. Jean et al. do this differently, averaging over households instead of summing. Also, they use an adult-equivalent adjustment whereas I do per capita.
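To make the preprocessing point concrete, here is a minimal sketch (with made-up numbers, not LSMS data) of how summing household consumption and dividing by total people differs from averaging per-household totals:

```python
# Hypothetical cluster of two surveyed households. "rexpagg" is the
# per-HOUSEHOLD aggregate consumption (despite the documentation's wording);
# "members" is the household size.
households = [
    {"rexpagg": 1000.0, "members": 2},
    {"rexpagg": 3000.0, "members": 6},
]

# Sum household consumption, then divide by total people surveyed
# in the cluster -- this yields true consumption per capita.
per_capita = sum(h["rexpagg"] for h in households) / sum(
    h["members"] for h in households
)

# Averaging the per-household totals instead (the variant attributed to
# Jean et al. above) gives a different, larger number here.
mean_of_households = sum(h["rexpagg"] for h in households) / len(households)

print(per_capita)          # 4000 / 8 people = 500.0
print(mean_of_households)  # 4000 / 2 households = 2000.0
```

The two quantities only coincide when every household has the same size, so clusters with varied household sizes will diverge between the two definitions.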

Given all of this, once I got the results I did here, I figured it was "close enough". As for your side note on billing: dang, that really sucks. That is definitely a recent change. Being able to do this almost instantly at very little cost was huge. :(

@luckystarufo
Author

Thanks so much for the detailed explanations, now I kind of see how it goes ... these are very insightful notes.
