downloading only the cloud mask #14

Open
daviddemeij opened this issue Dec 23, 2019 · 4 comments

@daviddemeij

Is it possible to download only the cloud masks from the MAJA-processed files? This would save a lot of bandwidth, since we plan to use them as ground truth for our own single-image cloud mask (so that we don't need to process image sequences to get a good cloud mask, and can handle areas where we don't have an SRTM).

@olivierhagolle
Owner

olivierhagolle commented Feb 13, 2020

Hi David,
Sorry for the late reply; posting a message two days before Christmas is not a good idea, as far as I am concerned ;)

I do not know how to do it, but I know it is possible to extract files from a zip archive over HTTPS with a few Python commands. The guys at Sinergise did this to download all our cloud masks and use them as training references for their own.

If you find a way, I am interested.
Olivier

@chauvenne

@daviddemeij Did you find a way to achieve this?

@daviddemeij
Author

No, I did not. In the end we just downloaded the whole file, unzipped it, and discarded the other bands.

@chauvenne

chauvenne commented Nov 4, 2020

I found a way to do this. I first tried GDAL's /vsizip/ and /vsicurl/ but could not get them to work.

I then tried the Python requests module. I make a first request with the following headers: headers = { "Authorization": "Bearer $YOUR_THEIA_TOKEN", "User-Agent": "python-requests" }
and with stream=True, so that the body is not downloaded right away.
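
A minimal sketch of that first request, for illustration only. The function names, the `url` argument and the optional `byte_range` parameter are assumptions of this sketch, and it also assumes the server reports the archive size in the Content-Length header:

```python
def theia_headers(token, byte_range=None):
    """Build the headers quoted above; optionally add an HTTP Range."""
    headers = {
        "Authorization": f"Bearer {token}",
        "User-Agent": "python-requests",
    }
    if byte_range is not None:
        start, end = byte_range  # both ends inclusive, as in HTTP Range
        headers["Range"] = f"bytes={start}-{end}"
    return headers


def remote_zip_size(url, token):
    """Return the archive size in bytes without reading the body."""
    import requests  # third-party; imported lazily so theia_headers has no dependency

    # stream=True defers the body download until the content is accessed.
    with requests.get(url, headers=theia_headers(token), stream=True) as resp:
        resp.raise_for_status()
        # Assumption: the server sends a Content-Length header.
        return int(resp.headers["Content-Length"])
```

The same `theia_headers` helper can be reused later with `byte_range=(start, end)` to request only part of the archive.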

From the response you can get the zip size and then fetch the central directory, which gives you the full structure of the archive while downloading only a few bytes. See here for how to do that: https://stackoverflow.com/a/54222447 (written for S3, but easily adapted to requests).
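
As a rough illustration of what reading the central directory involves, here is a sketch that parses it with the standard struct module. It is not the code from the linked answer; the function names are made up for this example, and it assumes a plain (non-ZIP64) archive whose tail bytes and central directory bytes have already been fetched with range requests:

```python
import struct

EOCD_SIG = b"PK\x05\x06"      # end-of-central-directory record
EOCD_FMT = "<4s4H2LH"         # 22 bytes, fixed part
CDIR_SIG = b"PK\x01\x02"      # central directory file header
CDIR_FMT = "<4s6H3L5H2L"      # 46 bytes, fixed part
CDIR_LEN = struct.calcsize(CDIR_FMT)


def locate_central_directory(tail):
    """From the last bytes of the archive, return (cd_offset, cd_size)."""
    pos = tail.rfind(EOCD_SIG)
    if pos < 0:
        raise ValueError("end-of-central-directory record not found in tail")
    fields = struct.unpack_from(EOCD_FMT, tail, pos)
    cd_size, cd_offset = fields[5], fields[6]
    return cd_offset, cd_size


def list_entries(cd_bytes):
    """Map each member name to its method, compressed size and header offset."""
    entries = {}
    pos = 0
    while cd_bytes[pos:pos + 4] == CDIR_SIG:
        f = struct.unpack_from(CDIR_FMT, cd_bytes, pos)
        flags, method, comp_size = f[3], f[4], f[8]
        name_len, extra_len, comment_len, header_offset = f[10], f[11], f[12], f[16]
        raw_name = cd_bytes[pos + CDIR_LEN:pos + CDIR_LEN + name_len]
        # Bit 11 of the flags marks UTF-8 names; otherwise the name is CP437.
        name = raw_name.decode("utf-8" if flags & 0x800 else "cp437")
        entries[name] = {
            "method": method,
            "compressed_size": comp_size,
            "header_offset": header_offset,
        }
        pos += CDIR_LEN + name_len + extra_len + comment_len
    return entries
```

In practice `tail` would be the last few kilobytes of the archive (the size from the first request tells you where they start), and `cd_bytes` the range `cd_offset` to `cd_offset + cd_size - 1`.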

Once you have the structure, you can find the byte range of the file you want, add this range to the headers, and repeat the request. Finally, save the returned bytes to a zip file and repair it with zip -FF.
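
As an alternative to repairing the fragment with zip -FF, the member can also be decompressed directly in Python. The sketch below is a different technique from the one described above, shown under assumptions: `fetch(start, end)` is a hypothetical callable returning the bytes of an inclusive range (for example a requests call with a Range header), the server honours range requests, and the member is either stored or deflate-compressed:

```python
import struct
import zlib

LOCAL_SIG = b"PK\x03\x04"     # local file header
LOCAL_FMT = "<4s5H3L2H"       # 30 bytes, fixed part
LOCAL_LEN = struct.calcsize(LOCAL_FMT)


def extract_member(fetch, header_offset, compressed_size, method):
    """Download and decompress one member using two small range reads."""
    # The local header's name/extra lengths can differ from the central
    # directory's, so read them here to find where the data really starts.
    header = fetch(header_offset, header_offset + LOCAL_LEN - 1)
    fields = struct.unpack(LOCAL_FMT, header)
    if fields[0] != LOCAL_SIG:
        raise ValueError("no local file header at the given offset")
    name_len, extra_len = fields[9], fields[10]
    data_start = header_offset + LOCAL_LEN + name_len + extra_len

    if compressed_size == 0:
        return b""
    raw = fetch(data_start, data_start + compressed_size - 1)
    if method == 0:               # stored, no compression
        return raw
    if method == 8:               # deflate: raw stream, no zlib header
        return zlib.decompress(raw, -15)
    raise ValueError(f"unsupported compression method {method}")
```

The offset, size and method arguments come from the central directory entry of the wanted file; the returned bytes can be written straight to disk under the member's name.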
