Improve date handling for data pipeline #76
If no match is found for a given year, the other years are tried until a match is found or all years have been tested.
This PR also increases the tile size to 512x512 pixels, ref #78.
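For readers skimming the thread, the year fallback described above could be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/datacube.py`: the function name `find_first_matching_year` and the `search` callback are hypothetical, and seeding the shuffle with the tile index is taken from the code comment discussed in the review.

```python
import random


def find_first_matching_year(years, index, search):
    """Try years in a seeded random order until `search` returns a match.

    `search` is a hypothetical callback that returns a result for a year,
    or None when nothing matches for that year.
    """
    # Copy so the caller's list is not mutated by the shuffle.
    candidates = list(years)
    # Seed with the tile index so the order is reproducible per tile.
    random.Random(index).shuffle(candidates)
    for year in candidates:
        result = search(year)
        if result is not None:
            return year, result
    # Every year was tested without finding a match.
    return None, None
```

With a callback that only succeeds for one year, the function returns that year regardless of the shuffled order, and it returns `(None, None)` once all years are exhausted.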
Just one typo, otherwise should be good.
scripts/datacube.py
pixels = [part.compute() for part in pixels]
print(f"Starting algorithm for MGRS tile {tile['name']} with index {index}")

# Shuffle years, use index as seed for reproducability but no
- # Shuffle years, use index as seed for reproducability but no
+ # Shuffle years, use index as seed for reproducibility but no
This way, the tile IDs in the file names should be consistent across dates.
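To illustrate why the position of the counter matters, here is a small sketch under stated assumptions. The names `tiles_with_stable_ids` and `passes_cloud_filter` are hypothetical and do not come from the pipeline; the point is only that a counter advanced before the cloud filter gives each tile the same ID on every date, whichever tiles get dropped.

```python
def tiles_with_stable_ids(tiles, passes_cloud_filter):
    """Pair each kept tile with an ID that does not depend on the filter."""
    kept = []
    for tile_id, tile in enumerate(tiles):
        # The counter advances for every tile, including the ones skipped
        # below, so a tile keeps the same ID across dates.
        if not passes_cloud_filter(tile):
            continue
        kept.append((tile_id, tile))
    return kept
```

If the counter were instead incremented only after the filter, a tile that is cloudy on one date would shift the IDs of all tiles that follow it on that date.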
Got this to work on batch, and it gave good results. Kicking off a new batch run as we speak; if all goes well we'll have 10x the data in a few hours 🤞🏽
Good timing! I'm hoping to kick off a new training run with Soumya's code later, and can test things out on the new data batch.
* Improve date handling for data pipeline
  If no match is found for a year, others are being tried until a match is found or all years have been tested
* Increase tile size to 512x512 pixels. Closes #78
* Increase dates per location to 3
  Closes #79
* Prevent printing s3 sync upload progress logs
* Move counter above cloud filter to ensure index consistency
  Like this the tile IDs in the file names should be consistent across dates.
* Fix typo in comment
* Update batch run setup to new bucket name
Closes #68