Callback function to log Masked Autoencoder reconstructions to WandB #88

Merged: 16 commits, Dec 15, 2023

@weiji14 weiji14 commented Dec 13, 2023

This adds a way to visually inspect how well the Masked Autoencoder reconstructs the original image over the course of a training run.

Implemented as a Lightning Callback which runs at the end of each mini-batch of the validation loop (on_validation_batch_end). A sample of 6 image pairs (original + reconstructed, so 12 images in total) is uploaded to Weights and Biases.
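
As a rough illustration of the hook described above, the sketch below mimics the shape of a Lightning callback without importing Lightning. The class name `LogReconstructionSketch`, the `log_fn` parameter, the choice to log only on the first mini-batch, and the use of plain sequences for `batch` and `outputs` are all assumptions made for illustration; this is not the PR's `LogMAEReconstruction` code, which logs `wandb.Image` objects instead.

```python
class LogReconstructionSketch:
    """Duck-typed stand-in for a Lightning callback (hypothetical, not the PR's code)."""

    def __init__(self, num_samples=6, log_fn=print):
        self.num_samples = num_samples
        self.log_fn = log_fn  # hypothetical sink; the PR uploads to WandB instead

    def on_validation_batch_end(
        self, trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0
    ):
        # Assumption: log only on the first mini-batch of the validation loop.
        if batch_idx != 0:
            return []
        # Never ask for more samples than the batch actually holds.
        n = min(self.num_samples, len(batch))
        pairs = [(batch[i], outputs[i]) for i in range(n)]
        for i, (original, reconstructed) in enumerate(pairs):
            self.log_fn(f"sample {i}: original={original!r} reconstructed={reconstructed!r}")
        # Returned only so that the sketch is easy to check.
        return pairs
```

The method signature follows Lightning's `on_validation_batch_end` hook, so a real callback could keep the same body shape while subclassing Lightning's `Callback` and swapping `log_fn` for the logger.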

Example usage with LightningCLI:

python trainer.py fit --trainer.max_epochs=2 \
                      --trainer.precision=bf16-mixed \
                      --data.data_path=data/32VLM \
                      --data.num_workers=4 \
                      --trainer.logger=WandbLogger \
                      --trainer.logger.project=clay \
                      --trainer.logger.save_dir=checkpoints \
                      --trainer.callbacks+=LogMAEReconstruction

[Image: samples of histogram-equalized RGB images and their reconstructed outputs (only random noise, as this is early in training).]

TODO:

  • Add wandb dependency
  • Initial implementation to upload RGB images
  • Do proper histogram equalization of the images
  • Add a unit test

TODO in the future:

  • Upload SAR and DEM images

A CLI and library for interacting with the Weights and Biases API!
Created a custom callback function to log visualizations of the input and output images to the Masked Autoencoder. Only showing the RGB bands of Sentinel-2 for now. A sample of 6 image pairs (original + reconstructed, so 12 in total) is uploaded to Weights and Biases.

Example LightningCLI command: `python trainer.py fit --trainer.max_epochs=20 --data.data_path=data/32VLM --trainer.logger=WandbLogger --trainer.logger.project=clay --trainer.logger.save_dir=checkpoints --trainer.callbacks+=LogMAEReconstructedImage`.
@weiji14 weiji14 self-assigned this Dec 13, 2023
Image processing in Python!
Enhance low contrast images by applying a histogram equalization stretching algorithm on the RGB images, instead of dividing by a magic number like 6000.
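
To show why equalization helps compared with dividing by a fixed number, here is a minimal NumPy sketch. The PR adds scikit-image for this step, but the exact call it makes is not shown here, so the function `equalize_hist_sketch` and the synthetic `band` array are assumptions for illustration (scikit-image's `skimage.exposure.equalize_hist` implements the same general idea).

```python
import numpy as np


def equalize_hist_sketch(band: np.ndarray, nbins: int = 256) -> np.ndarray:
    """Map pixel values through their empirical CDF so the output lies in [0, 1]."""
    flat = band.ravel().astype(np.float64)
    hist, bin_edges = np.histogram(flat, bins=nbins)
    cdf = hist.cumsum() / hist.sum()
    bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
    # Look up each pixel's position on the CDF curve
    equalized = np.interp(flat, bin_centers, cdf)
    return equalized.reshape(band.shape)


rng = np.random.default_rng(seed=0)
# Hypothetical low-contrast band: values clustered in a narrow range
band = rng.normal(loc=1200, scale=150, size=(64, 64)).clip(0, None)

naive = band / 6000  # fixed "magic number" scaling
equalized = equalize_hist_sketch(band)
```

With fixed scaling the values stay bunched in a small part of the [0, 1] range, while the equalized band spreads across nearly all of it.
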
More samples to look at! Also only running einsum conversion on as many samples as needed rather than the whole batch, and handling cases where num_samples may be more than the batch_size.
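
A small sketch of slicing before rearranging, assuming the batch is laid out as (batch, channel, height, width) and that the einsum conversion moves channels last for plotting. Both the layout and the helper name `to_channels_last` are assumptions; the commit message only says that einsum runs on as many samples as needed.

```python
import numpy as np


def to_channels_last(batch: np.ndarray, num_samples: int) -> np.ndarray:
    # Clamp to the batch size so later loops index only samples that exist
    n = min(num_samples, batch.shape[0])
    # Slice first, then rearrange only those n samples instead of the whole batch
    return np.einsum("bchw->bhwc", batch[:n])


batch = np.zeros((4, 3, 16, 16))
images = to_channels_last(batch, num_samples=8)
print(images.shape)  # (4, 16, 16, 3)
```
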
Allows for `from src.callback_wandb import LogMAEReconstruction` to run, even without wandb being installed. Helpful if someone doesn't want to install wandb for whatever reason.
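
One common way to make a dependency optional is to guard the import and fail only at the point of use. The sketch below shows that pattern; the helper `require_wandb` is hypothetical and may differ from how `src/callback_wandb.py` actually handles it.

```python
try:
    import wandb
except ImportError:
    wandb = None  # importing this module still works without wandb installed


def require_wandb():
    """Return the wandb module, or raise a clear error at the point of use."""
    if wandb is None:
        raise ModuleNotFoundError(
            "wandb is needed to log images; install it with `pip install wandb`"
        )
    return wandb
```
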
Testing that the LogMAEReconstruction callback works to save a set of images to WandB. Testing this in offline mode only, with checks that artifacts are saved locally, and that the wandb images have the correct caption and format.
Order of the folders could change, so using set instead of list.
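
A short standard-library illustration of the point: `os.listdir` returns entries in arbitrary order, so comparing as sets avoids a test that depends on that order. The folder names mirror the ones checked in the unit test; the temporary directory here is a stand-in, not wandb's output.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    for name in ["tmp", "files", "logs"]:
        os.mkdir(os.path.join(tmpdirname, name))

    found = os.listdir(tmpdirname)  # order is not guaranteed
    expected = ["logs", "files", "tmp"]

    # A list comparison depends on listing order; a set comparison does not
    same_as_sets = set(found) == set(expected)
```
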
@weiji14 weiji14 marked this pull request as ready for review December 14, 2023 02:18
Unsure why the unit test passes on GitHub Actions, but causes an `Error: Process completed with exit code 255`. Maybe a smaller batch size would help?
Turn off stdout / stderr logging by setting WANDB_CONSOLE=off to see if it helps with the failing GitHub Actions.
Another attempt to see if it helps prevent exit code 255 on GitHub Actions.
Running out of ideas on why pytest has exit code 255 on GitHub Actions...
Trying to figure out what's going on.
Setting WANDB_MODE="disabled", so no files are logged to disk, though the wandb.Image(s) are still created. See if this helps to resolve the exit code 255 issue on GitHub Actions.
Minor changes to the docstring of the on_validation_batch_end method, and a typo fix.
@weiji14 weiji14 left a comment

Sunk wayyy too much time trying to debug why the unit test was failing on GitHub Actions but not locally (see below), so didn't get to do the SAR and DEM plots. Will implement those in a follow-up PR instead.

Comment on lines +49 to +58
# Check that wandb saved some log files to the temporary directory
# assert os.path.exists(path := f"{tmpdirname}/wandb/latest-run/")
# assert set(os.listdir(path=path)) == set(
#     [
#         f"run-{trainer.logger.version}.wandb",
#         "tmp",
#         "files",
#         "logs",
#     ]
# )

Setting `WANDB_MODE="disabled"` in this unit test and commenting out the lines that check for saved log files, because GitHub Actions keeps failing with `Error: Process completed with exit code 255` even though all the unit tests pass under pytest 😕 The unit test does work locally when I uncomment this block and use `WANDB_MODE="offline"`, so not sure what's going on.
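
For reference, one way to switch the mode for a single test without leaking it into other tests is to patch the environment temporarily. This is a sketch using only the standard library; the helper `run_with_wandb_mode` is hypothetical, and the actual unit test may set the variable differently.

```python
import os
from unittest import mock


def run_with_wandb_mode(mode, fn):
    """Call fn() with WANDB_MODE temporarily set, restoring the environment afterwards."""
    with mock.patch.dict(os.environ, {"WANDB_MODE": mode}):
        return fn()


# "disabled" as used on CI above; "offline" would also write run files locally
seen = run_with_wandb_mode("disabled", lambda: os.environ["WANDB_MODE"])
```

Flipping the argument between `"disabled"` and `"offline"` would make it easy to re-enable the commented-out file checks locally while keeping CI on the disabled mode.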

weiji14 commented Dec 15, 2023

Gonna merge this in first and combine with the other wandb callback being developed at #47.

@weiji14 weiji14 merged commit e259165 into main Dec 15, 2023
2 checks passed
@weiji14 weiji14 deleted the callbacks/wandb branch December 15, 2023 01:51
brunosan pushed a commit that referenced this pull request Dec 27, 2023
…88)

* ➕ Add wandb

A CLI and library for interacting with the Weights and Biases API!

* 🔊 Log Masked Autoencoder reconstructions to WandB

Created a custom callback function to log visualizations of the input and output images to the Masked Autoencoder. Only showing the RGB bands of Sentinel-2 for now. A sample of 6 image pairs (original + reconstructed, so 12 in total) is uploaded to Weights and Biases.

Example LightningCLI command: `python trainer.py fit --trainer.max_epochs=20 --data.data_path=data/32VLM --trainer.logger=WandbLogger --trainer.logger.project=clay --trainer.logger.save_dir=checkpoints --trainer.callbacks+=LogMAEReconstructedImage`.

* ➕ Add scikit-image

Image processing in Python!

* 📸 Apply histogram equalization to RGB images

Enhance low contrast images by applying a histogram equalization stretching algorithm on the RGB images, instead of dividing by a magic number like 6000.

* 🔧 Increase default sample size from 6 to 8

More samples to look at! Also only running einsum conversion on as many samples as needed rather than the whole batch, and handling cases where num_samples may be more than the batch_size.

* 🧑‍💻 Make wandb a somewhat optional dependency

Allows for `from src.callback_wandb import LogMAEReconstruction` to run, even without wandb being installed. Helpful if someone doesn't want to install wandb for whatever reason.

* ✅ Add unit test for LogMAEReconstruction

Testing that the LogMAEReconstruction callback works to save a set of images to WandB. Testing this in offline mode only, with checks that artifacts are saved locally, and that the wandb images have the correct caption and format.

* 🐛 Compare expected folders using set instead of list

Order of the folders could change, so using set instead of list.

* 🧪 Prevent WandB logger from saving logs to local drive for now

Setting WANDB_MODE="disabled", so no files are logged to disk, though the wandb.Image(s) are still created. See if this helps to resolve the exit code 255 issue on GitHub Actions.

* 📝 Fix a typo and improve docstring

Minor changes to the docstring of the on_validation_batch_end method, and a typo fix.