[StableDiffusionInpaintPipeline] accept tensors for init and mask image #439

Merged
patil-suraj merged 4 commits into main from update-inpaint on Sep 16, 2022

patil-suraj
Contributor

@patil-suraj patil-suraj commented Sep 9, 2022

This PR updates StableDiffusionInpaintPipeline to accept both torch.FloatTensor and PIL.Image.Image for init_image and mask_image.

Fixes #370

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Sep 9, 2022

The documentation is not available anymore as the PR was closed or merged.

@Inkorak

Inkorak commented Sep 9, 2022

@patil-suraj I think there are two problems here. First, if we pass a FloatTensor, preprocessing is skipped, so the mask variable is never defined. Second, the tensor is never moved to the device.

if not isinstance(mask_image, torch.FloatTensor):
    mask = preprocess_mask(mask_image).to(self.device)  # only runs for PIL input
mask = torch.cat([mask] * batch_size)  # `mask` is undefined if a tensor was passed

With this, there should be no problems:

if not isinstance(mask_image, torch.FloatTensor):
    mask_image = preprocess_mask(mask_image)  # convert PIL input to a tensor
mask = torch.cat([mask_image.to(self.device)] * batch_size)  # always moved to the device
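
To make the difference between the two snippets concrete, here is a minimal standalone sketch. It is not the pipeline's actual code: `preprocess_mask` below is a hypothetical stand-in (the real helper may resize, tile, or invert the mask), `prepare_mask` and `prepare_mask_buggy` are illustrative names, and the check uses `torch.Tensor` rather than `torch.FloatTensor` so that tensors on other devices are also recognized.

```python
import numpy as np
import torch
from PIL import Image


def preprocess_mask(mask_image: Image.Image) -> torch.Tensor:
    """Hypothetical stand-in: PIL image -> float tensor (1, 1, H, W) in [0, 1]."""
    array = np.array(mask_image.convert("L"), dtype=np.float32) / 255.0
    return torch.from_numpy(array)[None, None]


def prepare_mask_buggy(mask_image, batch_size: int, device: str = "cpu") -> torch.Tensor:
    # Mirrors the first snippet: `mask` is bound only inside the branch.
    if not isinstance(mask_image, torch.Tensor):
        mask = preprocess_mask(mask_image).to(device)
    return torch.cat([mask] * batch_size)


def prepare_mask(mask_image, batch_size: int, device: str = "cpu") -> torch.Tensor:
    # Mirrors the suggested fix: convert only PIL input, then handle both
    # cases on the same path so the device move always happens.
    if not isinstance(mask_image, torch.Tensor):
        mask_image = preprocess_mask(mask_image)
    return torch.cat([mask_image.to(device)] * batch_size)


pil_mask = Image.new("L", (8, 8), color=255)  # all-white mask
tensor_mask = torch.ones(1, 1, 8, 8)          # same mask, already a tensor

from_pil = prepare_mask(pil_mask, batch_size=2)
from_tensor = prepare_mask(tensor_mask, batch_size=2)

try:
    prepare_mask_buggy(tensor_mask, batch_size=2)
except UnboundLocalError as err:
    print("buggy version fails on tensor input:", type(err).__name__)
```

With the fix, both input types end up on the same code path, so the batch concatenation and device placement no longer depend on which type the caller passed.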

@patil-suraj
Contributor Author

Good catch! Updating it now.

Member

@pcuenca pcuenca left a comment

LGTM!

Question: if the user provides both the image and the mask as tensors, should we verify that each has the expected number of channels? The image should have three channels (R, G, B), while the mask has only one (L). Is this something we should check, or is it too much?

On second thought, we could just update the documentation and not the code:

mask_image (`torch.FloatTensor` or `PIL.Image.Image`):
    `Image`, or tensor representing an image batch, to mask `init_image`. White pixels in the mask will be
    replaced by noise and therefore repainted, while black pixels will be preserved. If `mask_image` is a
    PIL image, it will be converted to a single channel (luminance) before use. If it's a tensor, it should
    contain one color channel (L) instead of 3, so the expected shape would be `(B, H, W, 1)`.
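
For reference, the code-side check that was considered and then dropped in favor of the docs change could look roughly like the sketch below. It is not part of this PR: `check_mask_tensor` is a hypothetical helper, and it takes the `(B, H, W, 1)` layout from the proposed docstring at face value.

```python
import torch


def check_mask_tensor(mask_image: torch.Tensor) -> None:
    """Hypothetical validation: require a 4-D mask with one trailing channel."""
    if mask_image.ndim != 4:
        raise ValueError(
            f"mask_image must be 4-D (B, H, W, 1), got {mask_image.ndim} dimensions"
        )
    if mask_image.shape[-1] != 1:
        raise ValueError(
            f"mask_image must have a single channel (L), got {mask_image.shape[-1]}"
        )


check_mask_tensor(torch.zeros(2, 8, 8, 1))  # single-channel mask passes

try:
    check_mask_tensor(torch.zeros(2, 8, 8, 3))  # an RGB tensor is rejected
except ValueError as err:
    print(err)
```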

@patil-suraj
Copy link
Contributor Author

Good point, @pcuenca! The pipeline is experimental and will be updated soon, so I've just updated the docs for now.

@patil-suraj patil-suraj merged commit 06924c6 into main Sep 16, 2022
@patil-suraj patil-suraj deleted the update-inpaint branch September 16, 2022 15:35
PhaneeshB added a commit to nod-ai/diffusers that referenced this pull request Mar 1, 2023
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
…ge (huggingface#439)

* accept tensors

* fix mask handling

* make device placement cleaner

* update doc for mask image
Development

Successfully merging this pull request may close these issues.

Lack of functionality in stable diffusion inpainting
5 participants