
Dtype and scale conversion in V2 #7756

Closed · NicolasHug opened this issue Jul 24, 2023 · 3 comments · Fixed by #7759

NicolasHug (Member) commented Jul 24, 2023

Our UX for converting dtypes and scales in V2 is bad and error-prone.

In #7743 we have a sample with an Image and a Mask. We need to:

  • convert the image from uint8 to float and convert its scale from 0-255 to 0-1
  • convert the mask from uint8 to int64, without converting its scale

The only way I found to do that right now is:

    dtype = {
        (datapoints.Image if backend == "datapoint" else torch.Tensor): None,  # pass through
        datapoints.Mask: torch.int64,
    }
    transforms += [T.ToDtype(dtype=dtype)]              # converts the mask only
    transforms += [T.ConvertDtype(dtype=torch.float)]   # converts *and* scales the image

We should at the very least:

  • Clearly distinguish between ConvertDtype and ToDtype. It should be absolutely clear that ConvertDtype also scales the values. ConvertDtypeAndScale() is a decent candidate. Any new name is probably an improvement on the status quo.
  • Not error when the (datapoints.Image if backend == "datapoint" else torch.Tensor): None entry isn't passed. If I'm not converting a type, I shouldn't have to pass it as input.

Ideally I think we can get rid of ConvertDtype and just add a scale parameter to ToDtype() (a sketch follows the list below):

  • scale=False means no scaling happens
  • scale=True means all transformed inputs are scaled into the range specified by their dtype
  • scale=Image means only Image instances are scaled
  • scale=(Image, Mask) means only Image and Mask instances are scaled.
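
For illustration, a minimal sketch of what the unified transform could look like under this proposal; the scale parameter and its semantics are the suggestion above, not an implemented API:

    import torch
    from torchvision import datapoints
    from torchvision.transforms import v2 as T  # assuming the v2 namespace

    # One transform handles both the cast and the optional rescale:
    transform = T.ToDtype(
        dtype={datapoints.Image: torch.float32, datapoints.Mask: torch.int64},
        scale=True,  # proposed: rescale values, e.g. uint8 0-255 -> float 0-1
    )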
pmeier (Contributor) commented Jul 25, 2023

The only way I found to do that right now is:

The way we envisioned it, it works similarly to fill:

    PadIfSmaller(size=480, fill=defaultdict(lambda: 0, {datapoints.Mask: 255}))

For example:

    transforms.ToDtype(dtype=defaultdict(lambda: None, {datapoints.Mask: torch.int64}))
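
For readers less familiar with the idiom: the defaultdict supplies the fallback for every type that isn't listed explicitly. A minimal illustration of the lookup behavior:

    from collections import defaultdict

    import torch
    from torchvision import datapoints

    dtype = defaultdict(lambda: None, {datapoints.Mask: torch.int64})
    dtype[datapoints.Mask]   # torch.int64 -> the mask gets converted
    dtype[datapoints.Image]  # None        -> the image is passed through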
  • Clearly distinguish between ConvertDtype and ToDtype. It should be absolutely clear that ConvertDtype also scales the values. ConvertDtypeAndScale() is a decent candidate. Any new name is probably an improvement on the status quo.

👍

  • Not error when the (datapoints.Image if backend == "datapoint" else torch.Tensor): None entry isn't passed. If I'm not converting a type, I shouldn't have to pass it as input.

IIRC, we did that to avoid silently failing to convert something. Like I stated above, our vision was for users to either specify everything or use a defaultdict to indicate that only a few explicit types should be transformed. Fully agree that this is not properly documented right now.

Ideally I think we can get rid of ConvertDtype and just add a scale parameter to ToDtype():

  • scale=False means no scaling happens
  • scale=True means all transformed inputs are scaled into the range specified by their dtype
  • scale=Image means only Image instances are scaled
  • scale=(Image, Mask) means only Image and Mask instances are scaled.

In general, I'm on board with this change. However, I'm not sure we need the extra options besides the boolean flag. We only have logic to scale images and videos, and I can't think of a way to scale the other datapoints. Like, what would a scaled mask even be? Maybe that is applicable to bool-ish detection masks, but certainly not to segmentation masks.

Thus, I would just use scale: bool = False as the default and, if True, always apply scaling to all images and videos that we find.

pmeier (Contributor) commented Jul 25, 2023

Does that mean we also create a new dispatcher called to_dtype with a scale parameter that "replaces" the old convert_dtype? Right now the former doesn't exist and we simply call .to on the input:

    def _transform(self, inpt: Any, params: Dict[str, Any]) -> Any:
        dtype = self.dtype[type(inpt)]
        if dtype is None:
            return inpt
        return inpt.to(dtype=dtype)
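
A minimal sketch of what such a to_dtype dispatcher could look like, assuming the usual uint8/float scaling conventions; the name, signature, and scaling rules here are illustrative, not the final API:

    import torch

    def to_dtype(inpt: torch.Tensor, dtype: torch.dtype, scale: bool = False) -> torch.Tensor:
        # Without scaling (or with a no-op cast), defer to plain .to().
        if not scale or inpt.dtype == dtype:
            return inpt.to(dtype=dtype)
        if inpt.dtype == torch.uint8 and dtype.is_floating_point:
            # uint8 0-255 -> float 0-1
            return inpt.to(dtype=dtype) / 255.0
        if inpt.dtype.is_floating_point and dtype == torch.uint8:
            # float 0-1 -> uint8 0-255
            return (inpt * 255.0).round().to(dtype=dtype)
        # Other dtype pairs would need their own scaling rules.
        return inpt.to(dtype=dtype)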

NicolasHug (Member, Author) commented:

Thus, I would just use scale: bool = False as default and if True always apply to all images and videos that we find.

Sounds fine, OK

    fill=defaultdict(lambda: 0, {datapoints.Mask: 255})

FWIW I find this to be bad UX, and it's really difficult for a non-expert user to understand what's going on. There's just too much to unpack. We should have a simpler way to let users specify "for all the rest, use 0". Even just defaultdict(lambda: None) is reasonably advanced Python, and we shouldn't expect our users to reach for that kind of idiom just to do the most common tasks of every pre-processing pipeline. Just imagine yourself in front of a class of 50 students, teaching how to train a model and having to unpack this line; I don't want to be in those shoes.

I think we should either pass through non-specified inputs OR have a much, much easier way of specifying what happens to "all the rest". Happy to hear suggestions for the latter. I feel like allowing dtype={datapoints.Mask: torch.int64, "others": None} could already be an improvement.
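
The two spellings side by side (note that the "others" key is only a suggestion here, not an implemented feature):

    from collections import defaultdict

    import torch
    from torchvision import datapoints

    # Current idiom: the defaultdict supplies the "everything else" rule.
    dtype = defaultdict(lambda: None, {datapoints.Mask: torch.int64})

    # Suggested alternative (hypothetical): a plain dict with a reserved
    # "others" key meaning "pass everything else through unchanged".
    dtype = {datapoints.Mask: torch.int64, "others": None}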
