Dtype and scale conversion in V2 #7756
Comments
The way we envisioned it, it would function similarly to vision/test/test_transforms_v2_consistency.py Line 1246 in a6dea86
For example transforms.ToDtype(dtype=defaultdict(lambda: None, {datapoints.Mask: torch.int64}))
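To make the lookup behaviour of such a mapping concrete, here is a minimal sketch in plain Python. The classes Image, Mask and BoundingBox are hypothetical stand-ins for the datapoint types, dtypes are plain strings rather than torch.dtype objects, and target_dtype is an illustrative helper, not part of torchvision.

```python
from collections import defaultdict


# Hypothetical stand-ins for the datapoint classes (not torchvision's own).
class Image: ...
class Mask: ...
class BoundingBox: ...


# Explicit entry for Mask; every other type falls back to None ("leave as-is").
dtype_map = defaultdict(lambda: None, {Mask: "int64"})


def target_dtype(sample):
    """Return the dtype the sample should be converted to, or None to skip it."""
    return dtype_map[type(sample)]
```

With a plain dict instead of the defaultdict, looking up a type that was not listed would raise KeyError, so the fallback has to be spelled out one way or the other.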
IIRC, we did that to avoid silently failing to convert something. As I stated above, our vision was for users to either specify everything or use a
In general, I'm on board with this change. However, I'm not sure we need the extra options beyond the boolean flag. We only have logic to scale images and videos, and I can't think of a way to scale the other datapoints. What would a scaled mask even be? Maybe that is applicable to bool-ish detection masks, but certainly not to segmentation masks. Thus, I would just use
Does that mean we also create a new dispatcher called vision/torchvision/transforms/v2/_misc.py Lines 253 to 257 in a6dea86
Sounds fine, OK
FWIW, I find this to be bad UX, and it's really difficult for a non-expert user to understand what's going on. There's just too much to unpack. We should have a simpler way to let users specify "for all the rest, use 0". Even just I think we should either pass through non-specified inputs OR have a much, much easier way of specifying what happens to "all the rest". Happy to hear suggestions on the latter. I feel like allowing
Our UX for converting dtype and scale is bad and error-prone in V2.
In #7743 we have a sample with an Image and a Mask. We need to:
The only way I found to do that right now is:
We should at the very least:
- ConvertDtype and ToDtype. It should be absolutely clear that ConvertDtype also scales the values. ConvertDtypeAndScale() is a decent candidate. Any new name is probably an improvement on the status quo.
- (datapoints.Image if backend == "datapoint" else torch.Tensor): None, isn't passed. If I'm not converting a type, I shouldn't have to pass it as input.
Ideally I think we can get rid of ConvertDtype and just add a scale parameter to ToDtype():
- scale=False means no scaling happens
- scale=True means all transformed inputs are scaled into the range specified by their dtype
- scale=Image means only Image instances are scaled
- scale=(Image, Mask) means only Image and Mask instances are scaled.
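The four proposed values of scale can be read as a single decision rule. The sketch below is one possible interpretation in plain Python, not an existing torchvision API: should_scale is a hypothetical helper, and Image, Mask and BoundingBox are stand-in classes for the datapoint types.

```python
# Hypothetical stand-ins for the datapoint classes (not torchvision's own).
class Image: ...
class Mask: ...
class BoundingBox: ...


def should_scale(sample, scale):
    """Decide whether `sample` is scaled under the proposed `scale` argument.

    scale=False          -> nothing is scaled
    scale=True           -> every transformed input is scaled
    scale=Image          -> only Image instances are scaled
    scale=(Image, Mask)  -> only Image and Mask instances are scaled
    """
    if isinstance(scale, bool):
        return scale
    # isinstance() accepts either a single class or a tuple of classes.
    return isinstance(sample, scale)
```

Under this reading, a class or tuple of classes acts as a filter on the inputs, while the booleans are the "none" and "all" shortcuts.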