
Backbone/model weights source should be more obvious to figure out. #543

Closed
jpcbertoldo opened this issue Sep 6, 2022 · 11 comments

@jpcbertoldo (Contributor) commented Sep 6, 2022

I saw torchvision in the requirements and immediately assumed that the pre-trained model weights would come from there (I didn't know timm until very recently).

I saw that they actually come from timm, and realized this isn't well documented.

I have two suggestions:

  1. (easy) Mention it in the README.md and add it to the docs.

  2. (maybe easy?) Make the pre-trained weights source a parameter that shows up in the config file (it could come from torchvision, for instance); see the rough sketch below.
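
For instance (a rough sketch; the weights_source parameter and this helper are hypothetical, not existing anomalib API):

import timm
import torchvision.models

def create_backbone(backbone: str, pre_trained: bool, weights_source: str = "timm"):
    # hypothetical dispatch on the source of the pre-trained weights
    if weights_source == "timm":
        return timm.create_model(backbone, pretrained=pre_trained, features_only=True)
    if weights_source == "torchvision":
        model_fn = getattr(torchvision.models, backbone)  # e.g. torchvision.models.resnet18
        return model_fn(pretrained=pre_trained)           # note: no feature extraction here
    raise ValueError(f"Unknown weights_source: {weights_source}")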

@jpcbertoldo (Contributor, Author)

I'd be happy to do both, but could the core contributors first validate or reject the idea?

@samet-akcay (Contributor)

@jpcbertoldo, torchvision was previously the base package for the CNN backbones and feature extraction. We recently switched to timm because it has a feature extractor, which even works for a number of the transformer models used in some of anomalib's algorithms.
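
For a quick idea of what timm's feature extractor provides (a minimal standalone example, not anomalib code):

import timm
import torch

# features_only=True turns a supported backbone into a multi-scale feature extractor
model = timm.create_model("resnet18", pretrained=True, features_only=True)
features = model(torch.randn(1, 3, 224, 224))
print([f.shape for f in features])  # one tensor per feature stage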

I'm not sure how easy option 2 would be, since the FeatureExtractor module might need a refactor to handle transformers. Option 1 would be relatively easier, I think.

It might also be an idea to deprecate the FeatureExtractor module so we wouldn't need torchvision anymore. @djdameln, @ashwinvaidya17, do you remember whether we use torchvision anywhere else?

@djdameln (Contributor) commented Sep 6, 2022

We also use torchvision for its VisionDataset class, IMG_EXTENSIONS, and transforms.
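
For reference, a minimal illustration of those pieces (not anomalib's actual usage):

from torchvision.datasets import VisionDataset          # base class for custom datasets
from torchvision.datasets.folder import IMG_EXTENSIONS  # ('.jpg', '.jpeg', '.png', ...)
from torchvision import transforms

# a typical preprocessing pipeline built from torchvision.transforms
preprocess = transforms.Compose([transforms.Resize(256), transforms.ToTensor()])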

@samet-akcay (Contributor)

Ah yes, of course.

@samet-akcay (Contributor) commented Sep 6, 2022

Looks like it is also used in reverse distillation: from torchvision.models.resnet import conv1x1, conv3x3 and from torchvision.models.resnet import BasicBlock, Bottleneck.
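
For context, these are torchvision's standard ResNet building blocks (minimal illustration):

from torchvision.models.resnet import BasicBlock, Bottleneck, conv1x1, conv3x3

conv = conv3x3(in_planes=64, out_planes=64)  # a plain 3x3 nn.Conv2d with padding
block = BasicBlock(inplanes=64, planes=64)   # the standard ResNet residual block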

@jpcbertoldo (Contributor, Author)

We recently switched to timm because it has a feature extractor, ...

        self.feature_extractor = timm.create_model(
            backbone,
            pretrained=pre_trained,
            features_only=True,
            exportable=True,
            out_indices=self.idx,
        )

Is it this features_only=True that you are talking about?

....FeatureExtractor module might need a refactor to handle transformers.

Okay, probably not the way to go then :)

It might also be an idea to deprecate the FeatureExtractor

[image: code search showing the many usages of FeatureExtractor across the repository]

It is apparently used in many places.

If I got it right, FeatureExtractor just wraps timm's model so that the extracted features are mapped str -> Tensor instead of int -> Tensor, right?
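
Something like this sketch, I mean (illustrative; it assumes the out_indices line up with those layer names):

import timm
import torch

layers = ["layer1", "layer2", "layer3"]
extractor = timm.create_model("resnet18", features_only=True, out_indices=(1, 2, 3))
features = extractor(torch.randn(1, 3, 256, 256))  # timm returns a list (int -> Tensor)
named_features = dict(zip(layers, features))       # the wrapper exposes str -> Tensor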

To me it sounds like a good idea to keep it.


I guess option 1 (mentioning timm in the README and docs) would do.

@djdameln (Contributor) commented Sep 6, 2022

I wouldn't be in favor of removing torchvision, because it provides several useful features besides pre-trained model weights.

Also, if I remember correctly, many timm models actually obtain their pre-trained weights directly from torchvision, so in most cases getting the weights from torchvision would give us the same weights as getting them from timm. The second option would therefore add unnecessary complexity to the library.

However, I do agree that it would be a good idea to mention the source of the pre-trained weights in the README, the documentation, or both.

@samet-akcay (Contributor)

It might also be an idea to deprecate the FeatureExtractor

My bad, forget about it. I had forgotten that we already refactored FeatureExtractor to use timm's feature-extraction capabilities, so we cannot deprecate it. As you said, we just need to document it well.

I also agree with @djdameln: removing torchvision might lead to some issues. The safest option would be to improve the documentation to let users know that feature extraction is done by timm.

@jpcbertoldo (Contributor, Author)

OK, so just a documentation update then.

@jpcbertoldo (Contributor, Author)

(Asking this here, but I will create a proper issue if necessary.)

Context: FeatureExtractor

__init__ calls _map_layer_to_idx() to get the timm-specific indices of the submodules named by the argument layers: List[str].

_map_layer_to_idx compensates for the misalignment between timm's indices and features.named_children() using the argument offset, and the function's docstring says

...please update offset based on need
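
As I understand it, the mapping does roughly this (a hypothetical reconstruction; names and details are assumed, not the actual implementation):

def _map_layer_to_idx(model, layers, offset=3):
    # locate each requested layer among the model's top-level children,
    # then shift by `offset` to align with timm's feature indices
    names = [name for name, _ in model.named_children()]
    return [names.index(layer) - offset for layer in layers]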

I believe there are two problems here (can you confirm?):

a. the user cannot adjust offset;

b. there is no validation that the offset is correct, so a wrong offset would fail silently.

I think (b) could be solved with something like:

# ...
self.idx = self._map_layer_to_idx()
self.feature_extractor = timm.create_model(
    backbone,
    pretrained=pre_trained,
    features_only=True,
    exportable=True,
    out_indices=self.idx,
)
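# guard against a silently wrong offset: the modules timm selected must match the requested layers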
assert tuple(self.feature_extractor.feature_info.module_name()) == tuple(self.layers), "..."
# ...

@samet-akcay (Contributor)

Addressed in #576.
