
Unexpected parameters #1332

Open
lintao185 opened this issue Jun 15, 2024 · 11 comments


@lintao185

[screenshot: a module declaring two Tensor fields]
Here are two Tensor fields, which are just regular fields. However, after calling RegisterComponents(), TorchSharp will turn these two fields into parameters. Could you add an attribute to mark a field so that it won't be registered as a parameter?
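For reference, a minimal sketch of the reported behavior (assumes the TorchSharp package; the module and field names here are invented for illustration, and the exact shape of the state_dict API may vary by version):

```csharp
using TorchSharp;
using static TorchSharp.torch;

var m = new Normalize();
// The two plain Tensor fields were picked up by RegisterComponents()
// and now appear in the module's state, even though they were never
// meant to be trainable or saved state.
foreach (var kv in m.state_dict())
    Console.WriteLine(kv.Key);

class Normalize : nn.Module<Tensor, Tensor>
{
    // Ordinary fields, intended only as scratch values:
    private readonly Tensor mean = torch.zeros(3);
    private readonly Tensor std = torch.ones(3);

    public Normalize() : base(nameof(Normalize))
    {
        this.RegisterComponents(); // reflects over fields and registers the tensors
    }

    public override Tensor forward(Tensor input) => (input - mean) / std;
}
```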

@yueyinqiu
Contributor

Did you mean buffers? I have also noticed that when reading the relevant code.

It seems to be an intentional design. Perhaps PyTorch used to behave like this?

I think we should completely reconsider RegisterComponents, as we have also discussed in #1272 (comment).

@yueyinqiu
Contributor

yueyinqiu commented Jun 15, 2024

A workaround currently could be:

using TorchSharp;
using static TorchSharp.torch;

var m = new M();
Console.WriteLine(m.parameters().Count()); // 0
Console.WriteLine(m.buffers().Count()); // 0

class M : nn.Module<int, int>
{
    private readonly Tensor tensor = torch.zeros([]);

    public M() : base(nameof(M))
    {
        this.RegisterComponents();
        // Undo the automatic registration: drop the tensor from the module's buffer dictionary.
        this._internal_buffers.Remove(nameof(tensor));
    }

    public override int forward(int input)
    {
        return input;
    }
}

@NiklasGustafsson
Contributor

NiklasGustafsson commented Jun 17, 2024

Great issue. What are some of the situations where you need a module to have tensor members, but they're not buffers that become part of the state_dict?

@yueyinqiu
Contributor

yueyinqiu commented Jun 18, 2024

I think the point is that we should always allow users to decide, instead of implicitly forcing them to do (or not do) something.

And that's also why I suggested removing the call to RegisterComponents in register_module. RegisterComponents could do anything, but it should be up to the users to decide whether it is called.

Of course, users could always override RegisterComponents in their own modules. In that case, the default implementation is not actually that important, and the only reason to make the change is to be consistent with PyTorch.

@yueyinqiu
Contributor

A practical situation might be to save some constants or "cached" tensors to speed things up?

@lintao185
Author

A practical situation might be to save some constants or "cached" tensors to speed things up?

Yes

@NiklasGustafsson
Contributor

Okay. Here's what I think should be done then:

  1. Any field that is of type 'Parameter' will be a parameter.
  2. Any field of type 'Module' will be a child module.
  3. A field of type 'Tensor' will be a buffer iff it is decorated with an attribute 'Buffer'.

#3 will be a breaking change, but it makes more sense to me to require tensor fields to be decorated if they are buffers than if they are not (which would also be breaking, but would break fewer modules, I believe).
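If that proposal were adopted, a module might look like this (a sketch only; the Buffer attribute does not exist at this point, and its name and semantics are hypothetical):

```csharp
using TorchSharp;
using TorchSharp.Modules;
using static TorchSharp.torch;

class Proposed : nn.Module<Tensor, Tensor>
{
    // Rule 1: any field of type Parameter is registered as a parameter.
    private readonly Parameter weight = new Parameter(torch.randn(4, 4));

    // Rule 2: any field of a Module-derived type becomes a child module.
    private readonly Linear inner = nn.Linear(4, 4);

    // Rule 3 (proposed): a Tensor field becomes a buffer only when decorated.
    [Buffer] private readonly Tensor runningMean = torch.zeros(4);

    // Undecorated Tensor fields would stay plain fields, outside the state_dict.
    private readonly Tensor cache = torch.zeros(4);

    public Proposed() : base(nameof(Proposed))
    {
        this.RegisterComponents();
    }

    public override Tensor forward(Tensor input) =>
        inner.forward(input.matmul(weight) + runningMean);
}
```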

@yueyinqiu
Contributor

yueyinqiu commented Jun 18, 2024

As far as I know, buffers are always manually registered in PyTorch. So perhaps we don't have to provide automatic registration for them?

Of course, adding this feature would be better. We don't have to be consistent with PyTorch in every detail, right?

What I'm hesitating about is that, if we do so, there seems to be no reason not to do the same for parameters and modules as well (especially modules; who knows whether a custom type would be a module or not, haha). (But maybe we could conversely use [IgnoreComponent] or something similar, because PyTorch does register them by default.)
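For comparison, the manual, PyTorch-style registration is a sketch like the following (assuming TorchSharp's Module exposes a register_buffer method mirroring PyTorch's nn.Module.register_buffer; the names here are invented for illustration):

```csharp
using TorchSharp;
using static TorchSharp.torch;

class Manual : nn.Module<Tensor, Tensor>
{
    private readonly Tensor runningMean;
    private readonly Tensor cache;   // stays a plain field: never registered

    public Manual() : base(nameof(Manual))
    {
        runningMean = torch.zeros(4);
        cache = torch.zeros(4);
        // Explicit opt-in, mirroring PyTorch's register_buffer:
        this.register_buffer("running_mean", runningMean);
        // Note: no RegisterComponents() call, so nothing is picked up by reflection.
    }

    public override Tensor forward(Tensor input) => input + runningMean;
}
```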

@NiklasGustafsson
Contributor

NiklasGustafsson commented Jun 18, 2024

there seems to be no reason not to do the same for parameters and modules as well (especially modules, who knows whether a custom type would be a module or not, haha)

I fail to see a scenario where something that is a Parameter field is not meant to be treated as a parameter. Same with modules -- it's not a module unless it's derived from Module, and if it is, what scenario would not have it as a submodule?

As far as I know, buffers are always manually registered in PyTorch. So perhaps we don't have to provide the automatic registration for it?

It's going to be a breaking change, no matter what. So, my question is -- how bad is it that buffers are automatically registered?

@yueyinqiu
Contributor

yueyinqiu commented Jun 18, 2024

I believe it's just a matter of taste, and the only reason to make the change is to be consistent with PyTorch.

But that could be a sufficient reason, since it may confuse those who are more familiar with PyTorch or who are converting a PyTorch module to TorchSharp.

@asieradzk

Great issue. What are some of the situations where you need a module to have tensor members, but they're not buffers that become part of the state_dict?

An RNN where I'd like to cache state tensors for the next forward pass. The module itself is the logical place to keep the cache.
