
optim.LoadStateDict from existing StateDict doesn't clone tensors #1172

Closed
shaltielshmid opened this issue Dec 6, 2023 · 1 comment · Fixed by #1173

Comments


shaltielshmid commented Dec 6, 2023

When I call optim.LoadStateDict with the state dictionary of an existing optimizer, the tensors are copied by reference rather than cloned, so once the old optimizer is disposed, the new optimizer's state tensors are invalid.

Sample code:

using TorchSharp;

var lin1 = torch.nn.Linear(10, 10);

var optim1 = torch.optim.Adam(lin1.parameters());
var optim2 = torch.optim.Adam(lin1.parameters());
optim2.load_state_dict(optim1.state_dict()); // state tensors are aliased, not cloned
optim1.Dispose();                            // also invalidates the tensors optim2 now holds

torch.nn.functional.mse_loss(lin1.call(torch.rand(10)), torch.rand(10)).backward();
optim2.step(); // touches the disposed state tensors

Throws:

System.InvalidOperationException: 'Tensor invalid -- empty handle.'
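
Until the fix in #1173 lands, one possible interim workaround is to round-trip the state through a file, so the loaded tensors are freshly materialized rather than aliased. This is only a sketch, assuming the file-based save_state_dict/load_state_dict overloads copy tensor data on load:

using System.IO;
using TorchSharp;

var lin1 = torch.nn.Linear(10, 10);
var optim1 = torch.optim.Adam(lin1.parameters());
var optim2 = torch.optim.Adam(lin1.parameters());

var path = Path.GetTempFileName();
optim1.save_state_dict(path); // serialize optim1's state to disk
optim2.load_state_dict(path); // tensors are read back from the file, not borrowed
optim1.Dispose();             // optim2's state remains valid

torch.nn.functional.mse_loss(lin1.call(torch.rand(10)), torch.rand(10)).backward();
optim2.step(); // runs without the empty-handle exception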
shaltielshmid (author) commented:

This also causes device issues if you copy a state dict from an optimizer whose state lives on a different device.

For example:

using System;
using TorchSharp;
using TorchSharp.Modules; // Adam (and Adam.State) live here

var lin1 = torch.nn.Linear(10, 10);
var optim1 = torch.optim.Adam(lin1.parameters());
var sd = optim1.state_dict(); // captured while the state is on the CPU

lin1.cuda(); // move the module to the GPU
var optim2 = torch.optim.Adam(lin1.parameters()); // fresh optimizer, state on CUDA
Console.WriteLine((optim2.state_dict().State[0] as Adam.State).exp_avg.device.type); // CUDA
optim2.load_state_dict(sd); // silently swaps in the old CPU tensors
Console.WriteLine((optim2.state_dict().State[0] as Adam.State).exp_avg.device.type); // CPU
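
In the meantime, one possible workaround for the device mismatch (a sketch only, assuming the Adam.State fields exp_avg and exp_avg_sq are publicly assignable, as the read access above suggests) is to move the loaded state tensors back to the parameters' device by hand:

foreach (var state in optim2.state_dict().State) {
    if (state is Adam.State s) {
        s.exp_avg = s.exp_avg.to(torch.CUDA);       // first-moment buffer
        s.exp_avg_sq = s.exp_avg_sq.to(torch.CUDA); // second-moment buffer
    }
}
Console.WriteLine((optim2.state_dict().State[0] as Adam.State).exp_avg.device.type); // CUDA again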
