
[pytorch] bug: torch.sum() or tensor.sum() returns a result whose shape is null #1300

Closed

mullerhai opened this issue Jan 2, 2023 · 17 comments

@mullerhai

Hi, when I invoke the sum method to compute the sum of a tensor, I can't get the correct shape; the shape is null, and I don't know why.

system: macOS (Intel)
jdk: 15 8
scala: 2.12.11
pytorch version: 1.13.0-1.5.9-SNAPSHOT

x shape is (500|32):

    val yx = torch.sum(x, new DimnameArrayRef(1))
    println(s"FeaturesLinear x sum shape ${x.sum().shape().mkString("|")}  two ${x.sum(new DimnameArrayRef(1)).shape().mkString("|")} x shape ${x.shape().mkString("|")} sum yx shape ${yx.shape().mkString("|")}")

In the console:

    FeaturesLinear x sum shape   two  x shape 500|32 sum yx shape

In the old pytorch-javacpp version, I remember sum really worked. Why does the new version give a bad result? Am I using the DimnameArrayRef class incorrectly?

@saudet
Member

saudet commented Jan 2, 2023

I don't think that's the version of sum() you want to use. Try the one taking an OptionalIntArrayRef instead.

@HGuillemet
Collaborator

Each time you call a constructor of a subclass of Pointer that takes an int as argument, you create a pointer to a C++ array of that length, uninitialized (or maybe zeroed).
Why not simply use sum(1)?
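
For illustration, a minimal sketch of that Pointer behavior using plain JavaCPP's IntPointer (any Pointer subclass works the same way):

    import org.bytedeco.javacpp.IntPointer;

    public class PointerDemo {
        public static void main(String[] args) {
            // The int argument is a size, not a value: this allocates a
            // native array of 5 ints whose contents are uninitialized.
            IntPointer sized = new IntPointer(5);

            // To pass an actual value, copy it into native memory explicitly.
            IntPointer value = new IntPointer(new int[] {1});
            System.out.println(value.get(0)); // prints 1
        }
    }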

@mullerhai
Author

> Each time you call a constructor of a subclass of Pointer that takes an int as argument, you create a pointer to a C++ array of that length, uninitialized (or maybe zeroed). Why not simply use sum(1)?

In the old pytorch-java version I could use sum(1), but now the sum() method's parameter types have changed:

  public native @ByVal Tensor sum(@ByVal(nullValue = "c10::optional<at::ScalarType>(c10::nullopt)") ScalarTypeOptional dtype);
  public native @ByVal Tensor sum();
  public native @ByVal Tensor sum(@ByVal OptionalIntArrayRef dim, @Cast("bool") boolean keepdim/*=false*/, @ByVal(nullValue = "c10::optional<at::ScalarType>(c10::nullopt)") ScalarTypeOptional dtype);
  public native @ByVal Tensor sum(@ByVal OptionalIntArrayRef dim);
  public native @ByVal Tensor sum(@ByVal DimnameArrayRef dim, @Cast("bool") boolean keepdim/*=false*/, @ByVal(nullValue = "c10::optional<at::ScalarType>(c10::nullopt)") ScalarTypeOptional dtype);
  public native @ByVal Tensor sum(@ByVal DimnameArrayRef dim);

Using sum(1) no longer compiles, and in my program sum() and x.sum(new OptionalIntArrayRef(1)) also don't give a correct result.

I get this error:

Exception in thread "main" java.lang.RuntimeException: Dimension out of range (expected to be in range of [-2, 1], but got 123145498252608)
Exception raised from maybe_wrap_dims_n at /Users/runner/work/javacpp-presets/javacpp-presets/pytorch/cppbuild/macosx-x86_64/pytorch/aten/src/ATen/WrapDimUtils.h:68 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) + 81 (0x1084af991 in libc10.dylib)
frame #1: void at::maybe_wrap_dims<c10::SmallVector<long long, 5u> >(c10::SmallVector<long long, 5u>&, long long) + 276 (0x189959804 in libtorch_cpu.dylib)
frame #2: at::meta::resize_reduction(at::impl::MetaBase&, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::ScalarType) + 1289 (0x1899593d9 in libtorch_cpu.dylib)
frame #3: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>), &(at::(anonymous namespace)::wrapper_sum_dim_IntList(at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>))>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType> > >, at::Tensor (at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 84 (0x18a706df4 in libtorch_cpu.dylib)
frame #4: at::_ops::sum_dim_IntList::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 133 (0x18a36a8c5 in libtorch_cpu.dylib)
frame #5: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>), &(torch::autograd::VariableType::(anonymous namespace)::sum_dim_IntList(c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>))>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType> > >, at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 965 (0x18bee47e5 in libtorch_cpu.dylib)
frame #6: at::_ops::sum_dim_IntList::call(at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 363 (0x18a2d4bcb in libtorch_cpu.dylib)
frame #7: Java_org_bytedeco_pytorch_Tensor_sum__Lorg_bytedeco_pytorch_OptionalIntArrayRef_2 + 192 (0x123668e10 in libjnitorch.dylib)
frame #8: 0x0 + 4473391752 (0x10aa28a88 in ???)

	at org.bytedeco.pytorch.Tensor.sum(Native Method)
	at org.pytorch.layer.FeaturesLinear.forward(FeaturesLinear.scala:25)
	at org.pytorch.model.LogisticRegressionModel.forward(LogisticRegressionModel.scala:18)
	at org.pytorch.example.ObjMain$.$anonfun$main$1(main.scala:102)
	at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
	at org.pytorch.example.ObjMain$.main(main.scala:90)
	at org.pytorch.example.ObjMain.main(main.scala)

@saudet
Member

saudet commented Jan 2, 2023

Sounds like you want the ScalarTypeOptional variant?

@mullerhai
Author

mullerhai commented Jan 2, 2023

> Sounds like you want the ScalarTypeOptional variant?

I don't understand your meaning. I don't care about ScalarTypeOptional in torch.sum(); I just want to use it like in Python:

    x = torch.sum(x, dim=1)

How do I write Java/Scala code with javacpp-pytorch that does torch.sum(x, dim=1) and actually gets the correct result?

@mullerhai
Author

I think javacpp-pytorch should provide some examples like MNIST that show users how to correctly use torch.sum, torch.nn.functional, the EmbeddingImpl layer, and other often-used APIs.

@saudet
Member

saudet commented Jan 2, 2023

> I don't understand your meaning. I don't care about ScalarTypeOptional in torch.sum(); I just want to use it like in Python:
>
>     x = torch.sum(x, dim=1)
>
> How do I write Java/Scala code with javacpp-pytorch that does torch.sum(x, dim=1) and actually gets the correct result?

I'm guessing something like x = sum(x, new OptionalIntArrayRef(1)) works for that.

> I think javacpp-pytorch should provide some examples like MNIST that show users how to correctly use torch.sum, torch.nn.functional, the EmbeddingImpl layer, and other often-used APIs.

Sure, like I keep telling you, contributions are welcome!

@HGuillemet
Collaborator

As explained above, new OptionalIntArrayRef(1) creates an array of 1 OptionalIntArrayRef.
You can try new OptionalIntArrayRef().put(new IntArrayRef().data().put(1)).
I'm not sure why Tensor.sum(long...) doesn't exist anymore. I'm using 1.5.8 and it's there.

@HGuillemet
Collaborator

Or rather new OptionalIntArrayRef().put(new IntArrayRef(new int[] { 1 }, 1))
Or maybe simply new OptionalIntArrayRef().put(new IntArrayRef(1))

@mullerhai
Author

> Or rather new OptionalIntArrayRef().put(new IntArrayRef(new int[] { 1 }, 1)) Or maybe simply new OptionalIntArrayRef().put(new IntArrayRef(1))

Thanks, but torch.sum still doesn't give a correct result.

My code is here:

    import org.bytedeco.pytorch.IntArrayRef
    import org.bytedeco.pytorch.OptionalIntArrayRef
    val dim = new OptionalIntArrayRef().put(new IntArrayRef(Array[Int](1), 1))

    x = torch.sum(x, dim)

Could you try to debug torch.sum() in javacpp-pytorch version 1.13.0-1.5.9-SNAPSHOT? In the old version Tensor.sum(long...) was very easy to use; why doesn't the new version declare these parameters anymore?

saudet added a commit that referenced this issue Jan 3, 2023
@saudet
Member

saudet commented Jan 3, 2023

> Could you try to debug torch.sum() in javacpp-pytorch version 1.13.0-1.5.9-SNAPSHOT? In the old version Tensor.sum(long...) was very easy to use; why doesn't the new version declare these parameters anymore?

I don't know why, you should ask upstream about that. It's probably already explained in the issues somewhere there:
https://github.com/pytorch/pytorch/issues

In any case, if I understand the docs correctly, OptionalIntArrayRef is only useful for temporary objects, so I've resimplified that a bit in commit b6696fa and we should now be able to do x = sum(x, new LongArrayRefOptional(1)).

@mullerhai
Author

mullerhai commented Jan 3, 2023

> commit b6696fa

I ran the process as your code example shows with 1.13.1-1.5.9-SNAPSHOT, but the sum over the dim still isn't correct, which leaves me confused: is the second parameter supposed to represent the selected dim?

    val dim = new LongArrayRefOptional(1)
    x = torch.sum(x, dim)

I get this console error:

Exception in thread "main" java.lang.RuntimeException: Dimension out of range (expected to be in range of [-2, 1], but got 220803991521792)
Exception raised from maybe_wrap_dims_n at /Users/runner/work/javacpp-presets/javacpp-presets/pytorch/cppbuild/macosx-x86_64/pytorch/aten/src/ATen/WrapDimUtils.h:68 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) + 81 (0x10cbd4991 in libc10.dylib)
frame #1: void at::maybe_wrap_dims<c10::SmallVector<long long, 5u> >(c10::SmallVector<long long, 5u>&, long long) + 276 (0x18e94f704 in libtorch_cpu.dylib)
frame #2: at::meta::resize_reduction(at::impl::MetaBase&, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::ScalarType) + 1289 (0x18e94f2d9 in libtorch_cpu.dylib)
frame #3: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>), &(at::(anonymous namespace)::wrapper_sum_dim_IntList(at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>))>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType> > >, at::Tensor (at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 84 (0x18f6fcd04 in libtorch_cpu.dylib)
frame #4: at::_ops::sum_dim_IntList::redispatch(c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 133 (0x18f3607d5 in libtorch_cpu.dylib)
frame #5: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>), &(torch::autograd::VariableType::(anonymous namespace)::sum_dim_IntList(c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>))>, at::Tensor, c10::guts::typelist::typelist<c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType> > >, at::Tensor (c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 965 (0x190ed6c85 in libtorch_cpu.dylib)
frame #6: at::_ops::sum_dim_IntList::call(at::Tensor const&, c10::OptionalArrayRef<long long>, bool, c10::optional<c10::ScalarType>) + 363 (0x18f2caadb in libtorch_cpu.dylib)
frame #7: Java_org_bytedeco_pytorch_global_torch_sum__Lorg_bytedeco_pytorch_Tensor_2Lorg_bytedeco_pytorch_LongArrayRefOptional_2 + 215 (0x1287c2357 in libjnitorch.dylib)
frame #8: 0x0 + 4557236872 (0x10fa1ea88 in ???)

	at org.bytedeco.pytorch.global.torch.sum(Native Method)

@mullerhai
Author

> x = sum(x, new LongArrayRefOptional(1))

Using DimnameArrayRef with 1.13.1-1.5.9-SNAPSHOT also gives an error:

    val dis = new DimnameArrayRef()
    dis.put(new LongArrayRef(Array[Long](1),1))
//    y = torch.sum(y, new LongArrayRefOptional(1))
    y = torch.sum(y, dis)

The console error:

Exception in thread "main" java.lang.RuntimeException: Name 'prim' not found in Tensor[None, None].
Exception raised from dimname_to_position at /Users/runner/work/javacpp-presets/javacpp-presets/pytorch/cppbuild/macosx-x86_64/pytorch/aten/src/ATen/NamedTensorUtils.cpp:22 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >) + 81 (0x10a685da1 in libc10.dylib)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) + 98 (0x10a683492 in libc10.dylib)
frame #2: at::dimname_to_position(at::Tensor const&, at::Dimname) + 266 (0x19813b89a in libtorch_cpu.dylib)
frame #3: at::dimnames_to_positions(at::Tensor const&, c10::ArrayRef<at::Dimname>) + 155 (0x19813cf4b in libtorch_cpu.dylib)
frame #4: at::native::sum(at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>) + 36 (0x19878f774 in libtorch_cpu.dylib)
frame #5: c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoFunctor_<c10::CompileTimeFunctionPointer<at::Tensor (at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>), &(at::(anonymous namespace)::(anonymous namespace)::wrapper_dim_DimnameList_sum(at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>))>, at::Tensor, c10::guts::typelist::typelist<at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType> > >, at::Tensor (at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>) + 35 (0x199884ce3 in libtorch_cpu.dylib)
frame #6: at::_ops::sum_dim_DimnameList::call(at::Tensor const&, c10::ArrayRef<at::Dimname>, bool, c10::optional<c10::ScalarType>) + 363 (0x1990f9eeb in libtorch_cpu.dylib)
frame #7: Java_org_bytedeco_pytorch_global_torch_sum__Lorg_bytedeco_pytorch_Tensor_2Lorg_bytedeco_pytorch_DimnameArrayRef_2 + 215 (0x18c1f9ce7 in libjnitorch.dylib)
frame #8: 0x0 + 4637107184 (0x11464a3f0 in ???)

saudet added a commit that referenced this issue Jan 3, 2023
@saudet
Member

saudet commented Jan 3, 2023

Right, this isn't going to work either. I've fixed it in commit 988101d so that x = sum(x, 1) should work like before.
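
A minimal sketch of what the restored usage could look like (the sum(Tensor, long...) overload is what the commit brings back; the ones(long...) factory used to build the input is an assumption for illustration):

    import org.bytedeco.pytorch.Tensor;
    import static org.bytedeco.pytorch.global.torch.*;

    public class SumDemo {
        public static void main(String[] args) {
            Tensor x = ones(500, 32);  // assumed factory: tensor of shape (500, 32)
            Tensor y = sum(x, 1);      // reduce along dim 1 -> shape (500)
            System.out.println(java.util.Arrays.toString(y.shape()));
        }
    }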

@mullerhai
Author

mullerhai commented Jan 4, 2023

> Right, this isn't going to work either. I've fixed it in commit 988101d so that x = sum(x, 1) should work like before.

Thanks, but which version includes these updates? 1.13.1-1.5.9-SNAPSHOT? 1.13.2-1.5.9-SNAPSHOT doesn't exist.

@saudet
Member

saudet commented Jan 4, 2023

If you're relying on the snapshots, make sure to update your cache.

@mullerhai
Author

> If you're relying on the snapshots, make sure to update your cache.

Now torch.sum() works correctly, thanks!
