
Add scripts for benchmarking opt_einsum as well as manyinds benchmarks #3

Open
kshyatt wants to merge 1 commit into master

Conversation

@kshyatt commented Sep 21, 2021

Currently we don't benchmark computations like this with many indices. This PR adds benchmarks for them (see also the companion PR to OMEinsum.jl), along with scripts for benchmarking the Python library opt_einsum.
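For reference, a minimal sketch of the kind of many-index benchmark discussed here, using the contraction pattern quoted later in this thread. The PR's actual scripts use the opt_einsum package; this stand-in uses `np.einsum(..., optimize=True)`, which applies the same style of contraction-path optimization, and the uniform size of 2 per index matches the `uniformsize(code, 2)` setting below. Names and structure here are illustrative, not the PR's code.

```python
import time
import numpy as np

# The many-index pattern from the discussion; every index has dimension 2,
# matching uniformsize(code, 2) in the Julia benchmark.
eq = "abcdefghijklmnop,flnqrcipstujvgamdwxyz->bcdeghkmnopqrstuvwxyz"
lhs, rhs = eq.split("->")[0].split(",")
x = np.random.rand(*(2,) * len(lhs))   # 16 indices -> 2^16 elements
y = np.random.rand(*(2,) * len(rhs))   # 21 indices -> 2^21 elements

t0 = time.perf_counter()
out = np.einsum(eq, x, y, optimize=True)  # path-optimized contraction
elapsed = time.perf_counter() - t0
print(f"output rank {out.ndim}, {elapsed:.3f} s")
```

In a real benchmark script one would repeat the contraction and report the minimum time, as `@btime` does on the Julia side.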

@GiggleLiu (Collaborator) commented Sep 21, 2021

Thanks for the PR!
Regarding the following contraction pattern, on which OMEinsum performs poorly:

julia> code = ein"abcdefghijklmnop,flnqrcipstujvgamdwxyz->bcdeghkmnopqrstuvwxyz"
abcdefghijklmnop, flnqrcipstujvgamdwxyz -> bcdeghkmnopqrstuvwxyz

julia> OMEinsum.timespace_complexity(code, uniformsize(code, 2))
(26.0, 21.0)

julia> @btime code(x, y);
  24.924 ms (111 allocations: 48.51 MiB)

This is because OMEinsum uses permutedims + reshape + matmul to perform tensor contraction. In this pattern, the permutedims call takes about 90% of the time, because the time and space complexities are similar. If we switch to TensorOperations.tensorcopy, the time can be halved:

julia> using LinearAlgebra, TensorOperations

julia> LinearAlgebra.permutedims(a::Array{T,N}, perm::NTuple{N}) where {T,N} = (TensorOperations.tensorcopy(a, collect(1:ndims(a)), perm))

julia> LinearAlgebra.permutedims!(o::Array{T}, a::Array{T}, perm::Vector) where T = (TensorOperations.tensorcopy!(a, collect(1:ndims(a)), o, perm))

julia> @btime code(x, y);
  13.353 ms (398 allocations: 48.54 MiB)
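To illustrate the permutedims + reshape + matmul strategy described above, here is a small NumPy sketch (indices and sizes are illustrative, not OMEinsum's internals). The transpose copy plays the role of Julia's permutedims and is the step that dominates in the pattern above:

```python
import numpy as np

def contract_via_matmul(x, y):
    # einsum "acb,bcd->ad": the contracted indices (c, b) appear in the
    # wrong order in x to match y's (b, c) flattening, so a transpose
    # copy (Julia's permutedims) is needed before reshaping.
    a, c, b = x.shape
    d = y.shape[2]
    xt = np.ascontiguousarray(x.transpose(0, 2, 1))  # (a, b, c), materialized copy
    xm = xt.reshape(a, b * c)                        # (a, bc)
    ym = y.reshape(b * c, d)                         # (bc, d)
    return xm @ ym                                   # single BLAS gemm, shape (a, d)

x = np.random.rand(3, 4, 5)   # (a, c, b)
y = np.random.rand(5, 4, 6)   # (b, c, d)
result = contract_via_matmul(x, y)
```

When, as in the benchmark above, the contraction's time complexity is close to its space complexity, the memory traffic of the transpose copy rivals the gemm itself.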

I think the best way to solve this issue is to remove permutedims completely.
However, it is not easy to write a general contraction function with BLAS performance. I wonder what the performance gap is between OMEinsum and the other packages in this benchmark case.
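The trade-off mentioned above can be seen in a naive permutedims-free contraction: looping directly over the output and contracted indices avoids the transpose copy entirely, but gives up the cache blocking and vectorization that make BLAS fast. A toy sketch (my own illustration, not OMEinsum code):

```python
import numpy as np

def naive_contract(x, y):
    # einsum "acb,bcd->ad" with explicit loops: no permutedims, no copy,
    # but also none of BLAS's blocking/vectorization -- this is the
    # difficulty of removing permutedims while keeping BLAS performance.
    a, c, b = x.shape
    d = y.shape[2]
    out = np.zeros((a, d))
    for ia in range(a):
        for id_ in range(d):
            s = 0.0
            for ib in range(b):
                for ic in range(c):
                    s += x[ia, ic, ib] * y[ib, ic, id_]
            out[ia, id_] = s
    return out

x = np.random.rand(3, 4, 5)   # (a, c, b)
y = np.random.rand(5, 4, 6)   # (b, c, d)
result = naive_contract(x, y)
```

Libraries that remove the explicit permutation (e.g. via strided or batched gemm kernels) must essentially re-implement this blocking by hand for arbitrary index patterns.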

@GiggleLiu (Collaborator)

Great job! I'm wondering how the julia-gpu data was generated; I can't find the corresponding script anywhere.

And unfortunately, I don't have write access to this repo. @under-Peter, could you please give me write access or help merge this PR?

@under-Peter (Owner)

@GiggleLiu is it enough to add you as a collaborator, or do I have to grant write access separately? Thanks so much, and sorry for being short on time at the moment.

@GiggleLiu (Collaborator)

Now I have write access; thanks for reacting so quickly, @under-Peter.

@kshyatt (Author) commented Sep 22, 2021

Is this good to be merged?

@GiggleLiu (Collaborator) commented Sep 22, 2021

It looks good, but I want to make sure the GPU benchmark is correct (especially the many-index case); it differs from what I get running the same case on my own host. Could you please show me the script that generated the result?

After that, it should be good to merge.
