mul!() of LazyProduct result in high alloc in integration #333

Open
Lightup1 opened this issue Apr 30, 2022 · 10 comments

@Lightup1

function mul!(result::Ket{B1},a::LazyProduct{B1,B2},b::Ket{B2},alpha,beta) where {B1,B2}
    tmp1 = Ket(a.operators[end].basis_l)  # allocates a fresh Ket on every call
    mul!(tmp1,a.operators[end],b,a.factor,0)
    for i=length(a.operators)-1:-1:2
        tmp2 = Ket(a.operators[i].basis_l)  # another allocation per intermediate operator
        mul!(tmp2,a.operators[i],tmp1)
        tmp1 = tmp2
    end
    mul!(result,a.operators[1],tmp1,alpha,beta)
    return result
end

tmp1 and tmp2 are allocated freshly on every call, which leads to a high allocation count when the integration tspan is large, i.e. when mul! is called many times.
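
For reference, a minimal, self-contained way to see these per-call allocations; the basis, operators, and variable names below are illustrative, not taken from this issue:

using QuantumOptics, BenchmarkTools

b = FockBasis(10)
A = randoperator(b)          # dense operators, just for illustration
B = randoperator(b)
op = LazyProduct(A, B)
ψ = randstate(b)
out = copy(ψ)

# With the mul! above, every call builds fresh intermediate Kets, so the
# allocs estimate per call is nonzero and the total grows with the number
# of mul! calls performed during integration.
@benchmark QuantumOptics.mul!($out, $op, $ψ, 1.0, 0.0)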

@Lightup1
Author

I speculate that adding Ket, Bra, and combined-operator fields may remove the need for the temporaries, but it might make LazyProduct a little bit fat.

@Lightup1
Author

As a test, I added KTL and BTR fields:

mutable struct LazyProduct{BL,BR,F,T,KTL,BTR} <: AbstractOperator{BL,BR}
    basis_l::BL
    basis_r::BR
    factor::F
    operators::T
    ket_l::KTL
    bra_r::BTR
    function LazyProduct{BL,BR,F,T,KTL,BTR}(operators::T, factor::F=1) where {BL,BR,F,T,KTL,BTR}
        for i = 2:length(operators)
            check_multiplicable(operators[i-1], operators[i])
        end
        ket_l=[Ket(operator.basis_l) for operator in operators]
        bra_r=[Bra(operator.basis_r) for operator in operators]
        new(operators[1].basis_l, operators[end].basis_r, factor, operators,ket_l,bra_r)
    end
end
function LazyProduct(operators::T, factor::F=1) where {T,F}
    BL = typeof(operators[1].basis_l)
    BR = typeof(operators[end].basis_r)
    KTL = typeof([Ket(operator.basis_l) for operator in operators])
    BTR = typeof([Bra(operator.basis_r) for operator in operators])
    LazyProduct{BL,BR,F,T,KTL,BTR}(operators, factor)
end
function mul!(result::Ket{B1},a::LazyProduct{B1,B2},b::Ket{B2},alpha,beta) where {B1,B2}
    mul!(a.ket_l[end],a.operators[end],b,a.factor,0)
    for i=length(a.operators)-1:-1:2
        mul!(a.ket_l[i],a.operators[i],a.ket_l[i+1])
        # a.ket_l[i+1]=Ket(a.operators[i].basis_l)
    end
    mul!(result,a.operators[1],a.ket_l[2],alpha,beta)
    # a.ket_l[2]=Ket(a.operators[2].basis_l)
    return result
end

@benchmark QuantumOptics.mul!(dpsi,H_kin,Ψ0)

BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.740 μs …  4.260 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.780 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.795 μs ± 91.743 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▁▂ ▆█▆ ▃▂ ▂                                               ▂
  ▇██▁███▁██▁██▅▁▅▅▅▁▃▄▁▃▅▄▁▅▄▃▁▃▅▁▆▇█▁█▇▁▇▆▆▁▆▅▅▁▅▄▁▄▄▅▁▄▄▄ █
  1.74 μs      Histogram: log(frequency) by time     2.16 μs <

 Memory estimate: 0 bytes, allocs estimate: 0.

Now the allocation count no longer depends on the integration time:
@benchmark tout, Ψt = timeevolution.schroedinger($T, $Ψ0, $H)

BenchmarkTools.Trial: 272 samples with 1 evaluation.
 Range (min … max):  17.615 ms … 37.316 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     18.155 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.400 ms ±  1.428 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▁▃▅▃▅▇██▅▂ 
  ▆██████████▆▆▆▁▇▇▁▆▆▄▁▁▇█▄▁▄▁▆▁▁▁▁▁▁▁▁▁▆▄▄▁▁▁▁▁▁▁▁▁▄▁▁▁▁▄▄▄ ▆
  17.6 ms      Histogram: log(frequency) by time      21.9 ms <

 Memory estimate: 38.00 KiB, allocs estimate: 70.

@david-pl
Member

Yes, that mul! can be optimized. Caching the Ket and Bra like you do is a good idea in principle.

One issue here is that the data type of the Ket you multiply with is not preserved. What I mean is that, e.g., Ket(operator.basis_l) in your ket_l does not specify the data type of the Ket's underlying data, so Vector{ComplexF64} is assumed. This means, however, that if you try to work with, e.g., Vector{ComplexF32}, the data type will not be preserved when multiplying such a Ket with a LazyProduct.

This issue is actually a bit tricky, as it means you need to take the type of result in the mul! method into account for the cache of LazyProduct. So you cannot fully know this when constructing the LazyProduct. You could fill the cache in mul! directly, since there you'd have all the info. Then you just need a fast way of checking whether the cache has already been filled for the data types involved in the current mul! operation, so you don't reconstruct it if it's already valid.
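
A minimal sketch of that lazy-fill idea; the ket_cache::Vector{Any} field on LazyProduct and the eltype check are illustrative assumptions, not the package's API:

# Sketch only: rebuild the intermediate-Ket cache inside mul! whenever it is
# empty or was built for a different element type than `result`, so that e.g.
# ComplexF32 data is preserved. Assumes at least two operators in the product,
# as in the snippets above.
function mul!(result::Ket{B1}, a::LazyProduct{B1,B2}, b::Ket{B2}, alpha, beta) where {B1,B2}
    T = eltype(result.data)
    if isempty(a.ket_cache) || eltype(first(a.ket_cache).data) != T
        empty!(a.ket_cache)
        for op in a.operators
            push!(a.ket_cache, Ket(op.basis_l, zeros(T, length(op.basis_l))))
        end
    end
    mul!(a.ket_cache[end], a.operators[end], b, a.factor, 0)
    for i = length(a.operators)-1:-1:2
        mul!(a.ket_cache[i], a.operators[i], a.ket_cache[i+1])
    end
    mul!(result, a.operators[1], a.ket_cache[2], alpha, beta)
    return result
end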

That said, the current version of the code (on master) doesn't respect the data type either. So what you're doing is definitely an improvement, and we could leave the data type issue for another time. Would you like to open a PR with the changes you suggest above?

@Lightup1
Author

OK, I'll open a PR. But I'm new to programming, so it may take me some time to figure out how GitHub PRs work.

@Lightup1
Author

> git push -u origin EffientLazyProduct_mul
remote: Permission to qojulia/QuantumOptics.jl.git denied to Lightup1.
fatal: unable to access 'https://github.com/qojulia/QuantumOptics.jl.git/': The requested URL returned error: 403

After setting things up for a whole night, I found out that I have no permission. 😂😂

@david-pl
Member

@Lightup1 that's not how you submit a PR... you need to fork the repository. See for example this guide: https://code.tutsplus.com/tutorials/how-to-collaborate-on-github--net-34267

@Lightup1
Author

Oh, thank you. Got it!😅

@amilsted
Collaborator

amilsted commented May 3, 2022

An alternative is to use caching to handle the tmp vectors, like in this PR. We could use the same cache in LazyProduct and LazyTensor (which is also a kind of lazy product operator).

I thought about using the approach described here, but it does seem a bit much to put all possible intermediate types (those for kets, bras, and operators!) in the operator struct.

A third way is to put a cache in the operator struct. That way only needed arrays get created and the memory is freed when the operator is garbage-collected.

@Lightup1
Author

Lightup1 commented May 5, 2022

Thanks for your suggestions.

A third way is to put a cache in the operator struct. That way only needed arrays get created and the memory is freed when the operator is garbage-collected.

I wonder how we would define the type of the cache, since it could hold Kets or Operators with different bases.

@amilsted
Collaborator

amilsted commented May 5, 2022

Yes, the typing issue is a little tricky, also because element types can vary, as @david-pl points out above. It may not be too terrible to use abstract types, such as an appropriate union, or even Any, as is used in the global cache of the PR I linked. There doesn't seem to be a significant performance hit in cases I have tested.
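
For illustration, a loosely typed per-operator cache along those lines could look like the sketch below; CachedLazyProduct, its cache field, and cached_ket are hypothetical names, not part of the package:

using QuantumOptics

# Sketch: an untyped cache stored in the operator itself, keyed by
# (operator index, element type). The Any-typed values cost one dynamic
# dispatch per lookup, which is usually negligible next to the
# matrix-vector products, and the cached arrays are freed together with
# the operator when it is garbage-collected.
struct CachedLazyProduct{BL,BR,F,T} <: AbstractOperator{BL,BR}
    basis_l::BL
    basis_r::BR
    factor::F
    operators::T
    cache::Dict{Tuple{Int,DataType},Any}   # (operator index, eltype) => temporary Ket
end

function cached_ket(a::CachedLazyProduct, i::Int, ::Type{T}) where {T}
    k = get!(a.cache, (i, T)) do
        Ket(a.operators[i].basis_l, zeros(T, length(a.operators[i].basis_l)))
    end
    return k::Ket
end

A constructor would simply pass an empty Dict{Tuple{Int,DataType},Any}() for the cache field, so nothing is allocated until the first mul! call with a given element type.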
