No performance improvement when tested a example of cb-gmres #1383

Open
tomy-lang opened this issue Aug 7, 2023 · 1 comment
tomy-lang commented Aug 7, 2023

I tested the Ginkgo project on x86_64 Linux with an Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz, using the cb-gmres example (located at ginkgo-root-dir/examples/cb-gmres/cb-gmres.cpp). The only change I made was setting reduction_factor{1e-6} to reduction_factor{1e-30}. Because the reduction factor is so small (1e-30), both the run without compression and the run with compression hit the full 1000 iterations.
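For context on what "with compression" means here: CB-GMRES aims to save time by storing the Krylov basis vectors in reduced precision, which cuts memory traffic while the arithmetic stays in double precision. The following is an illustrative NumPy sketch of that storage idea only, not Ginkgo's implementation; all names in it are made up for the example.

```python
# Illustrative sketch (NOT Ginkgo code): store an orthonormal basis in
# float32 ("compressed"), decompress to float64 before use, and check
# that the accuracy loss is tiny while storage is halved.
import numpy as np

rng = np.random.default_rng(0)
n, k = 1000, 50

# Orthonormal basis in double precision, as a GMRES iteration would build it.
basis, _ = np.linalg.qr(rng.standard_normal((n, k)))

basis_f32 = basis.astype(np.float32)   # compressed storage: half the bytes
coeffs = rng.standard_normal(k)

x_full = basis @ coeffs                         # full-precision combination
x_comp = basis_f32.astype(np.float64) @ coeffs  # decompress, then combine

rel_err = np.linalg.norm(x_full - x_comp) / np.linalg.norm(x_full)
print(f"storage halved, relative error {rel_err:.1e}")
```

The point is that the accuracy cost of float32 storage is around unit roundoff of float32 (~1e-7), while the memory traffic for reading the basis is halved; the speedup only shows up when that memory traffic is actually the bottleneck.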

What confuses me is why the solve time with compression is higher, as shown in the following results.

First: cd /ginkgo-root-dir/examples/cb-gmres/
My test script is:
rm cb-gmres
./build.sh /home/tangsx/ginkgo-install/
taskset -c 2 ./cb-gmres
where /home/tangsx/ginkgo-install/ is the path of the Ginkgo install directory.

The results are shown below:

1st test dataset
note: A.mtx is a 36x36 matrix (copied from /ginkgo-root-dir/matrices/test/ani1.mtx)
This is Ginkgo 1.6.0 (master)
running with core module 1.6.0 (master)
the reference module is 1.6.0 (master)
the OpenMP module is 1.6.0 (master)
the CUDA module is not compiled
the HIP module is not compiled
the DPCPP module is not compiled
Solve time without compression: 1.919823e-02 s
Solve time with compression: 1.982546e-02 s

Residual norm without compression:
%%MatrixMarket matrix array real general
1 1
8.766243e-16

Residual norm with compression:
%%MatrixMarket matrix array real general
1 1
7.792385e-16

2nd test dataset
note: A.mtx is a 3081x3081 matrix (copied from /ginkgo-root-dir/matrices/test/ani4.mtx)
This is Ginkgo 1.6.0 (master)
running with core module 1.6.0 (master)
the reference module is 1.6.0 (master)
the OpenMP module is 1.6.0 (master)
the CUDA module is not compiled
the HIP module is not compiled
the DPCPP module is not compiled
Solve time without compression: 6.577934e-01 s
Solve time with compression: 7.615828e-01 s

Residual norm without compression:
%%MatrixMarket matrix array real general
1 1
4.959973e-14

Residual norm with compression:
%%MatrixMarket matrix array real general
1 1
5.276808e-14

@upsj changed the title from "No performance improvement when tested a example of cb-gmers" to "No performance improvement when tested a example of cb-gmres" on Aug 10, 2023
@MarcelKoch (Member) commented:
Hi @tomy-lang. These matrices are very small, so I would not expect much difference at these sizes. The problem (matrix + vectors) might fit into the L1 or L2 cache, so memory accesses are relatively cheap. You could try larger matrices, say with >= 1 million rows. For that, our benchmarks are probably better suited; see our BENCHMARKING.md for details.

Side note: you can remove the ResidualNorm... criterion, and then it will always use the exact number of iterations.
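The side note above suggests dropping the residual-norm criterion so both runs perform exactly the same number of iterations, which makes the timings directly comparable. A hedged sketch of what that change could look like in cb-gmres.cpp, based on Ginkgo's stopping-criteria builder API (not checked against the exact example source; `cb_gmres` is assumed to be the example's alias for `gko::solver::CbGmres<ValueType>`):

```cpp
// Sketch only: keep just the iteration-count criterion. Without a
// gko::stop::ResidualNorm criterion, both the compressed and the
// uncompressed solves stop after exactly max_iters iterations.
auto solver_factory =
    cb_gmres::build()
        .with_criteria(
            gko::stop::Iteration::build().with_max_iters(1000u).on(exec)
            // ResidualNorm criterion removed, per the suggestion above
            )
        .on(exec);
```

With a fixed iteration count, any remaining timing gap reflects only the per-iteration cost of compressing and decompressing the basis, not a difference in convergence behavior.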
