Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add orthogonalization options to GMRES #1646

Merged
merged 6 commits into from
Aug 9, 2024
Merged

Add orthogonalization options to GMRES #1646

merged 6 commits into from
Aug 9, 2024

Conversation

nbeams
Copy link
Collaborator

@nbeams nbeams commented Jul 11, 2024

Add orthogonalization choices for GMRES (CGS and CGS2 in addition to the MGS available before).

This adds a new unified kernel for performing the multidot operation for CGS/CGS2. It also changes the data layout of the Hessenberg matrix so that all the data that needs to be communicated after the multidot is stored contiguously, and can be handled with one communication.

TODO:

  • Benchmark check to make sure MGS performance is not hurt by the changes to the Hessenberg data layout.

@nbeams nbeams added is:enhancement An improvement of an existing feature. type:solver This is related to the solvers mod:all This touches all Ginkgo modules. labels Jul 11, 2024
@ginkgo-bot ginkgo-bot added reg:testing This is related to testing. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:reference This is related to the reference module. mod:hip This is related to the HIP module. labels Jul 11, 2024
@nbeams
Copy link
Collaborator Author

nbeams commented Jul 15, 2024

Some initial benchmarks comparing old/new Hessenberg use in MGS for the affected kernels (hessenberg_qr and solve_krylov). These were run on Guyot, so A100 GPUs and 2 AMD EPYC 7742 CPUs. The size parameter is the size given to generate the stencil matrix, using the 7 pt stencil, with weak scaling for increased MPI ranks. The points with the same size have various MPI ranks (1, 2, 4, or 8) and nrhs (1, 10, or 100).

CUDA - Hessenberg QR
CUDA - Solve Krylov
OpenMP - Hessenberg QR
OpenMP - Solve Krylov

To me, this seems to indicate that any difference between the old and new versions is basically just noise (especially as the absolute times for solve_krylov are so tiny). But I can run more tests if this isn't considered conclusive enough or looks odd to anybody.

Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I've left mostly minor comments, except for the shape of hessenberg_iter.

core/solver/gmres_kernels.hpp Show resolved Hide resolved
core/solver/gmres.cpp Outdated Show resolved Hide resolved
core/solver/gmres.cpp Outdated Show resolved Hide resolved
core/solver/gmres.cpp Outdated Show resolved Hide resolved
core/solver/gmres_kernels.hpp Show resolved Hide resolved
core/solver/gmres.cpp Outdated Show resolved Hide resolved
common/unified/solver/common_gmres_kernels.cpp Outdated Show resolved Hide resolved
@pratikvn pratikvn self-requested a review July 22, 2024 12:32
Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Some minor nits. Thanks for adding the performance comparisons.

core/solver/gmres.cpp Show resolved Hide resolved
core/solver/gmres.cpp Outdated Show resolved Hide resolved
core/solver/gmres.cpp Show resolved Hide resolved
core/solver/gmres.cpp Show resolved Hide resolved
core/solver/gmres.cpp Show resolved Hide resolved
core/solver/gmres.cpp Show resolved Hide resolved
include/ginkgo/core/solver/gmres.hpp Outdated Show resolved Hide resolved
test/solver/gmres_kernels.cpp Outdated Show resolved Hide resolved
test/solver/gmres_kernels.cpp Show resolved Hide resolved
test/solver/gmres_kernels.cpp Outdated Show resolved Hide resolved
@nbeams nbeams added the 1:ST:ready-to-merge This PR is ready to merge. label Aug 8, 2024
@nbeams nbeams merged commit 2268382 into develop Aug 9, 2024
10 of 14 checks passed
@nbeams nbeams deleted the gmres_orthog_choice branch August 9, 2024 04:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:ready-to-merge This PR is ready to merge. is:enhancement An improvement of an existing feature. mod:all This touches all Ginkgo modules. mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. mod:reference This is related to the reference module. reg:testing This is related to testing. type:solver This is related to the solvers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants