Clean separation between functionality of core and device. #1240

Open
pratikvn opened this issue Dec 13, 2022 · 1 comment
Labels
is:idea Just a thought - if it's good, it could evolve into a proposal. is:proposal Maybe we should do something this way. mod:all This touches all Ginkgo modules.

Comments

@pratikvn
Member

This issue refers to the discussion about having sequential operations run on the host rather than in the device kernels (reference, openmp, cuda, etc.).

I propose a clean separation between memory allocations/de-allocations and any operations that perform data manipulation. IMO, Ginkgo's philosophy has been to have the core orchestrate and dispatch the kernels, and allocate and manage memory, but not perform any operations itself. This clean separation:

  1. Makes it easy to extend to multiple backends and to add new algorithms by changing only the kernels rather than modifying the core.
  2. Gives a simpler logger and profiler interface and output, with operations and allocations clearly marked and distinguishable.
  3. Makes it easier to extend to task-based approaches, due to the distinct nature of the device kernels.
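
To make the proposed split concrete, here is a minimal sketch of the idea, not Ginkgo's actual API. All names (`Backend`, `kernels::reference::axpy`, `kernels::parallel::axpy`, `core::scaled_sum`) are hypothetical, and the "parallel" backend is only a chunked loop standing in for a real OpenMP or CUDA kernel. The core allocates the result and dispatches; the kernels only work on memory they are handed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is an illustration, not Ginkgo code.
enum class Backend { reference, parallel };

namespace kernels {
namespace reference {
// Sequential kernel: works on caller-provided memory, never allocates.
void axpy(double alpha, const double* x, double* y, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        y[i] += alpha * x[i];
    }
}
}  // namespace reference

namespace parallel {
// Stand-in for a second backend: same result, processed in chunks.
// A real backend would hand each chunk to a thread or a GPU block.
void axpy(double alpha, const double* x, double* y, std::size_t n)
{
    constexpr std::size_t chunk = 4;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = (begin + chunk < n) ? begin + chunk : n;
        for (std::size_t i = begin; i < end; ++i) {
            y[i] += alpha * x[i];
        }
    }
}
}  // namespace parallel
}  // namespace kernels

namespace core {
// The core owns allocation and dispatch, but performs no arithmetic itself.
std::vector<double> scaled_sum(Backend backend, double alpha,
                               const std::vector<double>& x,
                               const std::vector<double>& y)
{
    assert(x.size() == y.size());
    std::vector<double> result(y);  // allocation (and copy) happens here
    switch (backend) {
    case Backend::reference:
        kernels::reference::axpy(alpha, x.data(), result.data(),
                                 result.size());
        break;
    case Backend::parallel:
        kernels::parallel::axpy(alpha, x.data(), result.data(),
                                result.size());
        break;
    }
    return result;
}
}  // namespace core
```

In this layout, adding a backend means adding one more `kernels::` namespace and one more `case`, while the allocation logic in `core::scaled_sum` stays untouched.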

I understand that this is a bit challenging: for many algorithms (SpGEMM, SpGEAM, and factorizations), both develop and the current release combine allocations and operations, because separating them is more difficult in those cases, especially where the algorithm is inherently sequential.

@pratikvn pratikvn added is:proposal Maybe we should do something this way. is:idea Just a thought - if it's good, it could evolve into a proposal. mod:all This touches all Ginkgo modules. labels Dec 13, 2022
@upsj
Member

upsj commented Dec 13, 2022

Thanks for kicking off this discussion. I think our separation into core and device is a strong suit of our project that we should make sure to keep up. There are even a few cases (e.g. CsrBuilder) where we break this separation, which in hindsight are not that well justified. I have some ideas on how to improve this, but they require a bit more work.

I wanted to address a few of the individual points though, since I believe they may be a bit inaccurate in places:

  1. I think forcing allocations to only happen on the core side would blow up the complexity of core to an unjustified degree. For example, look at matrix::Fbcsr::read or distributed::Matrix::read: there we have a large amount of intermediate data of a-priori unknown size that would require a lot of individual kernels, making the control flow really hard to follow or debug (especially since every kernel call goes through one level of macros and two levels of runtime dispatch).
  2. I can't really follow how this decision would impact profiling loggers. A profiler logger basically just annotates the execution timeline with events; it doesn't matter where they happen. Nsight Systems and rocprof already resolve kernel launches inside the ranges, so we don't need to differentiate how we report host vs. device kernels.
  3. It is not really clear to me how host operations should make task-based execution any easier or harder. The more complicated problems of task dependencies and required data movement cannot be answered based on the function signatures alone, so they will require significant changes on the host side (maybe even more significant than on the device side) as well.
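
The trade-off in the first point can be sketched on a toy problem (compressing the nonzeros of a dense vector, where the output size is unknown up front). This is an illustrative assumption, not how `matrix::Fbcsr::read` is implemented; `Compressed`, `count_nonzeros`, `fill_nonzeros`, `compress`, `compress_strict`, and `compress_relaxed` are all hypothetical names. With core-only allocation, one logical operation splits into a counting kernel, an allocation in the core, and a filling kernel; if the kernel may allocate, it stays a single call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is an illustration, not Ginkgo code.
struct Compressed {
    std::vector<std::size_t> idxs;  // positions of the nonzero entries
    std::vector<double> vals;       // the nonzero values themselves
};

namespace kernels {
// Pass 1 of the strict variant: only determines the output size.
std::size_t count_nonzeros(const std::vector<double>& dense)
{
    std::size_t count = 0;
    for (double v : dense) {
        if (v != 0.0) {
            ++count;
        }
    }
    return count;
}

// Pass 2 of the strict variant: writes into storage the core allocated.
void fill_nonzeros(const std::vector<double>& dense, std::size_t* idxs,
                   double* vals)
{
    std::size_t out = 0;
    for (std::size_t i = 0; i < dense.size(); ++i) {
        if (dense[i] != 0.0) {
            idxs[out] = i;
            vals[out] = dense[i];
            ++out;
        }
    }
}

// Relaxed variant: a single kernel that allocates its own output as it goes.
Compressed compress(const std::vector<double>& dense)
{
    Compressed result;
    for (std::size_t i = 0; i < dense.size(); ++i) {
        if (dense[i] != 0.0) {
            result.idxs.push_back(i);
            result.vals.push_back(dense[i]);
        }
    }
    return result;
}
}  // namespace kernels

namespace core {
// Allocation only in the core: two kernel calls with an allocation between.
Compressed compress_strict(const std::vector<double>& dense)
{
    const std::size_t nnz = kernels::count_nonzeros(dense);
    Compressed result;
    result.idxs.resize(nnz);
    result.vals.resize(nnz);
    kernels::fill_nonzeros(dense, result.idxs.data(), result.vals.data());
    return result;
}

// Allocation allowed inside the kernel: the core only dispatches.
Compressed compress_relaxed(const std::vector<double>& dense)
{
    return kernels::compress(dense);
}
}  // namespace core
```

Both variants produce the same result here. The difference is that every intermediate buffer of unknown size adds another count/allocate/fill round trip to the strict variant, each of which would go through the dispatch layers mentioned above.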

As an alternative description for the separation between core and kernels, I would propose the IMO much more precise distinction "does it only have a single implementation or does it have multiple implementations". There is no need to talk about abstract concepts of what is high-level and low-level if we already have a suitable technical justification for why we need this complex separation into different libraries, and where it is more of an obstacle.
