
Make MPItrampoline the default #801

Open · simonbyrne opened this issue Nov 14, 2023 · 6 comments

simonbyrne commented Nov 14, 2023

I'll admit I was initially a bit reluctant to make MPItrampoline the default MPI implementation, but given that
a. it has so far proven fairly successful, and
b. the situation with binary dependencies without MPItrampoline is somewhat difficult (JuliaPackaging/Yggdrasil#6893),
I now think it might be worth making the switch. Any thoughts?

I think my main questions are:

  • are there any platforms/MPI libraries it doesn't work with?
  • what would happen "out of the box"? I like that people don't have to install their own MPI implementation for single node stuff.
  • could we make it easier to build MPIwrapper? e.g. install it in a https://github.com/JuliaPackaging/Scratch.jl space? (A sketch of what that might look like follows below.)
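
A minimal sketch of a Scratch.jl-based build, assuming a checked-out MPIwrapper source tree and a working `cmake` plus system MPI on `PATH`; the `build_mpiwrapper` helper and all paths are hypothetical, not an existing API:

```julia
using Scratch

# Hypothetical helper: build and install MPIwrapper into a persistent
# scratch space. (Scratch.jl recommends calling @get_scratch! from inside
# a package module so the space is tied to that package's UUID.)
function build_mpiwrapper(src::AbstractString)
    prefix   = @get_scratch!("mpiwrapper")
    builddir = joinpath(prefix, "build")
    run(`cmake -S $src -B $builddir -DCMAKE_INSTALL_PREFIX=$prefix`)
    run(`cmake --build $builddir`)
    run(`cmake --install $builddir`)
    # The wrapped library MPItrampoline would need to be pointed at:
    return joinpath(prefix, "lib", "libmpiwrapper.so")
end
```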

cc @eschnett @lcw

simonbyrne (Member, Author) commented

In particular, @JBlaschke @luraess I was wondering if you've tried MPItrampoline with Cray MPI?

sloede (Member) commented Nov 15, 2023

> what would happen "out of the box"? I like that people don't have to install their own MPI implementation for single node stuff.

I've been wondering this myself. Wouldn't making MPItrampoline the default mean that you'd still need another, MPIwrapper'd MPI implementation installed via a JLL package? And wouldn't this just shift the issues one house further down the road?

TBH, the ability to run MPI stuff out of the box locally is one of the greatest selling points for using Julia for HPC. Thus I strongly urge us not to follow any path that would force Julia users to manually install MPI on their machine if they want to use MPI-enabled packages.

What about your latest suggestion in JuliaPackaging/Yggdrasil#6893 (comment) as a potential fix instead?

luraess (Contributor) commented Nov 15, 2023

TBH I do not fully grasp the MPItrampoline thing. It is thus hard for me to give any pros and cons, and I have not yet tested it in any of my setups: locally, on servers, or on Cray (Piz Daint or LUMI).

I would be happy to help, though, and could test things on any of these machines. But ideally I would need a short explanation of MPItrampoline.

I agree with @sloede that one of the greatest features of MPI.jl is that it runs out of the box on local machines, which is ideal for prototyping and small runs, so we should absolutely not move away from that. It also seems fairly friendly about hooking into a system-installed MPI on servers (mostly OpenMPI), and on LUMI as well.

eschnett (Contributor) commented Nov 16, 2023

Let me try to avoid confusion:

  • Currently, MPI.jl uses MPICH_jll by default, and locally everything "just works". If we switch MPI.jl to using MPItrampoline_jll by default then everything continues to "just work". (MPItrampoline_jll comes with a built-in MPICH for this so people wouldn't notice the difference.)
  • MPIwrapper provides the unified ABI. MPIwrapper is only needed if you want to use an external MPI implementation. You need to build MPIwrapper (using cmake) against that MPI implementation, and then set an environment variable to point MPItrampoline_jll to that MPIwrapper. I think MPIPreferences could or should do that given a simple setup command. (A sketch of the pointing step follows this list.)
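
A minimal sketch of the pointing step as it stands today, assuming MPIwrapper has already been built and installed under `~/mpiwrapper` (a hypothetical path); `MPITRAMPOLINE_LIB` is the environment variable MPItrampoline reads to locate the wrapped library:

```julia
# Safest is to set this in the shell before launching Julia; setting it
# here should work as long as it happens before MPI.jl initializes.
ENV["MPITRAMPOLINE_LIB"] = expanduser("~/mpiwrapper/lib/libmpiwrapper.so")

using MPI
MPI.Init()
# Should now report the wrapped system MPI rather than the built-in MPICH.
println(MPI.Get_library_version())
MPI.Finalize()
```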

Reiterating:

  • If you want to use MPI.jl out of the box you can just switch over. I do that locally for testing.
  • If you want to use an external MPI then you need to wrap it and point MPI.jl to it. All other Yggdrasil packages will then automatically also use that MPI implementation, and things will continue to work.

The two remaining issues are:

  • Pointing MPI.jl to an external MPI implementation isn't a streamlined process yet. It could be streamlined; all it needs to do is remember the path to the external MPI implementation. (A sketch of what such a command might look like follows this list.)
  • When you use mpirun with an external MPI implementation to start Julia this happens outside Julia (by definition). It's your responsibility to use the correct one.
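
As a sketch of what such a streamlined command might look like — `use_mpitrampoline_lib` and the `"mpitrampoline_lib"` preference key are hypothetical, not an existing part of MPIPreferences:

```julia
using Preferences, MPIPreferences

# Hypothetical setup command: persist the path to a wrapped MPI library
# so MPItrampoline_jll could pick it up on the next Julia start.
function use_mpitrampoline_lib(libmpiwrapper::AbstractString)
    isfile(libmpiwrapper) || error("no such library: $libmpiwrapper")
    set_preferences!(MPIPreferences, "mpitrampoline_lib" => libmpiwrapper; force=true)
    @info "Restart Julia for the preference to take effect" libmpiwrapper
end
```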

There is at least one package that does not work with MPItrampoline. It is a Fortran 90 package that misread the Fortran MPI standard (there is a confusing paragraph there), and this is not easy to fix. I know of only one such package.

There are several other packages that misread the MPI standard and don't work with MPItrampoline. I've tried to upstream patches, and most projects have accepted them. All (I think?) packages in Yggdrasil support MPItrampoline.

I've tested things on a variety of HPC systems and have not encountered any significant problems. The only "difficulty" is building MPIwrapper, which is a rather simple standard-conforming package that uses MPI via cmake; you need to point cmake to the respective system MPI implementation, and that is surprisingly difficult. I wish MPI implementations provided a pkgconfig file or a cmake configuration file. The current way, which relies on calling mpirun to find out where mpicc is installed, then second-guessing how to make mpicc spit out the compiler flags, and crossing fingers that the right modules are loaded at run time, is atrocious and unreliable.
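
For reference, a sketch of building MPIwrapper against a specific system MPI by passing cmake's standard FindMPI hint variables explicitly instead of relying on auto-detection; the source and install paths are hypothetical, and the Cray-style compiler wrappers `cc`/`ftn` are assumed to be on `PATH`:

```julia
# Build MPIwrapper against a chosen system MPI (paths are hypothetical).
src    = expanduser("~/src/MPIwrapper")
prefix = expanduser("~/mpiwrapper")

# MPI_C_COMPILER / MPI_Fortran_COMPILER are FindMPI hints that bypass
# the fragile mpicc flag-guessing described above.
run(`cmake -S $src -B $src/build
      -DCMAKE_INSTALL_PREFIX=$prefix
      -DMPI_C_COMPILER=cc -DMPI_Fortran_COMPILER=ftn`)
run(`cmake --build $src/build`)
run(`cmake --install $src/build`)
```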

simonbyrne (Member, Author) commented

quoting @JBlaschke:

> Yay

simonbyrne added this to the 1.0 milestone Nov 28, 2023
simonbyrne (Member, Author) commented

Anyone want to open a PR?
