AMO tests fail with Ubuntu's OpenMPI 4.1.2 #1

Open
mrogowski opened this issue Jan 20, 2023 · 3 comments

Comments

@mrogowski
Contributor

mrogowski commented Jan 20, 2023

The I, L, Q, u4, and u8 data types cause a failure in testFetchBitwise when using shmem4py with Ubuntu's OpenMPI (4.1.2):

File "/repo/test/test_amo.py", line 189, in testFetchBitwise
self.assertEqual(val, 2**i-1)

See: https://github.com/mpi4py/shmem4py/actions/runs/3967285365/jobs/6799064824

I cannot reproduce the issue with OpenMPI 4.1.2 and UCX 1.12.1 built from source.
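
For context, the failing assertion expects the value fetched at step i to equal 2**i-1. Below is a minimal pure-Python sketch of one access pattern that produces exactly that sequence, assuming the test sets bit i with a fetch-OR on each iteration (the actual test body is not shown in this thread). The names fetch_or and run_fetch_bitwise are hypothetical, and no shmem4py or OpenSHMEM calls are involved; this only models the expected arithmetic.

```python
def fetch_or(cell, mask):
    """Model of a fetch-OR: return the old value and store old | mask.

    `cell` is a one-element list standing in for the remote variable.
    """
    old = cell[0]
    cell[0] = old | mask
    return old


def run_fetch_bitwise(nbits):
    """Set bits 0..nbits-1 one at a time, recording each fetched value."""
    cell = [0]
    fetched = []
    for i in range(nbits):
        # Assumption: the test ORs in a single bit per iteration.
        fetched.append(fetch_or(cell, 1 << i))
    return fetched, cell[0]


fetched, final = run_fetch_bitwise(32)
for i, val in enumerate(fetched):
    # Same check as the failing assertion: val == 2**i - 1
    assert val == 2**i - 1, (i, val)
```

If a fetch returns anything other than the previously accumulated bits, the check fails at the first affected i, which matches the shape of the reported failure.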

@dalcinl
Member

dalcinl commented Jan 22, 2023

Maybe the issue was introduced by my changes in 0f21f5c?

@mrogowski
Contributor Author

No, the same tests fail before that change.

@mrogowski
Contributor Author

This issue is reproducible in C and seems to depend on how GCC optimizes UCX:

  • Fedora 35 rpm of OpenMPI 4.1.1 + UCX 1.11.2 works (GCC 11)
  • Fedora 36 rpm of OpenMPI 4.1.4 + UCX 1.12.0 fails (GCC 12)
  • Ubuntu 22.04 deb of OpenMPI 4.1.2 + UCX 1.12.1 fails (GCC 11)
  • Ubuntu 23.04 deb of OpenMPI 4.1.4 + UCX 1.13.1 fails (GCC 12)
  • Fedora 35 own build of OpenMPI 4.1.4 + UCX 1.13.1 works (GCC 11)
  • Fedora 36 own build of OpenMPI 4.1.4 + release build of UCX 1.13.1 fails (GCC 12)
  • Fedora 36 own build of OpenMPI 4.1.4 + release build of UCX (master/openucx/ucx@52a9394) fails (GCC 12)
  • Fedora 36 own build of OpenMPI 4.1.4 + devel build of UCX (master/openucx/ucx@52a9394) works (GCC 12)
  • Ubuntu 22.04 own build of OpenMPI 4.1.4 + UCX 1.13.1 works (GCC 11)
  • Ubuntu 23.04 own build of OpenMPI 4.1.4 + UCX 1.13.1 fails (GCC 12)
  • Ubuntu 23.04 own build of OpenMPI 4.1.4 + release build of UCX (master/openucx/ucx@52a9394) fails (GCC 12)
  • Ubuntu 23.04 own build of OpenMPI 4.1.4 + devel build of UCX (master/openucx/ucx@52a9394) works (GCC 12)
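
To make the pattern in the list above easier to see, here is a small sketch that tallies the twelve results by GCC major version and by whether the UCX build is explicitly labeled "devel". The tuples are transcribed from the list; treating every entry not labeled "devel" as a non-devel build is an assumption, and the names RESULTS and tally are only illustrative.

```python
from collections import Counter

# (environment, GCC major version, UCX labeled "devel"?, test passes?)
# Assumption: entries not labeled "devel" in the list are non-devel builds.
RESULTS = [
    ("Fedora 35 rpm, OpenMPI 4.1.1 + UCX 1.11.2", 11, False, True),
    ("Fedora 36 rpm, OpenMPI 4.1.4 + UCX 1.12.0", 12, False, False),
    ("Ubuntu 22.04 deb, OpenMPI 4.1.2 + UCX 1.12.1", 11, False, False),
    ("Ubuntu 23.04 deb, OpenMPI 4.1.4 + UCX 1.13.1", 12, False, False),
    ("Fedora 35 own build, UCX 1.13.1", 11, False, True),
    ("Fedora 36 own build, release UCX 1.13.1", 12, False, False),
    ("Fedora 36 own build, release UCX master", 12, False, False),
    ("Fedora 36 own build, devel UCX master", 12, True, True),
    ("Ubuntu 22.04 own build, UCX 1.13.1", 11, False, True),
    ("Ubuntu 23.04 own build, UCX 1.13.1", 12, False, False),
    ("Ubuntu 23.04 own build, release UCX master", 12, False, False),
    ("Ubuntu 23.04 own build, devel UCX master", 12, True, True),
]


def tally(results):
    """Count results keyed by (gcc, devel, passes)."""
    counts = Counter()
    for _env, gcc, devel, ok in results:
        counts[(gcc, devel, ok)] += 1
    return counts


counts = tally(RESULTS)
for (gcc, devel, ok), n in sorted(counts.items()):
    kind = "devel" if devel else "non-devel"
    outcome = "works" if ok else "fails"
    print(f"GCC {gcc}, {kind} UCX: {outcome} x{n}")
```

In this tally every GCC 12 non-devel build fails and both devel builds work, while GCC 11 is mixed: three configurations work and only the Ubuntu 22.04 deb fails.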

I'm using the master branch of UCX because, as of the UCX 1.13.1 release, the devel build fails to compile with GCC 12 (openucx/ucx#8186, openucx/ucx#8617).

I will use devel builds in CI/CD for now.

Update: The issue seems to be caused by the --disable-logging flag used in the release build of UCX.
