ROCm / HIP PATHs related ginkgo installation problems on Ubuntu 22.04 #1614

Open
klausbu opened this issue May 17, 2024 · 11 comments

Comments

@klausbu

klausbu commented May 17, 2024

I am trying to install Ginkgo on Ubuntu 22.04. I have an up-to-date default installation of AMD ROCm 6.1.1, which works fine. The Ginkgo installation process using CMake, as described on the webpage, finds neither AMD ROCm nor HIP nor any of the required CMake files, so I provided the paths:

export HIP_PATH=/opt/rocm

export hipblas_DIR=/opt/rocm/lib/cmake/hipblas

export CMAKE_PREFIX_PATH=/opt/rocm/lib/cmake/hip

export AMDDeviceLibs_DIR=/opt/rocm-6.1.1/lib/cmake/AMDDeviceLibs/

export amd_comgr_DIR=/opt/rocm-6.1.1/lib/cmake/amd_comgr/

The following one triggers an error ("hsa-runtime64_DIR=/opt/rocm-6.1.1/lib/cmake/hsa-runtime64/": not a valid identifier):
export hsa-runtime64_DIR=/opt/rocm-6.1.1/lib/cmake/hsa-runtime64/
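
A minimal sketch of why this export fails, assuming a POSIX-style shell: variable names may contain only letters, digits, and underscores, so the hyphen in hsa-runtime64_DIR is rejected. The helper name is_valid_shell_name is hypothetical and only illustrates the rule.

```shell
# Sketch: shell variable names may only contain letters, digits and
# underscores (and must not start with a digit), so a name with a hyphen
# cannot be exported.

# is_valid_shell_name is a hypothetical helper, not part of the shell or CMake.
is_valid_shell_name() {
  case "$1" in
    ''|[!A-Za-z_]*|*[!A-Za-z0-9_]*) return 1 ;;  # empty, bad first char, or bad char anywhere
    *) return 0 ;;
  esac
}

for name in hipblas_DIR amd_comgr_DIR hsa-runtime64_DIR; do
  if is_valid_shell_name "$name"; then
    echo "$name: can be exported"
  else
    echo "$name: not a valid identifier, pass it as -D$name=<dir> instead"
  fi
done
```

CMake cache variable names are not restricted in this way, so the same directory can be given on the command line, for example cmake -Dhsa-runtime64_DIR=/opt/rocm-6.1.1/lib/cmake/hsa-runtime64/ ..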

I used the following cmake command:

cmake -G "Unix Makefiles" -DGINKGO_BUILD_HIP=ON -DCMAKE_HIP_ARCHITECTURES="gfx1031" .. && cmake --build .

I assume the install package is not up to date regarding the ROCm / HIP install paths?

@upsj
Member

upsj commented May 17, 2024

First, unfortunately your build will likely fail, since we don't support gfx10xx (yet), see #1429.
Second, these environment variables should no longer be necessary since #1334, as long as amdclang++ or hipcc can be found. Which commit are you looking at? Also what CMake version are you using?

@klausbu
Author

klausbu commented May 17, 2024

The CMake version is 3.22.1.

How can I specify the hipcc location during the installation process?

@upsj
Member

upsj commented May 18, 2024

I think the easiest solution should be pointing HIPCXX at amdclang++, if it's not already in the PATH. Though it might also help just to try out a newer version of CMake, since HIP 6.1.1 came out quite some time after CMake 3.22.1, and there might be some changes to the CMake setup that are not reflected properly.
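
A minimal sketch of setting HIPCXX, assuming a default ROCm 6.1.1 install: the candidate directories below are guesses, not confirmed paths, and find_hip_compiler is a hypothetical helper. It exports HIPCXX only if an executable amdclang++ is actually found.

```shell
# Sketch: pick the first amdclang++ found and expose it to CMake through HIPCXX.
# The candidate directories are assumptions about a default ROCm 6.1.1 install.

# find_hip_compiler is a hypothetical helper: it prints the first executable
# amdclang++ found in the directories given as arguments.
find_hip_compiler() {
  for dir in "$@"; do
    if [ -x "$dir/amdclang++" ]; then
      echo "$dir/amdclang++"
      return 0
    fi
  done
  return 1
}

if hipcxx=$(find_hip_compiler /opt/rocm-6.1.1/bin /opt/rocm-6.1.1/llvm/bin /opt/rocm/bin); then
  export HIPCXX="$hipcxx"
  echo "HIPCXX=$HIPCXX"
else
  echo "amdclang++ not found in the assumed locations" >&2
fi
```

HIPCXX has to be set in the same shell session before cmake is run in a fresh build directory, since CMake caches the compiler it detected on the first configure.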

@klausbu
Author

klausbu commented May 18, 2024

I am going in circles: HIPCXX now points to amdclang++, but the CMake files are still not detected:


-- The HIP compiler identification is Clang 17.0.0
-- Detecting HIP compiler ABI info
-- Detecting HIP compiler ABI info - done
-- Check for working HIP compiler: /opt/rocm-6.1.1/llvm/bin/clang++ - skipped
-- Detecting HIP compile features
-- Detecting HIP compile features - done
CMake Error at cmake/hip.cmake:120 (find_package):
  By not providing "Findhipblas.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "hipblas", but
  CMake did not find one.

  Could not find a package configuration file provided by "hipblas" with any
  of the following names:

    hipblasConfig.cmake
    hipblas-config.cmake

  Add the installation prefix of "hipblas" to CMAKE_PREFIX_PATH or set
  "hipblas_DIR" to a directory containing one of the above files.  If
  "hipblas" provides a separate development package or SDK, be sure it has
  been installed.
Call Stack (most recent call first):
  CMakeLists.txt:78 (include)

-- Configuring incomplete, errors occurred!
See also "/home/klaus/Programme/ginkgo/build/CMakeFiles/CMakeOutput.log".
See also "/home/klaus/Programme/ginkgo/build/CMakeFiles/CMakeError.log".

@upsj
Member

upsj commented May 19, 2024

Can you try setting -DCMAKE_PREFIX_PATH=/opt/rocm-6.1.1 as outlined in https://rocm.docs.amd.com/en/latest/conceptual/cmake-packages.html? By choosing to install ROCm under /opt rather than in a standard location like /usr in their packages, AMD made it slightly harder for things to be found by default. Module systems on HPC clusters usually take care of that for you.
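
Before reconfiguring, it can help to check that the prefix really contains the package configuration file CMake asked for. A minimal sketch, assuming the lib/cmake/<package> layout seen in the paths earlier in this thread; has_cmake_config is a hypothetical helper, and the two file names are the ones listed in the CMake error message.

```shell
# Sketch: look for the package configuration file names that CMake lists in
# its error message. The lib/cmake/<pkg> layout is an assumption based on the
# paths used earlier in this thread.

# has_cmake_config is a hypothetical helper: has_cmake_config <prefix> <pkg>
has_cmake_config() {
  prefix=$1
  pkg=$2
  for file in "${pkg}Config.cmake" "${pkg}-config.cmake"; do
    if [ -f "$prefix/lib/cmake/$pkg/$file" ]; then
      echo "$prefix/lib/cmake/$pkg/$file"
      return 0
    fi
  done
  return 1
}

has_cmake_config /opt/rocm-6.1.1 hipblas \
  || echo "hipblas config not found under /opt/rocm-6.1.1; is the hipblas development package installed?"
```

If the file is found, passing its prefix through CMAKE_PREFIX_PATH should let find_package locate it. If it is missing, changing the prefix will not help until the package that provides it is installed.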

@klausbu
Author

klausbu commented May 19, 2024

The following triggered the compilation on Ubuntu 22.04 with ROCm 6.1.1:

cmake -G "Unix Makefiles" -D GINKGO_BUILD_HIP=ON -D CMAKE_HIP_ARCHITECTURES="gfx1031" -D CMAKE_PREFIX_PATH=/opt/rocm-6.1.1 .. && cmake --build .

Now I need to look into the architecture specific warp size related compilation error that's discussed in the other thread.

@upsj
Member

upsj commented May 19, 2024

Dealing with warp size 32 requires some refactoring on our side: we assume the warp size is known on the host at compile time, and this assumption is violated in a mixed gfx10xx/gfx9xx build. So it is unlikely that you will be able to fix it easily. In the short run, we can only support server-grade GPUs with warp size 64.

@klausbu
Author

klausbu commented May 19, 2024

I don't run a mixed build, only -D CMAKE_HIP_ARCHITECTURES="gfx1031". The purpose is to test https://github.com/hpsim/OGL

@upsj
Member

upsj commented May 19, 2024

IIRC the ROCm clang compiler always claims the warp size is 64 from the host side regardless of the device architecture, so that will not make a difference. You could try patching the warpSize to be 32 inside config.hip.hpp and see if it works?

@upsj
Member

upsj commented May 19, 2024

I've been planning on setting up a CI system with a consumer GPU for a while now, so I guess this is a good time to get started ;)

@klausbu
Author

klausbu commented May 19, 2024

That's a very good idea. I have been looking into GPU compute for CFD for years but have not been able to confirm even one of the many speedup claims, so I am not going to invest in server-grade hardware before I have managed to set up an effective implementation of some kind.
