ucx + cupy-core unexpectedly pulls in cuda-cudart #172
Comments
I can reproduce with:
I think something is wrong in `ucx-split-feedstock/recipe/meta.yaml` (line 33 at c7e9896).
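To illustrate the kind of problem being pointed at (a hypothetical fragment, not the actual `meta.yaml` contents at that commit; the version bounds are made up), an unconditional `cuda-cudart` entry in the run requirements is enough to drag cudart into every solve:

```shell
# Hypothetical recipe fragment -- illustrative only; see the real
# meta.yaml at the commit linked above for the actual contents:
cat <<'EOF' | tee fragment.yaml
requirements:
  run:
    - cuda-version >=11.2,<13   # a constraint only; pulls in no libraries
    - cuda-cudart               # an unconditional dependency pulls cudart
                                # into every environment that solves ucx
EOF
```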
Thanks for confirming! I notice there are two builds (for linux-64):
In case it's helpful, it looks like there was a change in behavior in #117. Prior to that PR, this seems to work (does not pull in `cuda-cudart`); after it, things turn bad (`cuda-cudart` gets pulled in).
Yes, because …

@conda-forge-admin, please rerender
Hi! This is the friendly automated conda-forge-webservice. I just wanted to let you know that I started rerendering the recipe in #173.
Forgot to ask: @pentschev, any chance you know if UCX by default links to cudart statically or dynamically?
nvm, answering myself based on the CI log:

@dmargala this is expected because UCX dynamically links to CUDART. This is why from the CUDA 11 pipeline we see
and the CUDA 12 pipeline:
In CUDA 11, cudart comes from the `cudatoolkit` package; in CUDA 12 it is split out into `cuda-cudart`.
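The dynamic-vs-static question above can also be checked empirically: running `ldd` on the UCX CUDA transport library shows whether `libcudart` is resolved at load time. A sketch (the library path is an assumption; substitute your actual env prefix), plus a stand-in check against `/bin/sh` that runs on any glibc-based Linux box:

```shell
# Real check (the path is an assumption -- adjust to your conda env prefix):
#   ldd "$CONDA_PREFIX/lib/ucx/libuct_cuda.so" | grep cudart
# If cudart were linked statically, no libcudart line would appear.
#
# Stand-in demonstration of reading dynamic dependencies, using a binary
# present on any glibc-based Linux system:
ldd /bin/sh | grep -c 'libc\.'
```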
Hmm, it seems like the behavior differs in an unexpected way, since the package for CUDA 11 does not add the cudart dependency. In any case, the thing I really want is a conda environment with dask and cupy-core. Testing with:
I can see I am getting "cpu" builds of `ucx`:
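One quick way to confirm which variant was solved is to filter the build strings in `conda list` output for the `_cpu`/`_cuda` tag. A sketch over a hypothetical excerpt (the package versions and hashes below are made up, not real conda-forge output):

```shell
# Hypothetical `conda list` excerpt -- versions/hashes are illustrative:
cat <<'EOF' > pkgs.txt
ucx        1.17.0   h2f1b540_cpu_0    conda-forge
libarrow   17.0.0   h3b2b4a1_cpu_1    conda-forge
EOF
# Count how many of the listed builds carry the "_cpu" tag:
grep -c '_cpu' pkgs.txt    # -> 2
```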
You're right, I missed that we didn't add it for CUDA 11. I see two ways out:
I am in favor of option 1 too, but I'd like to ask our UCX expert @pentschev and @conda-forge/ucx-split too.
IIRC UCX uses …

We softened …

Potentially we could soften the … as well.

Static linking in the past led to some unpleasant issues. Believe these got fixed with PR openucx/ucx#6038. That said, @pentschev would likely know the implications of static linking here better than I.
Just for the sake of completeness, UCX does not support static linkage to cudart.
This is correct.
Making …
We recently asked about this offline, and the answer we got from UCX devs is that this isn't supported, so we shouldn't be doing that.
I would certainly appreciate the flexibility to opt out. I agree many users would likely miss the opportunity to opt in. Could it make sense to provide a CPU-only variant of `ucx`?
FWIW, I don't think I can do much with the GPU capabilities in UCX on a system that does not support InfiniBand/ibverbs. I would also guess, due to the existence of "_cpu" and "_cuda" builds of …
For any CUDA memory transfers over UCX, including over TCP/shared memory, you still need UCX to be able to …
If by this @jakirkham meant to add …
Yep, exactly.
As to CUDA and CPU variants of the package, we could do this and have done this in the past. It is a bit of a mixed bag: at the end of the day, a user still needs to know to install the CUDA variant. Would lean towards keeping one package and softening the dependency. In terms of communication, maybe we can add an install-time message and a README in the repo.
We can additionally add a post-link message like what we did in Open MPI:
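For reference, the Open MPI-style mechanism mentioned here is a `post-link.sh` shipped in the package: conda runs it after linking, and anything the script appends to `$PREFIX/.messages.txt` is printed at the end of the transaction. A minimal sketch (the message wording is invented for illustration):

```shell
# Minimal post-link.sh sketch. conda runs this after linking the package
# and prints anything appended to $PREFIX/.messages.txt afterwards.
MESSAGES_FILE="${PREFIX:-.}/.messages.txt"
cat <<'EOF' >> "$MESSAGES_FILE"
NOTE: this ucx build dynamically links the CUDA runtime, so cuda-cudart is
installed alongside it. Use the CPU-only build variant to avoid this.
EOF
cat "$MESSAGES_FILE"
```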
Interesting, I missed the link on …
Yeah, a post-link script is also an option. So it's just a question of when the message is emitted and whether a script is used. Right, no strong feelings amongst these. There is an interest in cutting down the number of scripts in packages that run at install time, so …
Thanks all, I'll integrate all this into #173 and ping you for review tonight or tomorrow. |
(I wonder why there's no …)
More details in this issue: conda/conda#10118. Once a package is installed, the horse is out of the barn. Potentially there can be other cases where it may make sense to have a message after, though having a message at the end implies this is not a big deal and users can figure things out if needed.
Looks like packages are up, but they may still be mirroring to the CDN. Please let us know how things go.
Thanks all, I appreciate the help sorting this out. A quick look at the env with my simplified test looks good to me:
I also looked at the actual env that I wanted, which specified …
Yeah, something weird is happening there. Raised upstream issue: regro/cf-scripts#2519
Solution to issue cannot be found in the documentation.
Issue
I'm trying to create a conda environment with dask and cupy-core. I'm relying on a site installation to provide the CUDA libraries needed at runtime, so I want to avoid having cuda-cudart installed in the conda environment. cupy-core has a dependency on cuda-version, which seems fine, but when I add dask to the environment it ends up pulling in cuda-cudart from ucx via libarrow-flight/pyarrow/dask.
I think the following is sufficient to reproduce my issue (although the thing I actually care about is dask + cupy-core):
Installing ucx on its own does not pull in cuda-cudart, so it seems like the presence of the cuda-version package is triggering this behavior. I tried to specify a CPU-only build of ucx with "ucx=*=*cpu", but that resolves to ucx 1.6.1, which is much older than the latest available version, so I suspect those builds are stale.
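One possible explanation for the old version (an educated guess, not verified against the feedstock): conda match-spec globs must match the whole build string, so `*cpu` only matches builds whose string ends in `cpu`; if newer builds carry the tag earlier in the string, only old builds would satisfy the spec, and `*cpu*` would be the safer pattern. The same fnmatch-style whole-string semantics can be demonstrated with shell `case` globbing (the build string below is a made-up example):

```shell
# fnmatch-style whole-string glob matching, as conda build-string specs use.
match() { case "$1" in $2) echo match ;; *) echo no-match ;; esac; }
match "cpu_h2f1b540_0" "*cpu"     # -> no-match (string does not END in cpu)
match "cpu_h2f1b540_0" "*cpu*"    # -> match
```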
(I'm not sure this is a ucx problem, I suppose it could be related to how cupy-core or libarrow conda dependencies are specified but I figured I'd start here.)
Installed packages
Environment info