fix nvidia-k8s-device-plugin linker flags #3924

Merged (2 commits) on Apr 27, 2024

bcressey
Contributor

Issue number:
N/A

Description of changes:
The flags used to build nvidia-k8s-device-plugin need to be adjusted for the new SDK.

Specifically, the default `GOLDFLAGS` now includes `-extldflags` set to the stock `LDFLAGS` value. Builds that override `CGO_LDFLAGS` must also override `GOLDFLAGS` so that both sets of linker flags match.
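
As a rough illustration of keeping the two settings aligned, the sketch below derives both from one variable. The `CUSTOM_LDFLAGS` name and the flag values are placeholders, not the values from this change, and it assumes `GOLDFLAGS` is later handed to `go build -ldflags`.

```shell
# Placeholder flag set (an assumption for illustration); the real values
# live in the package's build recipe.
CUSTOM_LDFLAGS="-Wl,-z,relro -Wl,--export-dynamic"

# cgo passes CGO_LDFLAGS to the C compiler driver when it links cgo objects.
export CGO_LDFLAGS="${CUSTOM_LDFLAGS}"

# With external linking, the Go linker forwards -extldflags to the same
# driver, so it is built from the same variable to keep the two in step.
export GOLDFLAGS="-linkmode=external -extldflags '${CUSTOM_LDFLAGS}'"

# Show the command that would consume these settings (not executed here).
echo "go build -ldflags=\"${GOLDFLAGS}\" ./..."
```

Deriving both from a single variable means a later change to the flag set cannot update one and miss the other.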

Testing done:
Before:

bash-5.1# systemctl status nvidia-k8s-device-plugin
● nvidia-k8s-device-plugin.service - Start NVIDIA kubernetes device plugin
     Loaded: loaded (/aarch64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-k8s-device-plugin.service; enabled; preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Sat 2024-04-27 15:40:42 UTC; 1s ago
    Process: 6080 ExecStart=/usr/bin/nvidia-device-plugin --device-list-strategy volume-mounts --device-id-strategy index --pass-device-specs=true (code=exited, status=127)
   Main PID: 6080 (code=exited, status=127)
        CPU: 2ms

bash-5.1# /usr/bin/nvidia-device-plugin --device-list-strategy volume-mounts --device-id-strategy index --pass-device-specs=true
/usr/bin/nvidia-device-plugin: symbol lookup error: /usr/bin/nvidia-device-plugin: undefined symbol: nvmlGpuInstanceGetComputeInstanceProfileInfoV
bash-5.1# systemctl status nvidia-k8s-device-plugin 

After:

bash-5.1# systemctl status nvidia-k8s-device-plugin
● nvidia-k8s-device-plugin.service - Start NVIDIA kubernetes device plugin
     Loaded: loaded (/aarch64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/nvidia-k8s-device-plugin.service; enabled; preset: enabled)
     Active: active (running) since Sat 2024-04-27 15:50:30 UTC; 3min 22s ago
   Main PID: 1390 (nvidia-device-p)
      Tasks: 9 (limit: 9232)
     Memory: 56.6M
        CPU: 5.956s
     CGroup: /system.slice/nvidia-k8s-device-plugin.service
             └─1390 /usr/bin/nvidia-device-plugin --device-list-strategy volume-mounts --device-id-strategy index --pass-device-specs=true

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

`--export-dynamic` is a separate option, independent of `-z`, so pass
it as its own `-Wl` flag.

For consistency, pass the same flags to the external linker.
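
To show why the grouping of `-Wl` flags is a matter of clarity, here is a small sketch of how the compiler driver splits each `-Wl,` argument at commas before handing the pieces to the linker. The `expand_wl` helper is hypothetical and only mimics that splitting; it is not part of any toolchain.

```shell
# Hypothetical helper: print the linker arguments that a list of -Wl,...
# flags would expand to, one per line.
expand_wl() {
  for flag in "$@"; do
    case "$flag" in
      -Wl,*)
        # Drop the "-Wl," prefix, then split the rest at commas.
        printf '%s\n' "${flag#-Wl,}" | tr ',' '\n'
        ;;
    esac
  done
}

# "-z relro" is one option with its keyword; "--export-dynamic" stands alone.
expand_wl -Wl,-z,relro -Wl,--export-dynamic
```

Writing `--export-dynamic` in its own `-Wl` flag makes it visible that it is not a keyword belonging to `-z`.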

Signed-off-by: Ben Cressey <bcressey@amazon.com>
@bcressey bcressey merged commit a4ddf81 into bottlerocket-os:develop Apr 27, 2024
33 checks passed
@bcressey bcressey deleted the device-plugin-fix branch April 27, 2024 17:20