kmod-5.10-nvidia: fix tmpfilesd configurations #2020

Merged

arnaldo2792

Issue number:
N / A

Description of changes:

The directory that stores the NVIDIA kernel modules is now created using the
full path with `PREFIX`, instead of the symlinked directory `/lib/modules`.
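
For illustration, here is a minimal sketch of the difference between the two kinds of path. It uses a throwaway directory tree, not the real image layout, and the `prefix` directory name is a hypothetical stand-in for the actual `PREFIX` value:

```shell
# Build a throwaway tree that mimics the layout: a real directory under a
# prefix, plus a top-level `lib` symlink pointing into it.
# `prefix` is a hypothetical stand-in for the real PREFIX value.
root="$(readlink -f "$(mktemp -d)")"
mkdir -p "$root/prefix/lib/modules"
ln -s prefix/lib "$root/lib"

# Both paths name the same directory, but only the full path reaches it
# without traversing a symlink.
symlinked_path="$root/lib/modules"
full_path="$(readlink -f "$symlinked_path")"
printf 'symlinked: %s\nfull:      %s\n' "$symlinked_path" "$full_path"

# Clean up the temporary tree.
rm -rf "$root"
```

Writing the full path into the configuration means the result does not depend on how the tool treats symlinks along the way.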

Testing done:
Launched a host with GPUs running aws-k8s-1.21-nvidia and verified that the kernel modules were linked and loaded:

bash-5.0# systemctl status link-kernel-modules.service
● link-kernel-modules.service - Link additional kernel modules
     Loaded: loaded (/x86_64-bottlerocket-linux-gnu/sys-root/usr/lib/systemd/system/link-kernel-modules.service; enabled; vendor preset: enabled)
     Active: active (exited) since Thu 2022-03-24 03:57:37 UTC; 56min ago
   Main PID: 2612 (code=exited, status=0/SUCCESS)

Mar 24 03:57:36 localhost systemd[1]: Starting Link additional kernel modules...
Mar 24 03:57:36 localhost driverdog[2612]: 03:57:36 [INFO] Linked object 'nvidia-modeset.o'
Mar 24 03:57:36 localhost driverdog[2612]: 03:57:36 [INFO] Stripped object 'nvidia-modeset.o'
Mar 24 03:57:37 localhost driverdog[2612]: 03:57:37 [INFO] Linked object 'nvidia.o'
Mar 24 03:57:37 localhost driverdog[2612]: 03:57:37 [INFO] Stripped object 'nvidia.o'
Mar 24 03:57:37 localhost driverdog[2612]: 03:57:37 [INFO] Linked nvidia-modeset.ko
Mar 24 03:57:37 localhost driverdog[2612]: 03:57:37 [INFO] Linked nvidia-uvm.ko
Mar 24 03:57:37 localhost driverdog[2612]: 03:57:37 [INFO] Linked nvidia.ko
Mar 24 03:57:37 localhost systemd[1]: Finished Link additional kernel modules.

bash-5.0# lsmod | grep nvidia
nvidia_modeset       1159168  0
nvidia_uvm           1138688  0
nvidia              34799616  2 nvidia_uvm,nvidia_modeset
drm                   606208  1 nvidia
i2c_core               98304  2 nvidia,drm

bash-5.0# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "9672f002",
    "pretty_name": "Bottlerocket OS 1.6.2 (aws-k8s-1.21-nvidia)",
    "variant_id": "aws-k8s-1.21-nvidia",
    "version_id": "1.6.2"
  }
}

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@webern webern left a comment

Nice fix. Is there a check that could be added that would have caught the issue at build time instead of runtime?

@arnaldo2792

> Nice fix. Is there a check that could be added that would have caught the issue at build time instead of runtime?

I don't think so 😞. Nothing was wrong in the configuration file itself; what we were doing simply no longer complied with how tmpfilesd behaves.

@@ -84,7 +84,8 @@ install -d %{buildroot}%{_cross_unitdir}
 install -d %{buildroot}%{_cross_factorydir}%{_cross_sysconfdir}/{drivers,ld.so.conf.d}

 KERNEL_VERSION=$(cat %{kernel_sources}/include/config/kernel.release)
-sed -e "s|__KERNEL_VERSION__|${KERNEL_VERSION}|" %{S:200} > nvidia.conf
+sed -e "s|__KERNEL_VERSION__|${KERNEL_VERSION}|" %{S:200} | sed -e "s|__PREFIX__|%{_cross_prefix}|" \

nit: you should be able to combine both edits in one command:

Suggested change
-sed -e "s|__KERNEL_VERSION__|${KERNEL_VERSION}|" %{S:200} | sed -e "s|__PREFIX__|%{_cross_prefix}|" \
+sed \
+-e "s|__KERNEL_VERSION__|${KERNEL_VERSION}|" %{S:200} \
+-e "s|__PREFIX__|%{_cross_prefix}|" \
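
The combined form can be tried outside the spec file with a sketch like the one below. Everything concrete in it is an assumption for illustration: the kernel version, the template line (which is not the real contents of the source file), and the `CROSS_PREFIX` value, which is guessed from the unit path in the `systemctl status` output. The shell variables stand in for the RPM macros `%{S:200}` and `%{_cross_prefix}`. The input file is placed after all `-e` expressions, which is the conventional argument order for `sed`.

```shell
# Hypothetical stand-ins for values the spec file gets from RPM macros.
KERNEL_VERSION="5.10.102"   # assumed value, for illustration only
CROSS_PREFIX="/x86_64-bottlerocket-linux-gnu/sys-root/usr"   # assumed

# Illustrative template line; NOT the real contents of the source file.
template="$(mktemp)"
printf 'd __PREFIX__/lib/modules/__KERNEL_VERSION__/kernel 0755 root root -\n' > "$template"

# One sed invocation applies both substitutions in order.
rendered="$(sed \
  -e "s|__KERNEL_VERSION__|${KERNEL_VERSION}|" \
  -e "s|__PREFIX__|${CROSS_PREFIX}|" \
  "$template")"
printf '%s\n' "$rendered"

# Clean up the temporary template.
rm -f "$template"
```

Using `|` as the delimiter keeps the slashes in the prefix from needing escapes.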

Now the directory to store the NVIDIA kernel module is created using the
full path with `PREFIX`, instead of the symlinked directory `/lib/modules`

Signed-off-by: Arnaldo Garcia Rincon <agarrcia@amazon.com>
@arnaldo2792

(Fixed nit in spec file)

@arnaldo2792 arnaldo2792 merged commit c86796c into bottlerocket-os:develop Mar 24, 2022
@arnaldo2792 arnaldo2792 deleted the fix-kmod-5.10-nvidia branch March 31, 2022 20:55

5 participants