Add security guidance for NVIDIA GPU limits for unprivileged containers #4205

Merged
60 changes: 46 additions & 14 deletions SECURITY_GUIDANCE.md
@@ -6,20 +6,21 @@ Bottlerocket adheres to the [Shared Responsibility Model](https://aws.amazon.com

We provide these recommendations, along with [details](#details) and [examples](#examples), to help you create a configuration that meets your security and compliance requirements.

| Recommendation | Priority |
| :-------------------------------------------------------------------------------------------------- | :-------- |
| [Enable automatic updates](#enable-automatic-updates) | Critical |
| [Avoid containers with elevated privileges](#avoid-containers-with-elevated-privileges) | Critical |
| [Restrict access to the host API socket](#restrict-access-to-the-host-api-socket) | Critical |
| [Restrict access to the container runtime socket](#restrict-access-to-the-container-runtime-socket) | Critical |
| [Design for host replacement](#design-for-host-replacement) | Important |
| [Enable kernel lockdown](#enable-kernel-lockdown) | Important |
| [Limit use of host containers](#limit-use-of-host-containers) | Important |
| [Limit use of privileged SELinux labels](#limit-use-of-privileged-selinux-labels) | Important |
| [Limit access to system mounts](#limit-access-to-system-mounts) | Important |
| [Limit access to host namespaces](#limit-access-to-host-namespaces) | Important |
| [Limit access to block devices](#limit-access-to-block-devices) | Important |
| [Enforce requested NVIDIA GPU limits for unprivileged containers](#enforce-requested-nvidia-gpu-limits-for-unprivileged-containers) | Important |
| [Do not run containers as UID 0](#do-not-run-containers-as-uid-0) | Moderate |

## Details

@@ -228,6 +229,37 @@ This could compromise the integrity of the host.

We recommend limiting access to block devices.

### Enforce requested NVIDIA GPU limits for unprivileged containers

When a container that has requested NVIDIA GPUs is launched, the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html), the host software responsible for adding the devices to the container, uses one of these sources to determine which devices to add:

* The `NVIDIA_VISIBLE_DEVICES` environment variable
* Mounts configured by the [NVIDIA Kubernetes Device Plugin](https://github.com/NVIDIA/k8s-device-plugin)

If `NVIDIA_VISIBLE_DEVICES="all"` is set in a container’s environment, the container can gain access to all NVIDIA GPUs on the system, regardless of the NVIDIA GPU limits requested through Kubernetes directives.
Because most popular container base images set this value, respecting it by default would grant unprivileged containers access to all NVIDIA GPUs, ignoring the requested limits.

To prevent this, Bottlerocket configures the host software so that `NVIDIA_VISIBLE_DEVICES="all"` is only respected for privileged containers.

If you need to grant unprivileged containers access to all NVIDIA GPUs using this environment variable, bypassing any requested GPU limits, you can apply these settings:


```toml
[settings.kubelet-device-plugin]
# Configures NVIDIA_VISIBLE_DEVICES with the list of devices
device-list-strategy = "envvar"

[settings.nvidia-container-runtime]
# Allows reading the devices from NVIDIA_VISIBLE_DEVICES
visible-devices-as-volume-mounts = false

# Lets unprivileged containers access all GPUs
# when they set NVIDIA_VISIBLE_DEVICES=all
visible-devices-envvar-when-unprivileged = true
```

We recommend leaving these settings at the default values, which will enforce the requested NVIDIA GPU limits for unprivileged containers.
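
For comparison, a configuration that enforces the requested limits might look like the sketch below. The two boolean values are simply the inverse of the override settings shown earlier in this section. The `volume-mounts` value for `device-list-strategy` is an assumption, taken from the strategies the NVIDIA Kubernetes Device Plugin supports, so confirm the actual defaults against the Bottlerocket settings reference for your release.

```toml
# Sketch of the recommended values; these are assumptions,
# not copied from a specific Bottlerocket release.
[settings.kubelet-device-plugin]
# Assumed default: pass the allocated devices as volume mounts
device-list-strategy = "volume-mounts"

[settings.nvidia-container-runtime]
# Read the allocated devices from the volume mounts
visible-devices-as-volume-mounts = true

# Ignore NVIDIA_VISIBLE_DEVICES for unprivileged containers
visible-devices-envvar-when-unprivileged = false
```

Because these are expected to be the defaults, you normally would not need to set them explicitly; the block is only useful for reverting an earlier override.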

### Do not run containers as UID 0

Bottlerocket does not currently support user namespaces.