2 changes: 1 addition & 1 deletion container-toolkit/docker-specialized.md
Original file line number Diff line number Diff line change
@@ -175,7 +175,7 @@ For example, specify the `compute` and `utility` capabilities, allowing usage of
>
> ```console
> $ docker run --rm --gpus 'all,"capabilities=compute,utility"' \
> nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
> nvidia/cuda:12.5.0-base-ubuntu22.04 nvidia-smi
> ```

### Constraints
1 change: 0 additions & 1 deletion container-toolkit/supported-platforms.md
@@ -18,7 +18,6 @@ Recent NVIDIA Container Toolkit releases are tested and expected to work on thes
| RHEL 8.x | X | X | X |
| RHEL 9.x | X | X | X |
| RHEL 10.x | X | X | X |
| Ubuntu 20.04 | X | X | X |
| Ubuntu 22.04 | X | X | X |
| Ubuntu 24.04 | X | | X |
| Rocky Linux 9.7 | X | X | X |
2 changes: 1 addition & 1 deletion gpu-operator/amazon-eks.rst
@@ -214,5 +214,5 @@ Related Information
* If you have an existing Amazon EKS cluster, you can refer to
`Launching self-managed Amazon Linux nodes <https://docs.aws.amazon.com/eks/latest/userguide/launch-workers.html>`_
in the Amazon EKS documentation to add a self-managed node group to your cluster.
However, all nodes in the cluster must run Ubuntu 20.04 or 22.04.
However, all nodes in the cluster must run a `supported operating system <platform-support.html?category=cloud-service-providers#container-platforms>`_.
This documentation includes steps for using the AWS Management Console.
10 changes: 5 additions & 5 deletions gpu-operator/getting-started.rst
@@ -473,7 +473,7 @@ In this scenario, the NVIDIA Container Toolkit is already installed on the worke
Running a Custom Driver Image
=============================

If you want to use custom driver container images, such as version 465.27, then
If you want to use custom driver container images, such as version 580.126.20, then
you can build a custom driver container image. Follow these steps:

- Rebuild the driver container by specifying the ``$DRIVER_VERSION`` argument when building the Docker image. For
@@ -483,8 +483,8 @@ you can build a custom driver container image. Follow these steps:
.. code-block:: console

$ docker build --pull -t \
--build-arg DRIVER_VERSION=455.28 \
nvidia/driver:455.28-ubuntu20.04 \
--build-arg DRIVER_VERSION=580.126.20 \
nvidia/driver:580.126.20-ubuntu22.04 \
--file Dockerfile .

Ensure that the driver container is tagged as shown in the example by using the ``driver:<version>-<os>`` schema.
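The ``driver:<version>-<os>`` schema mentioned in this hunk can be checked before pushing an image. The following is a minimal sketch, not part of the GPU Operator tooling: the function name ``parse_driver_tag`` and the regular expression are assumptions made for illustration, and the pattern only covers dotted numeric versions followed by an OS label such as ``ubuntu22.04``.

```python
import re

# Assumed pattern for "<repo>/driver:<version>-<os>"; adjust it if your
# version strings or OS labels differ from the examples in this diff.
TAG_PATTERN = re.compile(
    r"^(?:.+/)?driver:(?P<version>[0-9]+(?:\.[0-9]+)+)-(?P<os>[a-z]+[0-9.]+)$"
)


def parse_driver_tag(image):
    """Split an image reference into (driver version, OS label).

    Raises ValueError when the reference does not follow the schema.
    """
    match = TAG_PATTERN.match(image)
    if match is None:
        raise ValueError(f"{image!r} does not follow driver:<version>-<os>")
    return match.group("version"), match.group("os")


if __name__ == "__main__":
    print(parse_driver_tag("nvidia/driver:580.126.20-ubuntu22.04"))
```

A reference such as ``nvidia/driver:latest`` is rejected by this check because it has no version and OS suffix.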
@@ -498,7 +498,7 @@ you can build a custom driver container image. Follow these steps:
nvidia/gpu-operator \
--version=${version} \
--set driver.repository=docker.io/nvidia \
--set driver.version="465.27"
--set driver.version="580.126.20"

These instructions are provided for reference and evaluation purposes.
Not using the standard releases of the GPU Operator from NVIDIA would mean limited
@@ -647,7 +647,7 @@ In the first example, let's run a simple CUDA sample, which adds two vectors tog
restartPolicy: OnFailure
containers:
- name: cuda-vectoradd
image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
resources:
limits:
nvidia.com/gpu: 1
8 changes: 4 additions & 4 deletions gpu-operator/gpu-driver-configuration.rst
@@ -71,8 +71,8 @@ Driver Daemon Sets
The NVIDIA GPU Operator starts a driver daemon set for each NVIDIA driver custom resource and each operating system version.

For example, if your cluster has one NVIDIA driver custom resource that specifies a 580 branch GPU driver and some
worker nodes run Ubuntu 20.04 and other worker nodes run Ubuntu 22.04, the Operator starts two driver daemon sets.
One daemon set configures the GPU driver on the Ubuntu 20.04 nodes and the other configures the driver on the Ubuntu 22.04 nodes.
worker nodes run Ubuntu 22.04 and other worker nodes run Ubuntu 24.04, the Operator starts two driver daemon sets.
One daemon set configures the GPU driver on the Ubuntu 22.04 nodes and the other configures the driver on the Ubuntu 24.04 nodes.
All the nodes run the same 580 branch GPU driver.
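
The one-daemon-set-per-custom-resource-and-OS rule can be pictured as a simple grouping. The sketch below is only an illustration of that rule and is not how the Operator is implemented: the function ``plan_driver_daemon_sets``, the custom resource name ``demo-580``, and the node names are all hypothetical.

```python
from collections import defaultdict


def plan_driver_daemon_sets(nodes):
    """Group node names by (driver custom resource, OS version).

    Each distinct key stands for one driver daemon set in this model.
    """
    groups = defaultdict(list)
    for node in nodes:
        key = (node["driver_cr"], node["os"])
        groups[key].append(node["name"])
    return dict(groups)


# Hypothetical cluster: one driver custom resource, two Ubuntu releases.
nodes = [
    {"name": "worker-0", "driver_cr": "demo-580", "os": "ubuntu22.04"},
    {"name": "worker-1", "driver_cr": "demo-580", "os": "ubuntu22.04"},
    {"name": "worker-2", "driver_cr": "demo-580", "os": "ubuntu24.04"},
]

plan = plan_driver_daemon_sets(nodes)
for (driver_cr, os_version), members in sorted(plan.items()):
    print(f"{driver_cr} / {os_version}: {members}")
```

With the sample data, the grouping yields two keys, which matches the two daemon sets described for a mixed Ubuntu 22.04 and Ubuntu 24.04 cluster.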

.. image:: graphics/nvd-basics.svg
@@ -445,7 +445,7 @@ When you update the custom resource, the Operator performs a rolling update of t
.. code-block:: output

NAME READY STATUS RESTARTS AGE
nvidia-gpu-driver-ubuntu20.04-788484b9bb-6zhd9 1/1 Running 0 5m1s
nvidia-gpu-driver-ubuntu24.04-788484b9bb-6zhd9 1/1 Running 0 5m1s
nvidia-gpu-driver-ubuntu22.04-8896c4bf7-7s68q 1/1 Terminating 0 37m
nvidia-gpu-driver-ubuntu22.04-8896c4bf7-jm74l 1/1 Running 0 37m

@@ -515,7 +515,7 @@ If the driver daemon sets and pods are not running as you expect, perform the fo
.. code-block:: output

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
nvidia-gpu-driver-ubuntu20.04-788484b9bb 1 1 1 1 1 driver.config=silver,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=20.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true 10m
nvidia-gpu-driver-ubuntu24.04-788484b9bb 1 1 1 1 1 driver.config=silver,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=24.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true 10m
nvidia-gpu-driver-ubuntu22.04-8896c4bf7 2 2 2 2 2 driver.config=gold,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true 10m

#. View the logs from the GPU Operator pod:
16 changes: 8 additions & 8 deletions gpu-operator/install-gpu-operator-air-gapped.rst
@@ -139,11 +139,11 @@ There is one caveat with regards to the driver image. The version field must be
image: driver
version: "${recommended}"

To pull the driver image for Ubuntu 20.04:
To pull the driver image for Ubuntu 22.04:

.. code-block:: console

$ docker pull nvcr.io/nvidia/driver:${recommended}-ubuntu20.04
$ docker pull nvcr.io/nvidia/driver:${recommended}-ubuntu22.04

To push the images to the local registry, simply tag the pulled images by prefixing the image with the image registry information.

@@ -152,14 +152,14 @@ Using the above examples, this will result in:
.. code-block:: console

$ docker tag nvcr.io/nvidia/gpu-operator:${version} <local-registry>/<local-path>/gpu-operator:${version}
$ docker tag nvcr.io/nvidia/driver:${recommended}-ubuntu20.04 <local-registry>/<local-path>/driver:${recommended}-ubuntu20.04
$ docker tag nvcr.io/nvidia/driver:${recommended}-ubuntu22.04 <local-registry>/<local-path>/driver:${recommended}-ubuntu22.04

Finally, push the images to the local registry:

.. code-block:: console

$ docker push <local-registry>/<local-path>/gpu-operator:${version}
$ docker push <local-registry>/<local-path>/driver:${recommended}-ubuntu20.04
$ docker push <local-registry>/<local-path>/driver:${recommended}-ubuntu22.04
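
The tag and push commands above keep the image name and tag and replace only the registry prefix. A minimal sketch of that mapping follows, assuming a hypothetical helper ``to_local_ref`` and made-up registry values; it is not part of the GPU Operator or Docker tooling.

```python
def to_local_ref(source_ref, local_registry, local_path):
    """Return the mirrored reference for a source image reference.

    Only the last path component ("<image>:<tag>") is kept, which mirrors
    the docker tag examples in this hunk.
    """
    name_and_tag = source_ref.rsplit("/", 1)[-1]
    return f"{local_registry}/{local_path}/{name_and_tag}"


if __name__ == "__main__":
    # Illustrative values only; substitute your own registry and path.
    source = "nvcr.io/nvidia/driver:580.126.20-ubuntu22.04"
    print(to_local_ref(source, "registry.example.com:5000", "mirror"))
```

The returned string can be passed to ``docker tag`` and ``docker push`` in place of the ``<local-registry>/<local-path>/...`` placeholders.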

Update ``values.yaml`` with local registry information in the repository field.

@@ -301,15 +301,15 @@ An example of repo list is shown below for Ubuntu 22.04 (access to local package
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu jammy-updates main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu jammy-security main universe

An example of repo list is shown below for Ubuntu 20.04 (access to local package repository via HTTP):
An example of repo list is shown below for Ubuntu 24.04 (access to local package repository via HTTP):

``custom-repo.list``:

.. code-block::

deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu focal main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu focal-updates main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu focal-security main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu noble main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu noble-updates main universe
deb [arch=amd64] http://<local pkg repository>/ubuntu/mirror/archive.ubuntu.com/ubuntu noble-security main universe

An example of repo list is shown below for CentOS 8 (access to local package repository via HTTP):

Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ spec:
hostPID: true
containers:
- name: cuda-sample-vector-add
image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
command: ["/bin/bash", "-c", "--"]
args:
- while true; do /cuda-samples/vectorAdd; done
22 changes: 2 additions & 20 deletions gpu-operator/platform-support.rst
@@ -319,16 +319,6 @@ The GPU Operator has been validated in the following scenarios:
- | Nutanix
| NKP

* - Ubuntu 20.04 LTS |fn2|_
- 1.32---1.35
-
- 1.32---1.35
- 1.32---1.35
-
-
-
-

* - Ubuntu 22.04 LTS |fn2|_
- 1.32---1.35
-
@@ -416,7 +406,6 @@ The GPU Operator has been validated in the following scenarios:

:sup:`2`
For Ubuntu 22.04 LTS, kernel versions 6.8 (non-precompiled driver containers only) 6.5 and 5.15 are LTS ESM kernels.
For Ubuntu 20.04 LTS, kernel versions 5.4 and 5.15 are LTS ESM kernels.
The GPU Driver containers support these Linux kernels.
Refer to the Kernel release schedule on Canonical's
`Ubuntu kernel lifecycle and enablement stack <https://ubuntu.com/kernel/lifecycle>`_ page for more information.
@@ -447,10 +436,6 @@ The GPU Operator has been validated in the following scenarios:
- | Google GKE
| Kubernetes

* - Ubuntu 20.04 LTS
- 1.32---1.35
- 1.32---1.35

* - Ubuntu 22.04 LTS
- 1.32---1.35
- 1.32---1.35
@@ -491,8 +476,6 @@ The GPU Operator has been validated for the following container runtimes:
+----------------------------+------------------------+----------------+
| Operating System | Containerd 1.7 - 2.2 | CRI-O |
+============================+========================+================+
| Ubuntu 20.04 LTS | Yes | Yes |
+----------------------------+------------------------+----------------+
| Ubuntu 22.04 LTS | Yes | Yes |
+----------------------------+------------------------+----------------+
| Ubuntu 24.04 LTS | Yes | Yes |
@@ -524,7 +507,6 @@ Operating System Kubernetes KubeVirt OpenShift Virtual
================ =========== ============= ========= ============= ===========
Ubuntu 24.04 LTS 1.32---1.35 0.36+
Ubuntu 22.04 LTS 1.32---1.35 0.36+ 0.59.1+
Ubuntu 20.04 LTS 1.32---1.35 0.36+ 0.59.1+
Red Hat Core OS 4.17---4.21 4.17---4.21
================ =========== ============= ========= ============= ===========

@@ -571,7 +553,7 @@ Supported operating systems and NVIDIA GPU Drivers with GPUDirect RDMA.
- Ubuntu 22.04 LTS with Network Operator 25.10.0
- RHEL 8 with Network Operator 25.7.0.
- Ubuntu 24.04 LTS with Network Operator 25.7.0.
- Ubuntu 20.04 and 22.04 LTS with Network Operator 25.7.0.
- Ubuntu 22.04 LTS with Network Operator 25.7.0.
- Red Hat Enterprise Linux 9.2, 9.4, and 9.6 with Network Operator 25.7.0.
- Red Hat OpenShift 4.17 and higher with Network Operator 25.7.0.
- Ubuntu 24.04 LTS with Network Operator 25.10.0
@@ -585,7 +567,7 @@ Support for GPUDirect Storage
Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.

- Ubuntu 24.04 LTS Network Operator 25.7.0.
- Ubuntu 20.04 and 22.04 LTS with Network Operator 25.7.0.
- Ubuntu 22.04 LTS with Network Operator 25.7.0.
- Red Hat OpenShift Container Platform 4.17 and higher.

.. note::