diff --git a/container-toolkit/docker-specialized.md b/container-toolkit/docker-specialized.md
index fab8d6674..f76f18cb0 100644
--- a/container-toolkit/docker-specialized.md
+++ b/container-toolkit/docker-specialized.md
@@ -175,7 +175,7 @@ For example, specify the `compute` and `utility` capabilities, allowing usage of
 >
 > ```console
 > $ docker run --rm --gpus 'all,"capabilities=compute,utility"' \
->     nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
+>     nvidia/cuda:12.5.0-base-ubuntu22.04 nvidia-smi
 > ```
 
 ### Constraints
diff --git a/container-toolkit/supported-platforms.md b/container-toolkit/supported-platforms.md
index ef04001db..68a88a29f 100644
--- a/container-toolkit/supported-platforms.md
+++ b/container-toolkit/supported-platforms.md
@@ -18,7 +18,6 @@ Recent NVIDIA Container Toolkit releases are tested and expected to work on thes
 | RHEL 8.x        | X | X | X |
 | RHEL 9.x        | X | X | X |
 | RHEL 10.x       | X | X | X |
-| Ubuntu 20.04    | X | X | X |
 | Ubuntu 22.04    | X | X | X |
 | Ubuntu 24.04    | X |   | X |
 | Rocky Linux 9.7 | X | X | X |
diff --git a/gpu-operator/amazon-eks.rst b/gpu-operator/amazon-eks.rst
index 4eb7b7606..6ad0a3bec 100644
--- a/gpu-operator/amazon-eks.rst
+++ b/gpu-operator/amazon-eks.rst
@@ -214,5 +214,5 @@ Related Information
 * If you have an existing Amazon EKS cluster, you can refer to
   `Launching self-managed Amazon Linux nodes `_ in the Amazon EKS documentation
   to add a self-managed node group to your cluster.
-  However, all nodes in the cluster must run Ubuntu 20.04 or 22.04.
+  However, all nodes in the cluster must run a `supported operating system `_.
   This documentation includes steps for using the AWS Management Console.
\ No newline at end of file
diff --git a/gpu-operator/getting-started.rst b/gpu-operator/getting-started.rst
index 7b418c169..8cf8776f1 100644
--- a/gpu-operator/getting-started.rst
+++ b/gpu-operator/getting-started.rst
@@ -473,7 +473,7 @@ In this scenario, the NVIDIA Container Toolkit is already installed on the worke
 Running a Custom Driver Image
 =============================
 
-If you want to use custom driver container images, such as version 465.27, then
+If you want to use custom driver container images, such as version 580.126.20, then
 you can build a custom driver container image. Follow these steps:
 
 - Rebuild the driver container by specifying the ``$DRIVER_VERSION`` argument when building the Docker image. For
@@ -483,8 +483,8 @@ you can build a custom driver container image. Follow these steps:
   .. code-block:: console
 
      $ docker build --pull -t \
-         --build-arg DRIVER_VERSION=455.28 \
-         nvidia/driver:455.28-ubuntu20.04 \
+         --build-arg DRIVER_VERSION=580.126.20 \
+         nvidia/driver:580.126.20-ubuntu22.04 \
          --file Dockerfile .
 
   Ensure that the driver container is tagged as shown in the example by using the ``driver:-`` schema.
@@ -498,7 +498,7 @@ you can build a custom driver container image. Follow these steps:
        nvidia/gpu-operator \
        --version=${version} \
        --set driver.repository=docker.io/nvidia \
-       --set driver.version="465.27"
+       --set driver.version="580.126.20"
 
 These instructions are provided for reference and evaluation purposes.
 Not using the standard releases of the GPU Operator from NVIDIA would mean limited
@@ -647,7 +647,7 @@ In the first example, let's run a simple CUDA sample, which adds two vectors tog
       restartPolicy: OnFailure
       containers:
       - name: cuda-vectoradd
-        image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
+        image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
         resources:
           limits:
             nvidia.com/gpu: 1
diff --git a/gpu-operator/gpu-driver-configuration.rst b/gpu-operator/gpu-driver-configuration.rst
index aa5d84077..ffe423389 100644
--- a/gpu-operator/gpu-driver-configuration.rst
+++ b/gpu-operator/gpu-driver-configuration.rst
@@ -71,8 +71,8 @@ Driver Daemon Sets
 The NVIDIA GPU Operator starts a driver daemon set for each NVIDIA driver custom resource and each operating system version.
 For example, if your cluster has one NVIDIA driver custom resource that specifies a 580 branch GPU driver and some
-worker nodes run Ubuntu 20.04 and other worker nodes run Ubuntu 22.04, the Operator starts two driver daemon sets.
-One daemon set configures the GPU driver on the Ubuntu 20.04 nodes and the other configures the driver on the Ubuntu 22.04 nodes.
+worker nodes run Ubuntu 22.04 and other worker nodes run Ubuntu 24.04, the Operator starts two driver daemon sets.
+One daemon set configures the GPU driver on the Ubuntu 22.04 nodes and the other configures the driver on the Ubuntu 24.04 nodes.
 All the nodes run the same 580 branch GPU driver.
 
 .. image:: graphics/nvd-basics.svg
@@ -445,7 +445,7 @@ When you update the custom resource, the Operator performs a rolling update of t
    .. code-block:: output
 
       NAME                                             READY   STATUS        RESTARTS   AGE
-      nvidia-gpu-driver-ubuntu20.04-788484b9bb-6zhd9   1/1     Running       0          5m1s
+      nvidia-gpu-driver-ubuntu24.04-788484b9bb-6zhd9   1/1     Running       0          5m1s
       nvidia-gpu-driver-ubuntu22.04-8896c4bf7-7s68q    1/1     Terminating   0          37m
       nvidia-gpu-driver-ubuntu22.04-8896c4bf7-jm74l    1/1     Running       0          37m
@@ -515,7 +515,7 @@ If the driver daemon sets and pods are not running as you expect, perform the fo
    .. code-block:: output
 
       NAME                                       DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
-      nvidia-gpu-driver-ubuntu20.04-788484b9bb   1         1         1       1            1           driver.config=silver,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=20.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true   10m
+      nvidia-gpu-driver-ubuntu24.04-788484b9bb   1         1         1       1            1           driver.config=silver,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=24.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true   10m
       nvidia-gpu-driver-ubuntu22.04-8896c4bf7    2         2         2       2            2           driver.config=gold,feature.node.kubernetes.io/system-os_release.ID=ubuntu,feature.node.kubernetes.io/system-os_release.VERSION_ID=22.04,nvidia.com/gpu.deploy.driver=true,nvidia.com/gpu.present=true   10m
 
 #. View the logs from the GPU Operator pod:
diff --git a/gpu-operator/install-gpu-operator-air-gapped.rst b/gpu-operator/install-gpu-operator-air-gapped.rst
index c4a1b23c9..5714da920 100644
--- a/gpu-operator/install-gpu-operator-air-gapped.rst
+++ b/gpu-operator/install-gpu-operator-air-gapped.rst
@@ -139,11 +139,11 @@ There is one caveat with regards to the driver image. The version field must be
   image: driver
   version: "${recommended}"
 
-To pull the driver image for Ubuntu 20.04:
+To pull the driver image for Ubuntu 22.04:
 
 .. code-block:: console
 
-   $ docker pull nvcr.io/nvidia/driver:${recommended}-ubuntu20.04
+   $ docker pull nvcr.io/nvidia/driver:${recommended}-ubuntu22.04
 
 To push the images to the local registry, simply tag the pulled images by prefixing the image with the image registry information.
@@ -152,14 +152,14 @@ Using the above examples, this will result in:
 .. code-block:: console
 
    $ docker tag nvcr.io/nvidia/gpu-operator:${version} //gpu-operator:${version}
-   $ docker tag nvcr.io/nvidia/driver:${recommended}-ubuntu20.04 //driver:${recommended}-ubuntu20.04
+   $ docker tag nvcr.io/nvidia/driver:${recommended}-ubuntu22.04 //driver:${recommended}-ubuntu22.04
 
 Finally, push the images to the local registry:
 
 .. code-block:: console
 
    $ docker push //gpu-operator:${version}
-   $ docker push //driver:${recommended}-ubuntu20.04
+   $ docker push //driver:${recommended}-ubuntu22.04
 
 Update ``values.yaml`` with local registry information in the repository field.
@@ -301,15 +301,15 @@ An example of repo list is shown below for Ubuntu 22.04 (access to local package
    deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu jammy-updates main universe
    deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu jammy-security main universe
 
-An example of repo list is shown below for Ubuntu 20.04 (access to local package repository via HTTP):
+An example of repo list is shown below for Ubuntu 24.04 (access to local package repository via HTTP):
 
 ``custom-repo.list``:
 
 .. code-block::
 
-   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu focal main universe
-   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu focal-updates main universe
-   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu focal-security main universe
+   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu noble main universe
+   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu noble-updates main universe
+   deb [arch=amd64] http:///ubuntu/mirror/archive.ubuntu.com/ubuntu noble-security main universe
 
 An example of repo list is shown below for CentOS 8 (access to local package repository via HTTP):
diff --git a/gpu-operator/manifests/input/time-slicing-verification.yaml b/gpu-operator/manifests/input/time-slicing-verification.yaml
index 1f3d726f6..3f36f3451 100644
--- a/gpu-operator/manifests/input/time-slicing-verification.yaml
+++ b/gpu-operator/manifests/input/time-slicing-verification.yaml
@@ -21,7 +21,7 @@ spec:
       hostPID: true
       containers:
         - name: cuda-sample-vector-add
-          image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04"
+          image: "nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubuntu22.04"
           command: ["/bin/bash", "-c", "--"]
           args:
             - while true; do /cuda-samples/vectorAdd; done
diff --git a/gpu-operator/platform-support.rst b/gpu-operator/platform-support.rst
index 93edc6fe1..1ebd76788 100644
--- a/gpu-operator/platform-support.rst
+++ b/gpu-operator/platform-support.rst
@@ -319,16 +319,6 @@ The GPU Operator has been validated in the following scenarios:
      - | Nutanix
        | NKP
-   * - Ubuntu 20.04 LTS |fn2|_
-     - 1.32---1.35
-     -
-     - 1.32---1.35
-     - 1.32---1.35
-     -
-     -
-     -
-     -
-
    * - Ubuntu 22.04 LTS |fn2|_
      - 1.32---1.35
      -
@@ -416,7 +406,6 @@ The GPU Operator has been validated in the following scenarios:
 
 :sup:`2` For Ubuntu 22.04 LTS, kernel versions 6.8 (non-precompiled driver containers only) 6.5 and 5.15 are LTS ESM kernels.
-For Ubuntu 20.04 LTS, kernel versions 5.4 and 5.15 are LTS ESM kernels.
 The GPU Driver containers support these Linux kernels.
 Refer to the Kernel release schedule on Canonical's `Ubuntu kernel lifecycle and enablement stack `_ page for more information.
@@ -447,10 +436,6 @@ The GPU Operator has been validated in the following scenarios:
      - | Google GKE
        | Kubernetes
-   * - Ubuntu 20.04 LTS
-     - 1.32---1.35
-     - 1.32---1.35
-
    * - Ubuntu 22.04 LTS
      - 1.32---1.35
      - 1.32---1.35
@@ -491,8 +476,6 @@ The GPU Operator has been validated for the following container runtimes:
 
 +----------------------------+------------------------+----------------+
 | Operating System           | Containerd 1.7 - 2.2   | CRI-O          |
 +============================+========================+================+
-| Ubuntu 20.04 LTS           | Yes                    | Yes            |
-+----------------------------+------------------------+----------------+
 | Ubuntu 22.04 LTS           | Yes                    | Yes            |
 +----------------------------+------------------------+----------------+
 | Ubuntu 24.04 LTS           | Yes                    | Yes            |
@@ -524,7 +507,6 @@ Operating System Kubernetes KubeVirt OpenShift Virtual
 ================ =========== ============= ========= ============= ===========
 Ubuntu 24.04 LTS 1.32---1.35 0.36+
 Ubuntu 22.04 LTS 1.32---1.35 0.36+         0.59.1+
-Ubuntu 20.04 LTS 1.32---1.35 0.36+         0.59.1+
 Red Hat Core OS              4.17---4.21             4.17---4.21
 ================ =========== ============= ========= ============= ===========
@@ -571,7 +553,7 @@ Supported operating systems and NVIDIA GPU Drivers with GPUDirect RDMA.
 
 - Ubuntu 22.04 LTS with Network Operator 25.10.0
 - RHEL 8 with Network Operator 25.7.0.
 - Ubuntu 24.04 LTS with Network Operator 25.7.0.
-- Ubuntu 20.04 and 22.04 LTS with Network Operator 25.7.0.
+- Ubuntu 22.04 LTS with Network Operator 25.7.0.
 - Red Hat Enterprise Linux 9.2, 9.4, and 9.6 with Network Operator 25.7.0.
 - Red Hat OpenShift 4.17 and higher with Network Operator 25.7.0.
 - Ubuntu 24.04 LTS with Network Operator 25.10.0
@@ -585,7 +567,7 @@ Support for GPUDirect Storage
 Supported operating systems and NVIDIA GPU Drivers with GPUDirect Storage.
 
 - Ubuntu 24.04 LTS Network Operator 25.7.0.
-- Ubuntu 20.04 and 22.04 LTS with Network Operator 25.7.0.
+- Ubuntu 22.04 LTS with Network Operator 25.7.0.
 - Red Hat OpenShift Container Platform 4.17 and higher.
 
 .. note::