
“Insufficient compute resources: vGPU resource is not available”

 

The error "Insufficient compute resources: vGPU resource is not available" appears when you launch an OpenStack instance with an NVIDIA GRID virtual GPU (vGPU) attached. The scheduler gives up after three attempts and logs the following entry in nova-conductor.log:

nova-conductor.log:2022-09-07 20:59:05.993 21 WARNING nova.scheduler.utils [req-a1538494-34e9-4e26-b004-834f5ce49542 98cb30a967c144feba7d98c99fa637db 596a4c7b716046cb8fe6dd0b6b9a9ebc - default default] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance f4099be3-698f-4c50-bcb1-8717f1aa1b7b. Last exception: Insufficient compute resources: vGPU resource is not available.: nova.exception.MaxRetriesExceeded: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance f4099be3-698f-4c50-bcb1-8717f1aa1b7b. Last exception: Insufficient compute resources: vGPU resource is not available.


To resolve the issue, verify that nova.conf on the compute node (/var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf) has the correct NVIDIA vGPU device type set in the [devices] section:

[devices]
enabled_vgpu_types=nvidia-224
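
To check the setting without opening the file, a small helper can print the value found in the [devices] section. This is a minimal sketch: get_enabled_vgpu_types is a hypothetical name, not a Nova tool, and it assumes simple "key=value" lines.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nova): print the value of
# enabled_vgpu_types from the [devices] section of a nova.conf file.
get_enabled_vgpu_types() {
    conf_file="$1"
    awk -F= '
        # On every section header, remember whether we are inside [devices].
        /^[ \t]*\[/ { in_devices = ($0 ~ /^[ \t]*\[devices\][ \t]*$/); next }
        # Inside [devices], print the trimmed value of enabled_vgpu_types.
        in_devices && $1 ~ /^[ \t]*enabled_vgpu_types[ \t]*$/ {
            value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            print value
        }
    ' "$conf_file"
}

# Example call on a compute node (path taken from the article):
# get_enabled_vgpu_types /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
```

Commented-out lines and settings with the same name in other sections are skipped, so empty output means the option is not set in [devices].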

Then restart the nova_libvirt and nova_compute containers on the compute node:

systemctl restart tripleo_nova_libvirt.service
systemctl restart tripleo_nova_compute.service

In OpenStack Platform 16, "enabled_vgpu_types" supports only one value. If several values are listed, only the first is used and the rest are ignored.
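
The rule that only the first value counts can be illustrated with a short shell function that prints the first entry of a comma-separated list. This is only a sketch of the behaviour described here, not Nova's own code. first_vgpu_type is a hypothetical name, and nvidia-35 is used purely as an example of a second value.

```shell
#!/bin/sh
# Hypothetical helper: print the first entry of a comma-separated
# enabled_vgpu_types value, which is the only one that takes effect
# according to the description above.
first_vgpu_type() {
    first="${1%%,*}"    # drop everything from the first comma onwards
    # Trim surrounding whitespace from the remaining entry.
    printf '%s\n' "$first" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Example: only nvidia-224 would be used from this list.
# first_vgpu_type "nvidia-224, nvidia-35"
```

If the printed type is not the one your flavor needs, reorder or shorten the list in nova.conf so that the required type comes first.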

On a compute node, once a vGPU of one type exists, vGPU devices of any other type cannot be created.

Run the following command on a compute node to view the vGPU types that its physical GPUs can provide:

for tmp in $(podman exec nova_libvirt virsh nodedev-list --cap mdev_types); do echo "${tmp}"; podman exec nova_libvirt virsh nodedev-dumpxml "${tmp}"; done

The output lists each physical GPU and the number of available instances per vGPU type.

To view the vGPU devices that have already been created on the compute node, run:

for tmp in $(podman exec nova_libvirt virsh nodedev-list --cap mdev); do echo "${tmp}"; podman exec nova_libvirt virsh nodedev-dumpxml "${tmp}"; done
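
As a cross-check outside libvirt, the kernel's mediated-device sysfs tree reports the same availability figures. The following is a sketch under the assumption that the NVIDIA vGPU host driver is loaded and exposes /sys/class/mdev_bus on the compute host. list_mdev_available is a hypothetical helper name, and its optional argument only exists so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Hypothetical helper: print "<gpu> <vgpu-type> <available_instances>"
# for every mediated device type found under the given sysfs root.
list_mdev_available() {
    root="${1:-/sys/class/mdev_bus}"
    for type_dir in "$root"/*/mdev_supported_types/*; do
        # Skip unmatched globs and unreadable entries.
        [ -r "$type_dir/available_instances" ] || continue
        # The PCI address of the GPU is two directory levels up.
        gpu=$(basename "$(dirname "$(dirname "$type_dir")")")
        printf '%s %s %s\n' "$gpu" "$(basename "$type_dir")" \
            "$(cat "$type_dir/available_instances")"
    done
}

# Example call on the compute host:
# list_mdev_available
```

A count of 0 for the type set in enabled_vgpu_types means no further vGPU of that type can be created on that GPU, which matches the condition that produces the error.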
