A Compute node with four NVIDIA A100 GPUs and one Mellanox InfiniBand adapter is part of an OpenStack Platform deployment.
The Compute node has been configured to provide its GPUs to VMs through PCI passthrough. Scheduling a VM with a passthrough GPU succeeds; however, creating the VM fails on the first attempt.
On the GPU Compute node, we found the following error message in /var/log/containers/nova/nova-compute.log:
2021-07-05 18:32:00.954 7 ERROR nova.compute.manager [instance: 6c1c1fec-8da7-41cc-809f-069fb3dc49ed] 2021-09-08T11:01:26.416056Z qemu-kvm: -device vfio-pci,host=0000:2f:00.0,id=hostdev0,bus=pci.0,addr=0x5: vfio 0000:2f:00.0: group 19 is not viable
2021-09-08 13:02:00.954 7 ERROR nova.compute.manager [instance: 6c1c1fec-8da7-41cc-809f-069fb3dc49ed] Please ensure all devices within the iommu_group are bound to their vfio bus driver.
We discovered that GPU0 (PCI device 0000:2f:00.0) is in the same IOMMU group 19 as the InfiniBand device on the compute host (PCI device 0000:25:00.0).
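If our understanding is correct, all PCI devices in the same IOMMU group must be assigned together: either they all stay with the host, or they are all handed over to a single guest VM. A device in the group that remains bound to a host driver (here, the InfiniBand adapter bound to mlx5_core) makes the group "not viable" for passthrough.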
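To see which devices share an IOMMU group with a given PCI device, the sysfs tree can be inspected directly. The following is a minimal sketch, not part of the original report: the function name iommu_group_members and the SYSFS_ROOT variable are hypothetical (SYSFS_ROOT exists only so the function can be tried against a test directory; on a real host it defaults to /sys).

```shell
#!/usr/bin/env bash
# Sketch: list every PCI device in the same IOMMU group as the given device.
# iommu_group_members and SYSFS_ROOT are illustrative names (assumptions),
# not part of any OpenStack or kernel tooling.
iommu_group_members() {
    local dev="$1"
    local sysfs="${SYSFS_ROOT:-/sys}"
    local group_link="$sysfs/bus/pci/devices/$dev/iommu_group"

    if [ ! -e "$group_link" ]; then
        echo "no IOMMU group found for $dev" >&2
        return 1
    fi

    # iommu_group is a symlink to .../kernel/iommu_groups/<group number>
    local group_dir
    group_dir="$(readlink -f "$group_link")"

    echo "group $(basename "$group_dir")"
    # Each entry under devices/ is one PCI address belonging to the group
    ls "$group_dir/devices"
}
```

On the Compute node described here, calling iommu_group_members 0000:2f:00.0 would be expected to report group 19 and list both the GPU and the InfiniBand device.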
In fact, after unbinding the InfiniBand device from its driver on the host, we successfully created the instance.
echo -n "0000:25:00.0" > /sys/bus/pci/drivers/mlx5_core/unbind
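A possible solution is to make this change persistent with driverctl by overriding the driver of the InfiniBand device, so that it is bound to vfio-pci or, alternatively, to pci-stub instead of mlx5_core. Apply only one of the following overrides, since a second override replaces the first:
sudo driverctl set-override 0000:25:00.0 vfio-pci
sudo driverctl set-override 0000:25:00.0 pci-stub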
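After changing a driver binding, it can be useful to confirm which driver the device is actually bound to. The sketch below reads the driver symlink in sysfs; the function name pci_driver_of and the SYSFS_ROOT variable are hypothetical and used only for illustration (SYSFS_ROOT defaults to /sys on a real host).

```shell
#!/usr/bin/env bash
# Sketch: print the kernel driver a PCI device is currently bound to,
# or "none" if it is unbound. pci_driver_of and SYSFS_ROOT are
# illustrative names (assumptions), not an existing tool.
pci_driver_of() {
    local dev="$1"
    local sysfs="${SYSFS_ROOT:-/sys}"
    local driver_link="$sysfs/bus/pci/devices/$dev/driver"

    if [ -L "$driver_link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$driver_link")"
    else
        echo "none"
    fi
}
```

For example, pci_driver_of 0000:25:00.0 should print vfio-pci or pci-stub once one of the overrides is in effect, rather than mlx5_core. The same information is shown in the "Kernel driver in use" line of lspci -k -s 0000:25:00.0.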