VFIO Woes

14/Jan/23 Derek Jarvis

The saying "Newer is not always better" definitely applies when working with VFIO device passthrough. With everything still being developed and standards being formed, it's pretty common for new versions and minor changes to shatter your Jenga tower. I guess it can't be the bleeding edge without some blood.

Background

I have a home lab with a Ryzen 3700X and 64 GB of RAM that has been running many virtual machines (including two gaming PCs) for quite some time. I got a nice upgrade for Christmas: a Ryzen 5950X and 128 GB of RAM. DOUBLE the cores, and DOUBLE the memory.

At first I had some booting trouble because the BIOS on my X570 TUF Gaming motherboard had never been upgraded, so it wouldn't work with the latest processors. Once I got the new BIOS in place, the real fun started.

I decided to upgrade Proxmox to the latest - I went from 6.3 to 7.3, and opted for the newest kernel: 6.1. Everything was booting fine, except my Windows guest. I had read some mentions of issues with Windows guests and special steps that needed to be taken, but I decided to take my chances, because the guest was in need of a fresh install anyway. After the Proxmox upgrade, the guest wouldn't boot anymore and I was stuck in UEFI hell. After an evening of attempting Windows repairs and other fixes, I just made a new machine.

So now I have a fresh machine and things should work well, right? WRONG.

I had issues with passing my GPU through to the guest. My old approach of blacklisting drivers and assigning PCI IDs to vfio wasn't working anymore. There was some mention on forums that those hacks were no longer needed, and that the kernel and display drivers could now smoothly hand devices over to VFIO for use in the guest.

I spent a lot of time (and 50 reboots) jumping between all the settings and troubleshooting various things. In the end, it turned out that the problem wasn't VFIO at all, but the QEMU machine type the guest was using. Once I switched the guest from q35-7.1 to q35-7.0, things started working. From there, I did more testing to find the optimum configuration.
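If you want to see which machine type a guest is pinned to, Proxmox keeps it in the guest's config file under /etc/pve/qemu-server/. Here's a rough sketch for reading it. The function name machine_type_of and the VM ID 100 are made up for illustration, and the pinned version shows up in the config as something like pc-q35-7.0.

```shell
#!/bin/bash
# Print the value of the "machine:" key from a Proxmox VM config file.
# Prints nothing if the key is absent, which means the guest is not pinned
# and uses whatever default the installed QEMU provides.
machine_type_of() {
    local conf
    conf="$1"
    # Snapshot sections start with a "[name]" header; stop there so only
    # the live configuration at the top of the file is considered.
    awk '/^\[/ { exit } $1 == "machine:" { print $2; exit }' "$conf"
}

# Example (VM ID 100 is hypothetical):
#   machine_type_of /etc/pve/qemu-server/100.conf
```

Changing the pin should then be a matter of something like qm set 100 --machine pc-q35-7.0, or picking the version in the web UI under the guest's Hardware tab.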

End State

With these settings I am able to pass all 3 of my GPUs to guest machines. I am able to restart machines that have unexpected shutdowns because vendor-reset fixes power cycling issues with the older RX cards. The only missing piece is that the primary GPU, the one the host uses to display its boot output, cannot properly detect the dummy HDMI plugs I have installed. I am currently working around that by using a virtual display driver.

vfio.config

options kvm ignore_msrs=1
softdep amdgpu pre: vfio vfio_pci

Since there are no MSR errors in the logs, I could probably get rid of ignore_msrs, but it's working so I'm leaving it alone.

blacklist

none!

grub

GRUB_CMDLINE_LINUX_DEFAULT="quiet hugepagesz=1GB hugepages=1 iommu=pt pci=noaer nomodeset initcall_blacklist=sysfb_init"
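
Edits to this line only take effect after regenerating the boot config and rebooting, so it's worth confirming the running kernel actually got the parameters. A minimal sketch, assuming the host boots through GRUB. The helper name missing_params is my own invention.

```shell
#!/bin/bash
# Print every expected parameter that is absent from a kernel command line.
# $1 is the command line string, the remaining arguments are the parameters
# to look for. Matching is on whole space-separated words only.
missing_params() {
    local cmdline p
    cmdline="$1"
    shift
    for p in "$@"; do
        case " $cmdline " in
            *" $p "*) ;;          # present, nothing to report
            *) echo "$p" ;;       # absent, print it
        esac
    done
}

# Example against the running kernel:
#   missing_params "$(cat /proc/cmdline)" iommu=pt pci=noaer nomodeset initcall_blacklist=sysfb_init
```

No output means everything made it through. If something is listed, run update-grub and reboot. Hosts that boot with systemd-boot keep the command line in /etc/kernel/cmdline instead and need proxmox-boot-tool refresh.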

modules

# for fixing GPUs
vendor-reset

# vfio
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
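
After a reboot, a quick way to confirm these modules actually loaded is to look them up in /proc/modules. This is only a sketch: module_loaded is a hypothetical helper, and it will report a miss for anything compiled into the kernel rather than built as a module.

```shell
#!/bin/bash
# Succeed if the named module appears in the module list.
# $1 is the module name, $2 an optional list file (defaults to /proc/modules).
# The kernel lists modules with underscores, so "vendor-reset" is looked up
# as "vendor_reset".
module_loaded() {
    local name list
    name=$(printf '%s' "$1" | tr '-' '_')
    list="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$list"
}

# Example:
#   for m in vendor-reset vfio vfio_iommu_type1 vfio_pci vfio_virqfd; do
#       module_loaded "$m" || echo "not loaded: $m"
#   done
```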

GPU start script (stolen from some forum)

#!/bin/bash
echo device_specific > /sys/bus/pci/devices/0000:05:00.0/reset_method
echo device_specific > /sys/bus/pci/devices/0000:06:00.0/reset_method
echo device_specific > /sys/bus/pci/devices/0000:0c:00.0/reset_method
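
The script above fails quietly if a card moves to a different PCI address or the kernel doesn't expose reset_method for it. Here's a slightly more defensive sketch of the same idea. The function name and the SYSFS_PCI override are my own additions, and the override exists only so the function can be tried against a fake directory tree.

```shell
#!/bin/bash
# Write "device_specific" to reset_method for each PCI address given.
# Returns non-zero if any device was skipped or the write was rejected.
set_device_specific_reset() {
    local root rc dev attr
    root="${SYSFS_PCI:-/sys/bus/pci/devices}"
    rc=0
    for dev in "$@"; do
        attr="$root/$dev/reset_method"
        if [ -w "$attr" ]; then
            # The kernel rejects the write if the device has no
            # device-specific reset available.
            echo device_specific > "$attr" || rc=1
        else
            echo "skip $dev: $attr missing or not writable" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example with the addresses from my machine (yours will differ):
#   set_device_specific_reset 0000:05:00.0 0000:06:00.0 0000:0c:00.0
```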

Errors you can ignore

There are a number of errors, warnings, and messages that look like a problem but are just distractions. Even with a system that is fully working (except the primary HDMI plugs) some of these errors are still present.

AMD-Vi: Completion-Wait loop timed out
iommu ivhd0: AMD-Vi: Event logged [IOTLB_INV_TIMEOUT device=0000:05:00.0 address=0x10022ebc0]
These errors seem to appear when vendor-reset is trying to reset the primary GPU.

vfio_ecap_init: hiding ecap 0x19@0x270
This message happens all the time, even with the fully working video cards. It seems safe to ignore.

proxmox kernel: [drm:detect_link_and_local_sink [amdgpu]] *ERROR* No EDID read.
proxmox kernel: amdgpu 0000:05:00.0: [drm] Cannot find any crtc or sizes
These messages happen when amdgpu DC is running and failing to configure the HDMI ports on the primary GPU.

I have tried working around this error with things like disabling amdgpu's display configuration (amdgpu.dc=0) and hard-coding EDIDs (drm.edid_firmware=edid/1280x1024.bin), but those fixes did not work.

Using nomodeset helps these messages go away, but doesn't make the HDMI plug in the primary GPU usable by the guest.

Errors that mean it's not going to work

vfio-pci 0000:05:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0025 address=0x11f82b200 flags=0x0000]
amd_iommu_report_page_fault: 1854 callbacks suppressed
AMD-Vi: IOMMU event log overflow
These errors mean there's something seriously wrong with VFIO trying to access the device and it's not going to work.
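
To save some squinting at dmesg, the two lists above can be turned into a small filter. This is just a sketch based on the messages I saw on my own system. The function name and the "FATAL" and "noise" labels are made up, and the patterns are plain substring matches, nothing smarter.

```shell
#!/bin/bash
# Read kernel log lines on stdin and label the ones discussed in this post.
# Lines that match neither list are dropped.
triage_vfio_log() {
    local line
    while IFS= read -r line; do
        case "$line" in
            *IO_PAGE_FAULT*|*"IOMMU event log overflow"*)
                echo "FATAL: $line" ;;
            *"Completion-Wait loop timed out"*|*IOTLB_INV_TIMEOUT*|*"hiding ecap"*|*"No EDID read"*|*"Cannot find any crtc or sizes"*)
                echo "noise: $line" ;;
        esac
    done
}

# Example:
#   dmesg | triage_vfio_log
#   journalctl -k -b | triage_vfio_log
```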

Wrapping Up

It was an annoying but rewarding experience. I'm pretty sure I leveled up my VFIO skills. My primary reason for writing this blog is the hope that it will help someone, somewhere, facing similar issues.