Everything you wanted to know about SR-IOV in Hyper-V Part 8
This part of the series is all about determining why SR-IOV may not be operational. As you will discover, there are several reasons, some of them obvious if you’ve followed all the parts so far, some more subtle. By the end of this part, you will be an expert!
Assuming you have a switch in SR-IOV mode, and have enabled SR-IOV on a virtual network adapter, the most obvious place you will notice that SR-IOV isn’t working is in Hyper-V Manager after selecting the networking tab for a running virtual machine. (I love this panel – my favourite bit of Hyper-V Manager that I worked on for the Windows “8” release!)
I’ve already outlined dependencies from past posts. But let’s assume you haven’t heeded them and have done this on an older machine which isn’t SLAT capable, doesn’t have BIOS support for SR-IOV and doesn’t even have an SR-IOV capable network adapter, as in the following screenshot. The first clues will come from the Get-VMHost PowerShell cmdlet. In this case, the IovSupportReasons property returned from the cmdlet is pretty verbose in outlining a number of issues.
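For anyone following along at a prompt, here is a minimal sketch of that host-level check. It assumes an elevated PowerShell session on the Hyper-V host with the Hyper-V module available, and it only reads state.

```powershell
# Show only the SR-IOV related properties of the host.
Get-VMHost | Format-List Iov*

# IovSupportReasons is an array of strings. Emitting it directly prints one
# reason per line, which avoids truncation of long messages.
(Get-VMHost).IovSupportReasons
```

On a machine that is ready for SR-IOV, the second command should print a single line, OK.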
Essentially you’re never going to get SR-IOV working on the above machine. So let’s move on…
The following example is a machine which has chipset support, but the BIOS doesn’t have support for SR-IOV. This is probably the most common error you will find on servers currently shipping, or if you were to install Windows Server “8” beta on a desktop class machine. The error specifically is the first entry which says “To use SR-IOV on this system, the system BIOS must be updated to allow Windows to control PCI Express. Contact your system manufacturer for an update.”
Next, let’s assume that the machine has chipset support, the BIOS has SR-IOV support, and you’re using a NIC which is capable of SR-IOV, but it still isn’t working. In this case, Get-VMHost may return the following:
In addition, after a virtual network adapter is started (by changing the state of a virtual machine to running, or by toggling the IOVWeight property on a running virtual network adapter to a positive value in the range 1..100) the following may be logged in the event log, indicating that the use of SR-IOV has been disabled by policy on this system.
The reason for this takes a little explaining. Even if the system manufacturer has made the necessary changes in their BIOS for the base functionality Windows requires to support SR-IOV, some chipset implementations have flaws in them. In some cases, system manufacturers may be able to work around the problem with a fix in firmware. This is not universally true, and it may be the case that the problem requires a revision to silicon and cannot be fixed by firmware alone (in other words, a revised motherboard). The result of these chipset flaws is that a guest operating system which has a VF assigned can cause the physical system to operate with reduced performance, or in the worst case cause it to crash.
If you are prepared to assign VFs only to “trusted” workloads in lieu of an updated BIOS with a workaround (assuming it is possible on your hardware), the following registry value can be added on the parent partition: name IOVEnableOverride, type DWORD, value 1, under the key HKLM\Software\Microsoft\Windows NT\CurrentVersion\Virtualization. The system should be restarted after setting this value. (Technically you could instead restart the VMMS service and save/restore each running VM which has an IOVWeight set.)
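As a rough sketch, the registry value described above could be created from an elevated PowerShell prompt on the parent partition like this. The key path, value name, type and data come from the text above; the use of New-ItemProperty is just one way to set it, and the restart is left commented out on purpose.

```powershell
# Key named in the article, expressed as a PowerShell registry provider path.
$path = 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization'

# Create (or overwrite) the DWORD value IOVEnableOverride = 1.
New-ItemProperty -Path $path -Name 'IOVEnableOverride' -PropertyType DWord -Value 1 -Force | Out-Null

# Read the value back to confirm it was written.
Get-ItemProperty -Path $path -Name 'IOVEnableOverride'

# Restart the host when convenient so the override takes effect.
# Restart-Computer
```

Only do this if you accept the risk described above: with the override set, a guest with a VF assigned may be able to degrade or crash the physical system.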
After the restart, the following event will be logged on each startup. As long as you are comfortable and understand the potential risk involved, SR-IOV should now work on a system with this registry key set.
If your system manufacturer can work around the chipset flaw, and has provided a BIOS which incorporates a workaround, the registry key is not required, the event above will not be logged, and VFs can be securely assigned to virtual machines. In these cases, if a virtual machine with a virtual function assigned triggers the conditions which would otherwise cause the symptoms previously described, Hyper-V will automatically remove the VF from the VM and let it continue running using software based networking. Note, though, that if a VM is able to trigger one of these conditions, it is extremely likely that the guest operating system is compromised and will crash very soon after. The remainder of the system, including other running VMs, will not be affected.
The next useful cmdlet is Get-NetAdapterSriov. This cmdlet gives a lot of useful information about the physical network adapter, assuming it supports SR-IOV.
It’s pretty telling that nothing was returned. A clear indication that there are no SR-IOV capable network adapters. Let’s instead run this on a machine which does have an SR-IOV capable network adapter.
The fact that something was returned indicates the network adapter is SR-IOV capable. Furthermore, looking at NumVFs, we can see that this adapter is working correctly and has available resources.
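A minimal sketch of that check, assuming an elevated prompt on the host. The property names are the ones Get-NetAdapterSriov reports in its default output; Format-List is only used to keep the view short.

```powershell
# List SR-IOV capable physical adapters. No output at all means that no
# SR-IOV capable network adapter was found on this machine.
Get-NetAdapterSriov |
    Format-List Name, InterfaceDescription, Enabled, SriovSupport, SwitchName, NumVFs
```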
If you’ve created a virtual switch, the third useful cmdlet is Get-VMSwitch. Remember that to enable SR-IOV, the switch must be created in SR-IOV mode to start with. When SR-IOV is not available on the physical NIC, there are a number of properties which indicate why. IovVirtualFunctionCount and IovQueuePairCount will be zero. IovSupport will be false, and IovSupportReasons will list the reasons why.
First an example where the machine itself does not support SR-IOV, and the switch is bound to a network adapter which doesn’t support SR-IOV either.
Here’s an example where the machine does support SR-IOV, but the physical network adapter does not. IovSupportReasons is clear as to the cause of the problem, regardless of whether the virtual switch is created with SR-IOV enabled or not.
And another example where the machine supports SR-IOV, as does the physical network adapter, but the switch was not created in SR-IOV mode. This one is a bit more subtle to spot as IovSupport and IovSupportReasons indicate everything is OK. The property IovEnabled is False, hence IovVirtualFunctionCount is zero even though the physical NIC has resources potentially available.
On a “good” (well configured) machine, you will get very different results in these properties. Notice how there is a positive integer in IovVirtualFunctionCount, IovSupport is True, and IovSupportReasons has a single value in the array, “OK”.
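The switch-level checks above can be sketched as follows. The switch name and adapter name in the second command are placeholders, not values from this article, and the sketch assumes the physical adapter is not already bound to another virtual switch.

```powershell
# Compare the SR-IOV properties of every virtual switch on the host.
Get-VMSwitch | Format-List Name, Iov*

# A switch has to be created in SR-IOV mode to start with, so a switch showing
# IovEnabled : False needs to be recreated. 'SR-IOV-Switch' and 'Ethernet 5'
# are hypothetical names; substitute your own.
New-VMSwitch -Name 'SR-IOV-Switch' -NetAdapterName 'Ethernet 5' -EnableIov $true
```

After recreating the switch, rerun the first command and look for a positive IovVirtualFunctionCount and an IovSupportReasons value of OK.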
The last cmdlet is Get-VMNetworkAdapter. This should be run against a running VM’s network adapter. Here again is an example from a physical machine which does not support SR-IOV, and does not have an SR-IOV capable network adapter. Even though the IovWeight property is non-zero, note that IovQueuePairsAssigned and IovUsage are zero, and Status and StatusDescription contain a slew of reasons why the network adapter is degraded.
Here’s the same on a “good” machine for comparison. Notice that IovUsage is 1.
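A short sketch of the per-adapter check, assuming a running VM. 'MyVM' is a placeholder name, not one used in this series.

```powershell
# Show the SR-IOV related state of each network adapter on the VM.
Get-VMNetworkAdapter -VMName 'MyVM' |
    Format-List Name, IovWeight, IovQueuePairsAssigned, IovUsage, Status, StatusDescription

# Request a VF by setting a weight in the range 1..100.
# A weight of 0 turns SR-IOV off for the adapter.
Set-VMNetworkAdapter -VMName 'MyVM' -IovWeight 100
```

If IovWeight is non-zero but IovUsage stays at 0, work back up the chain: the virtual switch, then the physical adapter, then the host.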
The above has covered the common cases, but there are slightly more subtle ones, those around when port policies have been applied. See if you can spot what’s wrong in the following output. In this case, the machine is fully capable of SR-IOV, the virtual switch is in SR-IOV mode, and the IovWeight has been set on the network adapter correctly. It’s none of the reasons described so far.
Unfortunately, the StatusDescription isn’t overly helpful in indicating the precise reason. In fact, for several technical reasons, this is something which is incredibly difficult to accurately provide, so is unlikely to change before final release. Instead, we need to look at the policies which have been applied. In this particular case, I enabled RouterGuard on the VM. When we apply policy which can only be enforced by the virtual switch, and not the physical NIC, we automatically disable the use of SR-IOV on the VM so that the policy can be applied. Turning off any such policies (assuming they are compatible with the networking configuration requirements of the VM) will enable SR-IOV to start operating again.
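To check for this case, a sketch along these lines may help. 'MyVM' is again a placeholder. Only RouterGuard is named in this post as a policy that disables SR-IOV; the other properties are listed simply because they are port settings worth inspecting, not because the post confirms they have the same effect.

```powershell
# Inspect a few port policies alongside IovUsage for each adapter on the VM.
Get-VMNetworkAdapter -VMName 'MyVM' |
    Format-List Name, RouterGuard, DhcpGuard, MacAddressSpoofing, PortMirroringMode, IovUsage

# Turn RouterGuard back off (only if the VM's networking requirements allow it),
# then rerun the command above and check IovUsage again.
Set-VMNetworkAdapter -VMName 'MyVM' -RouterGuard Off
```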
Now I did mention it in an earlier post, but if you are still struggling to get SR-IOV enabled and you believe you have everything you should need (chipset, latest BIOS, BIOS settings, NIC, virtual switch in SR-IOV mode), there is one other thing that is definitely worth checking. Some BIOSes have more than one firmware setting to enable SR-IOV. If in doubt, always go back to your system manufacturer’s documentation to make sure you have the settings configured correctly. And remember, if you do change BIOS settings, you may need to hard power cycle the machine, not just do a soft restart.
There are two other reasons worth mentioning. One is if you are using client Hyper-V. As SR-IOV is a server only feature, the user interface for SR-IOV does not exist in Hyper-V Manager on client. (Note that the SR-IOV options will appear though if you are using Hyper-V Manager on a client connecting to a remote Windows “8” server.)
If you run Get-VMHost on a client, it will indicate that SR-IOV is not supported.
And similarly for a virtual switch (sadly my laptop doesn’t have a 10G network adapter that supports SR-IOV either – next upgrade :) )
So that’s pretty much it in terms of diagnosing why SR-IOV may not be operating. If you understood all the above, you are now a fully-fledged superhero and have earned your cape with honours!
Probably one more part to come in this series, the “kitchen sink” part, as in everything not already mentioned. That will hopefully be early next week after I find time to write it.
Cheers,
John.
Comments
Anonymous
January 01, 2003
Ramesh - I can't talk about futures. Networking is currently the only I/O class supported today. No, it is not possible to write your own PF/VF drivers for an arbitrary device class - you would require support from the VSP/VSC and API hooks to permit this, the same (or similar at least) way we have done for NDIS.
Anonymous
January 01, 2003
ASUS KGPE-D16 latest BIOS (dual 6328's, ECC memory) and IOMMU enabled, Intel I350-T4 installed with latest Intel drivers, reg edit enabled and SRIOV / VMQ switch created but still VM (Storage Server 2012) Networking tab indicates degraded. Anyone have success with this ASUS board or AMD CPU's and SR-IOV?
Anonymous
January 01, 2003
Ramesh - no, this is not possible.
Anonymous
January 01, 2003
Mark - the group which does the work on the Linux Integration Services (LIS) for Hyper-V are looking at what it will take to support SR-IOV, but I cannot provide any timeframe for when it might be made available.
Anonymous
January 01, 2003
Andrey - this is a question for Cisco. It's possibly driver related. Thanks, John.
Anonymous
January 01, 2003
Joshua - see this deployment guide for LBFO: www.microsoft.com/.../details.aspx Thanks, John.
Anonymous
January 01, 2003
RadsKSA - what is the output of get-vmhost (see example above)?
Anonymous
March 07, 2013
I have been implementing SR-IOV on a set of 4 Dell M420 blades using Broadcom 57810 CNAs on Windows Server 2012. The 4 servers are configured as a Hyper-V cluster. I am running a couple of test VMs with jumbo frames enabled. Each test VM is configured with a pair of virtual NICs connected to 2 SR-IOV enabled vSwitches. I am using a NIC team enabled within the VM OS to provide resiliency. I was able to get the VMs to run fine with SR-IOV enabled on both virtual NICs on the hosts I built them on. Later I decided to live migrate the VMs to other hosts. The Live Migration itself works just fine.
I noticed that when I migrated a VM from its original host to a new host for the first time, SR-IOV breaks. For some reason when the VM migrates it "resets" the "Jumbo Frames" setting from 9014 to 1514 on the Broadcom VF devices. This causes it to become out of sync with the Jumbo Frames setting on the Hyper-V Network Adapter and SR-IOV breaks. If I go in to the VM and manually adjust the Jumbo Frames size back to 9014 on the Broadcom VF device then SR-IOV works fine again. If I migrate the VM to a third host the same issue occurs again. Once I have fixed the Jumbo Frames issue for the VM when it resides on a particular host, it does not reoccur on that host if the VM is migrated away and then back.
My hypothesis is that this might be some kind of PnP related issue causing it to lose the Broadcom VF during migration and then reestablish it after migration, thus creating the new VF with the default 1514 frame size. Any help or ideas would be greatly appreciated.
Anonymous
April 30, 2013
Hello John, Thanks for an excellent write up. I have set up a lab with an HP 380p G8 and HP 530FLR-SFP+ based on a Broadcom chip. A couple of points:
When provisioning the server, I forgot to turn on SR-IOV in BIOS before installing, resulting in 2 new NICs exposed to the host OS. A lot of wasted time configuring network settings on new adapters and removing settings on old ones still present in the registry.
I noticed that before switching on SR-IOV in BIOS I had VMQ active on my 2008R2 guest OS. This was with a switch without SR-IOV enabled. Get-VMSwitch produced a positive number for AvailableVMQueues and AvailableIPSecSA. After enabling SR-IOV these values are 0. In my setup, SR-IOV seems to exclude these other accelerator means, which is contrary to your output with the Intel adapters. What is your take on this?
After turning on SR-IOV in BIOS and creating a new switch, SR-IOV works fine on a Windows 2012 guest OS. The VF is present and installing "physical" NIC drivers was all it took. No VFs are exposed to my Windows 2008R2 guests. Intel says it is supported, VMware says you need Windows 2008R2 SP2. I find information that claims there has to be a Windows 2012/8 OS as a guest. If that is the case, someone should write that IN BIG LETTERS. Any comments? Regards, Henning
Anonymous
May 05, 2013
Thanks again John, Saves me a lot of time with those spot on answers. Just a quick question: What would be the equivalent to get-vmswitch | fl on a 2008R2 Hyper-V role to see the number of VMQs available for provisioning to your guests? Thanks, Henning
Anonymous
May 14, 2013
Hello John
Shortly this is what I have while configuring SR-IOV with Cisco VM-FEX on UCS C200-M2:
PS C:\> Get-NetAdapterSriov
Name : Ethernet 5
InterfaceDescription : Cisco VIC Ethernet Interface #2
Enabled : True
SriovSupport : Supported
SwitchName : Default Switch Name
NumVFs : 128
Same time vmhost shows:
PS C:\Users\Administrator> get-vmhost | fl "Iov*"
IovSupport : True
IovSupportReasons : {OK}
Same time vmswitch connected to adapter shows:
PS C:\> Get-VMSwitch SR-IOV-SW | fl Iov*
IovEnabled : True
IovVirtualFunctionCount : 0
IovVirtualFunctionsInUse : 0
IovQueuePairCount : 0
IovQueuePairsInUse : 0
IovSupport : False
IovSupportReasons : {This network adapter does not support SR-IOV.}
Anonymous
May 27, 2013
It's great but how about Linux? Broadcom's supports SR-IOV on Fedora/RedHat...? Thanks!
Anonymous
June 07, 2013
John, Do you plan on supporting SR-IOV for Storage Cards as well? Another question is can we write our own PF drivers, VF drivers for storage following the driver model example of PF Driver and VF Drivers for networking?
Anonymous
June 18, 2013
Thanks for your response to my earlier questions. Another question - If one wants to implement SR-IOV for Storage Cards, can they write their own proprietary PF & VF drivers on Windows without waiting for Microsoft support on this. It might be complex, but I want to find out if this can be done.
Anonymous
September 11, 2013
HP finally releases Broadcom drivers to allow simultaneous use of VMQ/SRIOV offloading on the same switch. This is more than a year down the road of DL380p’s lifespan.
ComputerName : S4
Name : HP 10G SR-IOV
Id : f4e745f8-4ba3-47aa-a688-cb572192a6fc
Notes :
SwitchType : External
AllowManagementOS : False
NetAdapterInterfaceDescription : HP Ethernet 10Gb 2-port 530FLR-SFP+ Adapter #85
AvailableVMQueues : 29
NumberVmqAllocated : 7
IovEnabled : True
IovVirtualFunctionCount : 16
IovVirtualFunctionsInUse : 1
IovQueuePairCount : 101
IovQueuePairsInUse : 1
AvailableIPSecSA : 0
NumberIPSecSAAllocated : 0