Deploying Windows Server 2012 Beta with SMB Direct (SMB over RDMA) and the Mellanox ConnectX-2/ConnectX-3 using InfiniBand – Step by Step
This blog post is now obsolete. For the latest post on this topic, visit https://blogs.technet.com/b/josebda/archive/2012/07/31/deploying-windows-server-2012-with-smb-direct-smb-over-rdma-and-the-mellanox-connectx-2-connectx-3-using-infiniband-step-by-step.aspx
1) Introduction
We have covered the basics of SMB Direct and some of the use cases in previous blog posts and TechNet articles.
If you haven’t seen those, here are a few pointers:
- Windows Server 2012 Beta SMB Overview
- High-Performance, Continuously Available File Share Storage for Server Applications Technical Preview
- Deploying Fast and Efficient File Servers for Server Applications
- Windows Server 2012 beta - Test cases for Hyper-V over SMB (includes PowerShell examples)
- Building Your Cloud Infrastructure: Converged Data Center with File Server Storage
However, I get a lot of questions about which specific cards work with this new feature and exactly how to set them up.
This is one of a few blog posts that cover specific instructions for RDMA NICs.
In this blog post, we’ll cover all the details to deploy the Mellanox ConnectX-2 and ConnectX-3 adapters, using the InfiniBand “flavor” of RDMA.
2) Hardware and Software
To implement and test this technology, you will need:
- Two or more computers running Windows Server 2012 beta
- One or more Mellanox ConnectX-2 or ConnectX-3 adapters for each server
- One or more Mellanox InfiniBand switches
- Two or more cables required for InfiniBand (typically using QSFP connectors)
Mellanox states support for Windows Server 2012 SMB Direct and Kernel-mode RDMA capabilities on the following adapter models:
- Mellanox ConnectX-2. This card uses Quad Data Rate (QDR) InfiniBand at 32 Gbps data rate.
- Mellanox ConnectX-3. This card uses Fourteen Data Rate (FDR) InfiniBand at 54 Gbps data rate.
You can find more information about these adapters on Mellanox’s web site.
Important note: The older Mellanox InfiniBand adapters (including the ConnectX-1 adapters and the InfiniHost III adapters) won't work with SMB Direct in Windows Server 2012.
Here are some examples of configurations you can use to try the Windows Server 2012 Beta:
2.1) Two computers using QDR
If you want to set up a simple pair of computers to test SMB Direct, you simply need two InfiniBand cards and a back-to-back cable. This could be used for simple testing, like one file server and one Hyper-V server. If you want the most affordable InfiniBand solution, you can use a single-port QDR card, which operates at a 32 Gbps data rate. Here are the parts you will need:
| Qty | Part# | Description |
| --- | --- | --- |
| 2 | MHQH19B-XTR | ConnectX-2, Single port, QSFP connector, QDR InfiniBand |
| 1 | MC2206130-001 | QSFP to QSFP cable, 1m (3ft) |
2.2) Eight computers using QDR
If you want to try a more realistic configuration with InfiniBand, you could set up a two-node file server cluster connected to a six-node Hyper-V cluster. In this setup, you will need 8 computers, each with an InfiniBand card. You will also need a switch with at least 8 ports (Mellanox offers an 8-port model). Using QDR speeds, you’ll need the following parts:
| Qty | Part# | Description |
| --- | --- | --- |
| 8 | MHQH19B-XTR | ConnectX-2, Single port, QSFP connector, QDR InfiniBand |
| 8 | MC2206130-001 | QSFP to QSFP cables, 1m (3ft) |
| 1 | MIS5022Q-1BFR | IS5022 InfiniBand Switch, 8 ports, QSFP, QDR |
2.3) Two computers using FDR
You may also try the faster FDR speeds (54 Gbps data rate). The minimum setup in this case would again be two cards and a cable. Please note that the QDR and FDR cables are different, although they use similar connectors. Here’s what you will need:
| Qty | Part# | Description |
| --- | --- | --- |
| 2 | MCX353A-FCBT | ConnectX-3 adapter, Single port, QSFP, FDR InfiniBand |
| 1 | MC2207130-001 | QSFP to QSFP cables (FDR), 1m (3ft) |
Please note that you will need a system with PCIe Gen3 slots to achieve the rated speed of this card. These slots are available on newer systems, like those equipped with an Intel Romley motherboard. If you use an older system, the card will be limited by the speed of the older PCIe Gen2 bus.
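If you want to double-check which PCIe link your card actually negotiated, the inbox NetAdapter cmdlets can report it. This is just a quick sketch, assuming your adapter’s interface description contains “Mellanox” (adjust the match to your system); the output includes the negotiated PCIe link speed and width:
# Find the Mellanox adapter(s) and show their hardware slot information,
# including the negotiated PCIe link speed and width
Get-NetAdapter | Where-Object InterfaceDescription -Match "Mellanox" | Get-NetAdapterHardwareInfo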
2.4) Ten computers using dual FDR cards
If you’re interested in experiencing great throughput in a private cloud setup, you could configure a two-node file server cluster plus an eight-node Hyper-V cluster. You could also use two InfiniBand cards for each system, for added performance and fault tolerance. In this setup, you would need 20 FDR cards and a switch with at least 20 FDR ports (Mellanox sells a model with 36 FDR ports). Here are the parts required:
| Qty | Part# | Description |
| --- | --- | --- |
| 20 | MCX353A-FCBT | ConnectX-3 adapter, Single port, QSFP, FDR InfiniBand |
| 20 | MC2207130-001 | QSFP to QSFP cables (FDR), 1m (3ft) |
| 1 | SX6036 | InfiniBand Switch, 36 ports, QSFP, FDR |
3) Download and update the firmware and driver
Windows Server 2012 Beta includes an inbox driver for the Mellanox ConnectX-2 and ConnectX-3 cards. However, Mellanox provides updated firmware and drivers for download. You should be able to use the inbox driver to access the Internet to download the updated driver.
The latest Mellanox drivers can be downloaded from: https://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=129&menu_section=34. The package is provided to you as a single executable file. Simply run the EXE file to update the firmware and driver. This package will also install Mellanox tools on the server.
- Run the setup package: C:\MLX\MLNX_VPI_win8_beta.exe (you can choose between complete or custom setup types)
- The installer will detect if you have at least one card with old firmware and will prompt you to update it
Note 1: For more detailed information on how to install the package, refer to the Mellanox WinOF for Windows 8 Quick Start Guide
Note 2: This package does not update firmware for OEM cards. If you are using this type of card, contact your OEM.
Note 3: Certain Intel Romley systems won't boot Windows Server 2012 Beta when an old Mellanox firmware is present. You will need to update the firmware of the Mellanox card using another system before you can use that Mellanox card on the Intel Romley system. That issue might also be addressed in certain cases by updating the firmware/BIOS of the Intel Romley system.
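To confirm which driver version the adapter is using before and after running the Mellanox package, a quick check like the sketch below may help (the “Mellanox” match is an assumption; adjust it to your adapter’s interface description):
# Show the driver version and date currently bound to the Mellanox adapter(s)
Get-NetAdapter | Where-Object InterfaceDescription -Match "Mellanox" | Select-Object Name, InterfaceDescription, DriverVersion, DriverDate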
4) Configure a subnet manager
When using an InfiniBand network, you are required to have a subnet manager running. The best option is to use a managed InfiniBand switch (which runs a subnet manager), but you can also install a subnet manager on a computer connected to an unmanaged switch. Here are some details:
4.1) Best option – Using a managed switch with a built-in subnet manager
For this option, make sure you use managed switches. These switches come ready to run their own subnet manager, and all you have to do is enable that option using the switch’s web interface.
4.2) Using OpenSM with a single unmanaged switch
If you don’t have a managed switch, you can use one of the computers running Windows Server 2012 Beta to run your subnet manager. When you installed the Mellanox tools in step 3, you also installed the OpenSM.EXE tool, which is a subnet manager that runs on Windows Server. You want to make sure you install it as an auto-starting service.
Although the installation program configures OpenSM to run as a service, it misses the parameter to limit the log size. Here are a few commands to remove the default service and add a new one that has all the right parameters and starts automatically. Run them from a PowerShell prompt running as Administrator:
SC.EXE delete OpenSM
New-Service –Name "OpenSM" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -L 128" -DisplayName "OpenSM" –Description "OpenSM" -StartupType Automatic
Start-Service OpenSM
Note 1: This assumes that you installed the tools to the default location: C:\Program Files\Mellanox\MLNX_VPI
Note 2: For fault tolerance, make sure you have two computers on your network configured to run OpenSM. It is not recommended to run OpenSM on more than two computers connected to a switch.
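Once the commands above have run, you can sanity-check that the service exists, is running, and is set to start automatically. A minimal sketch, using the same service name created above:
# Confirm the OpenSM service is running
Get-Service -Name OpenSM
# Show the service configuration, including the start type (should be AUTO_START)
SC.EXE qc OpenSM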
4.3) Using OpenSM with two unmanaged switches
For complete fault tolerance, you want to have two switches and have two cards (or a dual-ported card) per computer, one going to each switch. With SMB Multichannel, you get fault tolerance in case a single card, cable or switch has a problem. However, each instance of OpenSM can only handle a single switch. In this case, you need two instances of OpenSM.EXE running on the computer, one for each card, working as a subnet manager for each of the two unmanaged switches.
Next, you need to identify the two ports you have on the system (either on a single dual-ported card or on two single-ported cards). To do this, run the IBSTAT tool from Mellanox, which will show you the identification for each InfiniBand port in your system (look for the lines showing the port GUIDs). Here’s a sample showing the two port GUIDs:
PS C:\> ibstat
CA 'ibv_device0'
CA type:
Number of ports: 2
Firmware version: 0x20009209e
Hardware version: 0xb0
Node GUID: 0x0002c903000f9956
System image GUID: 0x0002c903000f9959
Port 1:
State: Active
Physical state: LinkUp
Rate: 40
Base lid: 1
LMC: 0
SM lid: 1
Capability mask: 0x90580000
Port GUID: 0x0002c903000f9957
Port 2:
State: Down
Physical state: Polling
Rate: 70
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x90580000
Port GUID: 0x0002c903000f9958
Once you have identified the two port GUIDs, you can run the following commands from a PowerShell prompt running as Administrator:
SC.EXE delete OpenSM
New-Service –Name "OpenSM1" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9957 -L 128" -DisplayName "OpenSM1" –Description "OpenSM for the first IB subnet" -StartupType Automatic
New-Service –Name "OpenSM2" –BinaryPathName "`"C:\Program Files\Mellanox\MLNX_VPI\IB\Tools\opensm.exe`" --service -g 0x0002c903000f9958 -L 128" -DisplayName "OpenSM2" –Description "OpenSM for the second IB subnet" -StartupType Automatic
Start-Service OpenSM1
Start-Service OpenSM2
Note 1: This assumes that you installed the tools to the default location: C:\Program Files\Mellanox\MLNX_VPI
Note 2: For fault tolerance, make sure you have two computers on your network, both configured to run two instances of OpenSM. It is not recommended to run OpenSM on more than two computers connected to a switch.
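After starting both instances, you can confirm that both services are running and that each InfiniBand port now reports an active link and an assigned subnet manager. A minimal sketch, reusing the service names created above and the IBSTAT tool shown earlier:
# Confirm both OpenSM instances are running
Get-Service -Name OpenSM1, OpenSM2
# Each port should now show an Active state and a non-zero SM lid
ibstat | Select-String -Pattern "State", "SM lid"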
5) Configure IP Addresses
After you have the drivers in place, you should configure the IP address for your NIC. If you’re using DHCP, that should happen automatically, so just skip to the next step.
For those doing manual configuration, assign an IP address to your interface using either the GUI or something similar to the PowerShell below. This assumes that the interface is called RDMA1, that you’re assigning the IP address 192.168.1.10 to the interface and that your DNS server is at 192.168.1.1.
Set-NetIPInterface -InterfaceAlias RDMA1 -DHCP Disabled
Remove-NetIPAddress -InterfaceAlias RDMA1 -AddressFamily IPv4 -Confirm:$false
New-NetIPAddress -InterfaceAlias RDMA1 -AddressFamily IPv4 -IPv4Address 192.168.1.10 -PrefixLength 24 -Type Unicast
Set-DnsClientServerAddress -InterfaceAlias RDMA1 -ServerAddresses 192.168.1.1
Note: If your NICs are showing as "Disconnected", you're probably missing a subnet manager. See Step 4 above for details.
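To confirm the settings took effect, you can read them back. A minimal sketch, using the same RDMA1 alias and addresses from the example above:
# Show the IPv4 address and prefix length assigned to the RDMA1 interface
Get-NetIPAddress -InterfaceAlias RDMA1 -AddressFamily IPv4
# Show the DNS servers configured on the RDMA1 interface
Get-DnsClientServerAddress -InterfaceAlias RDMA1 -AddressFamily IPv4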
6) Verify everything is working
Follow the steps below to confirm everything is working as expected:
6.1: Verify network adapter configuration
Use the following PowerShell cmdlets to verify Network Direct is globally enabled and that you have NICs with the RDMA capability. Run on both the SMB server and the SMB client.
Get-NetOffloadGlobalSetting | Select NetworkDirect
Get-NetAdapter
Get-NetAdapterRDMA
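If NetworkDirect shows as disabled globally, or RDMA is turned off on a specific adapter, you can turn them back on. A minimal sketch, assuming your RDMA interface is named RDMA1 as in the earlier example:
# Enable Network Direct (RDMA) globally
Set-NetOffloadGlobalSetting -NetworkDirect Enabled
# Enable RDMA on a specific network adapter
Enable-NetAdapterRdma -Name RDMA1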
6.2: Verify SMB configuration
Use the following PowerShell cmdlets to make sure SMB Multichannel is enabled, confirm that the NICs are being properly recognized by SMB, and verify that their RDMA capability is being properly identified.
On the SMB client, run the following PowerShell cmdlets:
Get-SmbClientConfiguration | Select EnableMultichannel
Get-SmbClientNetworkInterface
On the SMB server, run the following PowerShell cmdlets:
Get-SmbServerConfiguration | Select EnableMultichannel
Get-SmbServerNetworkInterface
netstat.exe -xan | ? {$_ -match "445"}
Note: The NETSTAT command confirms if the File Server is listening on the RDMA interfaces.
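If you want to narrow the output down to just the RDMA-capable interfaces that SMB sees, a filter like the sketch below may help (it assumes the RdmaCapable property shown in the default output of these cmdlets):
# On the SMB client: list only the interfaces SMB considers RDMA-capable
Get-SmbClientNetworkInterface | Where-Object RdmaCapable -EQ $true
# On the SMB server: list only the interfaces SMB considers RDMA-capable
Get-SmbServerNetworkInterface | Where-Object RdmaCapable -EQ $true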
6.3: Verify the SMB connection
On the SMB client, start a long-running file copy to create a lasting session with the SMB Server. While the copy is ongoing, open a PowerShell window and run the following cmdlets to verify the connection is using the right SMB dialect and that SMB Direct is working:
Get-SmbConnection
Get-SmbMultichannelConnection
netstat.exe -xan | ? {$_ -match "445"}
Note: If you have no activity while you run the commands above, it’s possible you get an empty list. This is likely because your session has expired and there are no current connections.
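To focus on the interesting details while the copy is running, you can select just the dialect and the RDMA capability of the connections. A minimal sketch, assuming the property names shown in the default output of these cmdlets:
# Confirm the connection negotiated an SMB 3.0 dialect
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect
# Show only the multichannel connections where the client interface is RDMA-capable
Get-SmbMultichannelConnection | Where-Object ClientRdmaCapable -EQ $true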
6.4: Verify the SMB events that confirm an RDMA connection
On the SMB client, open a PowerShell window and run the following cmdlets to view the SMB events that confirm that you have an SMB Direct connection. If there are any RDMA-related connection errors, you will also see them:
Get-WinEvent -LogName Microsoft-Windows-SMBClient/Operational | ? Message -match "RDMA"
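Another way to check for RDMA activity, as a sketch, is through the SMB Direct performance counters. I’m assuming here that the counter set is named "SMB Direct Connection" on your build; if the name differs, Get-Counter -ListSet will tell you what is available:
# List the counters available in the SMB Direct counter set
Get-Counter -ListSet "SMB Direct Connection" | Select-Object -ExpandProperty Counter
# Take a sample of all SMB Direct counters for all current connections
Get-Counter -Counter "\SMB Direct Connection(*)\*"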
7) Conclusion
I hope this helps you with your testing of the Mellanox InfiniBand adapters. I wanted to cover all the different angles to make sure you don’t miss any relevant steps, and to include enough troubleshooting guidance to get you covered for the known issues. Let us know how your experience with the beta goes by posting a comment.
Comments
- Anonymous (January 01, 2003): The Interop conference is happening this week in Las Vegas (see www.interop.com/lasvegas ) and
- Anonymous (January 01, 2003): @Ash The InfiniHost III adapters are older than the ConnectX-1 adapters and they are not supported with Windows Server 2012. You need ConnectX-2 or ConnectX-3 adapters. I have updated the post to include this information more clearly.
- Anonymous (January 01, 2003): @TonyR Yes, you can. If used for Live Storage Migration with SMB traffic, it will use the RDMA capability of the NIC via SMB Direct. If using regular Live Migration, the traffic will use regular TCP/IP over InfiniBand (known as "IP over IB"), without RDMA.
- Anonymous (January 01, 2003): @TonyR RDMA transfers will use fewer CPU cycles and provide lower latency than IPoIB (which uses the TCP/IP stack). Link speed is technically the same for IPoIB and RDMA, but actual throughput will be a factor of the specific workload and configuration.
- Anonymous (April 19, 2012): Can you use IB for live migration networks?
- Anonymous (April 19, 2012): 1st, I forgot to say thanks for publishing this! 2nd, does IPoIB without RDMA run at slower speeds and higher CPU utilization? Thanks.
- Anonymous (April 20, 2012): Have you tried LM via IPoIB with QDR cards like the ConnectX-2/3? Would be curious to see the results of a two-node cluster with a direct connect between nodes for the LM network. Thanks.
- Anonymous (April 23, 2012): gr8 posting, waiting for it for some time...
- Anonymous (April 23, 2012): Are these cards supported: Mellanox MHEA28-XTC InfiniHost III? Or do we need to buy new ConnectX-2/3 cards?