Low throughput when copying files

Hi,

I have recently been helping a customer with a tricky issue: slow SMB file copies over their network.

It came about after they took the settings defined in Security Compliance Manager (SCM) for their member servers and deployed them as a Group Policy to their server OU. After doing this, they saw an 80% reduction in SMB file copy performance. But when we used NTttcp.exe to test the network throughput with a test data stream, the throughput was not affected. Only SMB was affected.

They had Windows Server 2008 R2 SP1 VMs on ESX with 1 virtual 10Gb NIC patched to a team of 2 physical 10Gb NICs. When 2 servers copied a set of large test files without the SCM security settings applied, they could reach around 400Mbps. When we applied the settings, that dropped to around 80Mbps.
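As a quick sanity check on those numbers, the drop from around 400Mbps to around 80Mbps lines up with the 80% reduction the customer reported. A small sketch of the arithmetic (the function name is just for illustration):

```python
def percent_reduction(before_mbps: float, after_mbps: float) -> float:
    """Return the drop from before to after as a percentage of before."""
    if before_mbps <= 0:
        raise ValueError("before_mbps must be positive")
    return (before_mbps - after_mbps) / before_mbps * 100.0


# Approximate figures from the file copy tests described above.
print(f"{percent_reduction(400, 80):.0f}% slower")  # prints "80% slower"
```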

In the SCM security definitions, there are 234 settings defined. We had to find out which one of these settings caused their issue.

image

We could see that the CPUs of the VM were going nuts, with a wild saw-tooth pattern across all CPUs. We tried adding more CPUs, and the saw-tooth pattern simply spread across them without making any major change in achievable throughput.

The process consuming the CPU time in Task Manager was ‘System’.

So, to break into ‘System’ a little more, we ran Windows Performance Recorder (WPR) to get a trace of CPU activity, like this:

image

And in the trace, we expanded out “CPU Usage (Sampled)”, and added the graph for “DPC and ISR by Module, Stack”:

DPC and ISR by Module, Stack

This showed us that all our CPU time was spent processing DPCs generated by a driver called cng.sys.

image

This is “Kernel Cryptography, Next Generation”, which provides the server’s or client’s ability to perform cryptographic operations in the kernel when doing things like sending or receiving encrypted information, or information which has been signed. Signing in this case could be creating a signature hash for chunks of transmitted data to prove that it hasn’t been modified while on the wire.
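To make the idea of signing concrete, here is a minimal Python sketch that signs and verifies a chunk of data with HMAC-SHA256, which is the algorithm SMB 2.x uses for its signatures (truncated to 16 bytes). This is an illustration only: the key, the chunk contents and the function names are made up, and the real protocol signs whole SMB2 packets with a key derived from the session, inside the kernel.

```python
import hashlib
import hmac

SIGNATURE_LEN = 16  # SMB2 carries a 16-byte signature field


def sign_chunk(key: bytes, chunk: bytes) -> bytes:
    """Return a truncated HMAC-SHA256 signature for one chunk of data."""
    return hmac.new(key, chunk, hashlib.sha256).digest()[:SIGNATURE_LEN]


def verify_chunk(key: bytes, chunk: bytes, signature: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign_chunk(key, chunk), signature)


# Hypothetical session key and payload, for illustration only.
key = b"example-session-key"
chunk = b"contents of one block of a file"
signature = sign_chunk(key, chunk)

print(verify_chunk(key, chunk, signature))         # True: data is intact
print(verify_chunk(key, chunk + b"!", signature))  # False: data was altered
```

Every byte sent has to pass through the hash on both the sender and the receiver, which is where the CPU time goes.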

This, combined with the fact that only SMB was affected, led us to think that SMB signing was our issue.
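The reasoning is that if every byte has to be hashed, the CPU time spent hashing puts a ceiling on throughput. A rough way to see that effect is to time how fast one core can sign data. The sketch below does this in user-mode Python with HMAC-SHA256 as a stand-in. The numbers it prints are not comparable to what cng.sys achieves in the kernel, and the key and buffer sizes are arbitrary assumptions.

```python
import hashlib
import hmac
import time


def signing_rate_mb_per_s(total_mb: int = 64, chunk_kb: int = 64) -> float:
    """Time HMAC-SHA256 over total_mb of data, signed in chunk_kb pieces."""
    key = b"example-session-key"    # hypothetical key
    chunk = bytes(chunk_kb * 1024)  # a buffer of zero bytes
    chunks = (total_mb * 1024) // chunk_kb

    start = time.perf_counter()
    for _ in range(chunks):
        hmac.new(key, chunk, hashlib.sha256).digest()
    elapsed = time.perf_counter() - start

    return total_mb / elapsed


rate = signing_rate_mb_per_s()
print(f"One core signed about {rate:.0f} MB/s")
```

Whatever figure comes out, a single stream cannot copy faster than the rate at which one side can sign and the other can verify.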

SMBv2 uses these 2 GPO settings to define SMB signing:

image

  1. Microsoft Network Server: Digitally sign communications (always)
  2. Microsoft Network Client: Digitally sign communications (always)

These settings relate to SMBv2. Note that the SCM baseline changes both of them from the default, in-box setting of “Disabled” to the Microsoft recommended setting of “Enabled”.
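With SMBv2 the outcome of these two settings is simple: traffic is signed if either the client or the server requires it. A small sketch of that rule (the function name is only for illustration):

```python
from itertools import product


def smb2_is_signed(client_requires: bool, server_requires: bool) -> bool:
    """SMBv2 signs the session when either side requires signing."""
    return client_requires or server_requires


# Print the full truth table for the two "always" settings.
for client, server in product([True, False], repeat=2):
    print(f"client always={client!s:5}  server always={server!s:5}  "
          f"signed={smb2_is_signed(client, server)}")
```

This is why enabling either "always" setting on one end of the copy is enough to turn signing on for that connection.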

For SMBv1 on Windows Server 2003 and older, the GPO settings are:

  1. Microsoft Network Server: Digitally sign communications (if client agrees)
  2. Microsoft Network Client: Digitally sign communications (if server agrees)
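These policies are stored as registry values under the LanmanServer and LanmanWorkstation service keys. The sketch below maps each policy to its value and, on Windows only, reads the current setting with the standard winreg module. The helper names are my own, and a value that has never been set may simply be missing, in which case the sketch returns None.

```python
import sys

SERVICES = r"SYSTEM\CurrentControlSet\Services"
SERVER_KEY = SERVICES + r"\LanmanServer\Parameters"
CLIENT_KEY = SERVICES + r"\LanmanWorkstation\Parameters"

# Policy name -> (registry key under HKLM, value name)
SIGNING_POLICIES = {
    "Microsoft Network Server: Digitally sign communications (always)":
        (SERVER_KEY, "RequireSecuritySignature"),
    "Microsoft Network Client: Digitally sign communications (always)":
        (CLIENT_KEY, "RequireSecuritySignature"),
    "Microsoft Network Server: Digitally sign communications (if client agrees)":
        (SERVER_KEY, "EnableSecuritySignature"),
    "Microsoft Network Client: Digitally sign communications (if server agrees)":
        (CLIENT_KEY, "EnableSecuritySignature"),
}


def read_signing_policy(policy: str):
    """Return the DWORD for a policy, or None if unset or not on Windows."""
    key_path, value_name = SIGNING_POLICIES[policy]
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, value_name)
            return value
    except FileNotFoundError:
        return None


for policy in SIGNING_POLICIES:
    print(f"{policy}: {read_signing_policy(policy)}")
```

A value of 1 means the setting is enabled and 0 means it is disabled.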

Once we removed the “always” settings, the transfer speed returned to the 400Mbps we expected.

We discussed the usefulness of this setting. In their network, it would be best to keep the “server” side setting enabled on DCs only. This ensures that the GPO files which clients download from the DCs during a Group Policy refresh have not been altered. These files are security sensitive but usually very small, so we don’t mind slightly slower transfer speeds for them.

 

Here are some additional resources we used when investigating SMB signing:

https://blogs.technet.com/b/josebda/archive/2010/12/01/the-basics-of-smb-signing-covering-both-smb1-and-smb2.aspx

https://msdn.microsoft.com/en-us/library/a64e55aa-1152-48e4-8206-edd96444e7f7#id218

https://blogs.msdn.com/b/openspecification/archive/2009/07/06/negtokeninit2.aspx?Redirected=true

https://blogs.msdn.com/b/openspecification/archive/2009/04/10/smb-maximum-transmit-buffer-size-and-performance-tuning.aspx

https://blogs.technet.com/b/filecab/archive/2012/05/03/smb-3-security-enhancements-in-windows-server-2012.aspx

https://support.microsoft.com/kb/320829

https://blogs.technet.com/b/neilcar/archive/2004/10/26/247903.aspx

https://gallery.technet.microsoft.com/NTttcp-Version-528-Now-f8b12769

Comments

  • Anonymous
    February 05, 2015
    Craig, thanks for your detailed post on this matter, we just found a similar issue in one of our client's enviroment. We had a new DC on a new Dell VRTX running ESXi 5.5 that was getting 44mb/s transfer (on a 1000mb connection). New VMs on the same host had no issue with transfer speeds and we eventually found the same setting (after pulling out hair for a few days).