Intermittent network interruptions with ISA Server Network Load Balancing
Last week, I encountered a strange issue with intermittent network interruptions.
My customer had deployed two ISA Servers as an NLB array, but network traffic to the NLB array was interrupted for a moment every few minutes.
For example, when I continuously pinged the NLB virtual address or an NLB node’s dedicated address, the output looked like this:
Pinging 10.10.9.2 with 32 bytes of data:
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
...
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
Reply from 10.10.9.12: bytes=32 time=1ms TTL=127
Reply from 10.10.9.12: bytes=32 time<149ms TTL=127
Reply from 10.10.9.12: bytes=32 time<241ms TTL=127
Reply from 10.10.9.12: bytes=32 time<213ms TTL=127
Reply from 10.10.9.12: bytes=32 time<234ms TTL=127
Request timed out.
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
Reply from 10.10.9.12: bytes=32 time=1ms TTL=127
...
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
Reply from 10.10.9.12: bytes=32 time<156ms TTL=127
Reply from 10.10.9.12: bytes=32 time<202ms TTL=127
Reply from 10.10.9.12: bytes=32 time<212ms TTL=127
Reply from 10.10.9.12: bytes=32 time<190ms TTL=127
Request timed out.
Reply from 10.10.9.12: bytes=32 time=1ms TTL=127
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
...
After analyzing captured packets, it appeared that traffic from the client to the NLB array was sometimes heavily delayed and occasionally timed out.
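To make the pattern in a long ping log easier to see, the output can be summarized with a short script. The following is only a minimal sketch: the function name `summarize_ping` and the 100 ms spike threshold are my own assumptions, and it only understands the English Windows `ping` line format shown above.

```python
import re

# Matches reply lines such as
# "Reply from 10.10.9.12: bytes=32 time<149ms TTL=127".
REPLY_RE = re.compile(r"Reply from (\S+): bytes=\d+ time[<=](\d+)ms TTL=\d+")


def summarize_ping(lines, spike_ms=100):
    """Count replies, slow replies (>= spike_ms) and timeouts in ping output.

    spike_ms is an arbitrary threshold chosen for illustration.
    """
    summary = {"replies": 0, "spikes": 0, "timeouts": 0}
    for line in lines:
        line = line.strip()
        if line == "Request timed out.":
            summary["timeouts"] += 1
            continue
        match = REPLY_RE.match(line)
        if match:
            summary["replies"] += 1
            if int(match.group(2)) >= spike_ms:
                summary["spikes"] += 1
    return summary


# A few lines taken from the ping output above.
sample = """\
Reply from 10.10.9.12: bytes=32 time<1ms TTL=127
Reply from 10.10.9.12: bytes=32 time<149ms TTL=127
Reply from 10.10.9.12: bytes=32 time<241ms TTL=127
Request timed out.
Reply from 10.10.9.12: bytes=32 time=1ms TTL=127
""".splitlines()

print(summarize_ping(sample))
```

Running the same function over a log captured with NLB enabled and another with NLB disabled gives a quick side-by-side comparison of spikes and timeouts.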
When I disabled NLB, everything worked normally.
This was strange. My troubleshooting steps included:
- Turned off TCP Chimney, RSS, TCPA and task offload in Windows Server 2003 through the EnableTCPChimney, EnableRSS, EnableTCPA and DisableTaskOffload registry values, but this didn’t help;
- The NIC model is “Broadcom BCM5708C NetXtreme II GigE”, and I updated its driver to the newest version, but it didn’t help;
- Disabled all offload functions in the NIC driver’s advanced settings, but it didn’t help;
- Reviewed the ISA Server configuration, and everything was properly configured;
- Asked the network device vendor to review the switch configuration, but it seemed to be okay.
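For reference, the registry change in the first step can be sketched as a set of `reg add` commands. This is an illustration under assumptions, not a record of what was run on the customer’s servers: the key path and the value data (0 for the three `Enable*` values, 1 for `DisableTaskOffload`) are what I understand to be the usual settings for turning these features off, and the helper name `reg_add_commands` is made up for this example. The script only prints the commands and changes nothing.

```python
# Assumed location of the TCP/IP parameters on Windows Server 2003.
TCPIP_PARAMS = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Assumed "feature off" data for each value named in the first step.
# DisableTaskOffload is inverted: 1 means task offload is turned off.
SNP_OFF = {
    "EnableTCPChimney": 0,
    "EnableRSS": 0,
    "EnableTCPA": 0,
    "DisableTaskOffload": 1,
}


def reg_add_commands(settings=SNP_OFF, key=TCPIP_PARAMS):
    """Build one `reg add` command line per registry value (nothing is executed)."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f'
        for name, data in settings.items()
    ]


for command in reg_add_commands():
    print(command)
```

Review the printed commands before running any of them on a server, and back up the key first.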
This was starting to drive me crazy, because I didn’t think I had missed anything.
Finally, I tried disabling all of the advanced features in the NIC driver configuration: besides all of the offload functions, I also disabled Flow Control, Ethernet@WireSpeed, and Interrupt Moderation.
After that, everything was okay!!!
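Because everything was disabled at once, this does not show which setting was responsible. One way to narrow it down later is to start from the known-good state and turn the features back on one at a time. The sketch below shows only the loop: `find_culprits`, `apply_settings` and `is_healthy` are hypothetical names, and in a real test they would have to be backed by the NIC driver’s configuration tool and a ping check. The demo uses simulated stand-ins, and its “culprit” is arbitrary.

```python
def find_culprits(features, apply_settings, is_healthy):
    """Re-enable one feature at a time; return those that bring the problem back."""
    culprits = []
    for feature in features:
        # Known-good baseline: everything disabled except the feature under test.
        apply_settings({name: (name == feature) for name in features})
        if not is_healthy():
            culprits.append(feature)
    # Leave the NIC in the known-good state (everything disabled).
    apply_settings({name: False for name in features})
    return culprits


FEATURES = ["Flow Control", "Ethernet@WireSpeed", "Interrupt Moderation"]

# Simulated stand-ins so the sketch runs anywhere.
state = {}


def fake_apply(settings):
    state.clear()
    state.update(settings)


def fake_is_healthy():
    # ARBITRARY simulation, NOT a finding from this case: pretend the
    # network misbehaves whenever the first feature in the list is enabled.
    return not state.get(FEATURES[0], False)


print(find_culprits(FEATURES, fake_apply, fake_is_healthy))
```

Since the interruptions only showed up every few minutes, each real health check would need to run long enough to cover several of those intervals.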
This is just a hint for you. Happy to help.
Meibo Zhang, Premier Field Engineer
Comments
- Anonymous
June 27, 2010
The comment has been removed
- Anonymous
June 27, 2010
Thanks a lot for pointing that out. I had no chance at the time to test which option was the root cause, and I will pay attention to it in the future.