How to analyze a trace taken using NETSH TRACE

I wrote the article "Capture a NETSH network trace" here, where I discussed how to capture a NETSH trace. In this one I discuss how I analyzed it.

I wrote another one here that explains how to convert the ETL into a CAP file so it can be analyzed in Wireshark or Network Monitor: "Analyze NETSH traces with Wireshark or Network Monitor, convert ETL to CAP"

Here is also a good article on analyzing the trace: Introduction to Network Trace Analysis Using Microsoft Message Analyzer.

Here are the code snippets I am tracing. I know they are invalid and will not work, which makes the failures easier to spot; the symptoms of a real failure would be similar in this context.

 
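// Requires: using System; using System.Net.Sockets;
// Attempts a TCP connection to an address that is expected to be unreachable.
using (TcpClient client = new TcpClient())
{
  try
  {
    client.Connect("123.123.123.112", 80);
    client.Close();
  }
  catch (Exception ex)
  {
    // Log ex.Message and ex.StackTrace here.
  }
}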

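// Requires: using System; using System.Net.Http; must run inside an async method.
// Requests the URL under test; any failure surfaces as an exception.
using (var client = new HttpClient())
{
  try
  {
    var result = await client.GetStringAsync("https://www.contoso.com");
  }
  catch (Exception ex)
  {
    // Log ex.Message and ex.StackTrace here.
  }
}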

The first failure I expect to see in my NETSH trace happens when I try to make a TCP connection to an invalid or unavailable IP address.  When I look in Wireshark I see Figure 1.

Figure 1, Wireshark, netsh trace, TCP

My code actually loops 5 times, so I see 5 TCP connection attempts start, shown as the GREEN lines in Figure 1.  After each GREEN line I see 2 attempted retransmissions; then the attempt fails, and we can conclude that the resource at the provided IP is not available or accessible.  I can see a similar pattern in Message Analyzer, Figure 2.
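The snippet shown earlier does not include the loop of 5 attempts, so here is a minimal sketch of what such a loop might look like. The class and method names (ConnectLoop, TryConnect, CountFailures) are hypothetical and not taken from my original code; the attempt is passed in as a delegate only so the counting logic can be exercised without a network. Each failed iteration should show up in the trace as one connection start followed by its retransmissions.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper, for illustration only.
public static class ConnectLoop
{
    // Makes one connection attempt. Returns null on success,
    // or the socket error code when the connect fails.
    public static SocketError? TryConnect(string host, int port)
    {
        using (var client = new TcpClient())
        {
            try
            {
                // Blocks until the connection succeeds or the OS gives up.
                client.Connect(host, port);
                return null;
            }
            catch (SocketException ex)
            {
                return ex.SocketErrorCode;
            }
        }
    }

    // Runs the attempt the given number of times and returns how many failed.
    public static int CountFailures(int attempts, Func<SocketError?> attempt)
    {
        int failures = 0;
        for (int i = 0; i < attempts; i++)
        {
            SocketError? error = attempt();
            if (error != null)
            {
                failures++;
                Console.WriteLine($"Attempt {i + 1} failed: {error}");
            }
        }
        return failures;
    }
}
```

Calling ConnectLoop.CountFailures(5, () => ConnectLoop.TryConnect("123.123.123.112", 80)) would reproduce the 5 attempts. Against an unreachable address each attempt blocks for a while before it fails, so the whole loop takes some time to finish.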

Figure 2, Message Analyzer, netsh trace, TCP

Next, for the HttpClient calls, I see Figure 3 in Wireshark.

Figure 3, Wireshark, netsh trace, HTTP/DNS

The reason is right there in the Info column: the DNS lookup resulted in ‘No such name’.  In Message Analyzer, I had to compare a good DNS lookup with the bad one and look at the differences.  What I found, and I was happy to learn something on this one, is shown in Figure 4: the DNS response carries a response code called RCode, and the value 3 is labeled NXDomain.  You can read about RFC 1035 here and search for RCODE, and you will find out that RCode=3, NXDomain(3), means:  “Name Error - Meaningful only for responses from an authoritative name server, this code signifies that the domain name referenced in the query does not exist.”  That makes 100% sense.
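To make the RCode value a bit more concrete, here is a minimal sketch of how it can be read from the raw bytes of a DNS message. It relies on the header layout in RFC 1035 section 4.1.1, where RCODE is the low 4 bits of the fourth header byte. The class name DnsRcode and its methods are hypothetical, and this is not how Wireshark or Message Analyzer implement it; it only illustrates where the 3 comes from.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class DnsRcode
{
    // Extracts RCODE from a raw DNS message. The header is 12 bytes long:
    // bytes 0-1 are the ID, bytes 2-3 are the flags, and RCODE is the
    // low 4 bits of byte 3 (RFC 1035 section 4.1.1).
    public static int FromMessage(byte[] message)
    {
        if (message == null || message.Length < 12)
            throw new ArgumentException("A DNS header is 12 bytes long.", nameof(message));
        return message[3] & 0x0F;
    }

    // Maps the RCODE values defined in RFC 1035 to short descriptions.
    public static string Describe(int rcode)
    {
        switch (rcode)
        {
            case 0: return "No error";
            case 1: return "Format error";
            case 2: return "Server failure";
            case 3: return "Name Error (NXDomain)";
            case 4: return "Not Implemented";
            case 5: return "Refused";
            default: return "Other (" + rcode + ")";
        }
    }
}
```

As a made-up example, a response header starting with the bytes 0x12 0x34 0x81 0x83 has flags 0x8183, so DnsRcode.FromMessage returns 3 and DnsRcode.Describe(3) gives "Name Error (NXDomain)", which matches the ‘No such name’ text in the Wireshark Info column.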

Figure 4, Message Analyzer, netsh trace, HTTP/DNS

It looks like, just as when debugging code with WinDbg, the analysis requires some interpretation and the root cause does not always jump right out.  However, the traces can at least tell you whether or not there is an issue with the network communications.  Both of the above examples are network failures, but they are caused by my code calling invalid URLs and IPs.  Perhaps in the future, now that I have added this skill to my repertoire, I will use it more, be able to match more patterns, and share more on this topic.