Checking your DNS performance isn’t delaying your O365 connections
One of the first things to check is name resolution, a point which is often forgotten when doing performance tests. If name resolution is slow, it will show up as a slow initial page load in SharePoint. It's less visible with Outlook, but small delays in DNS, proxy authentication and so on can add up to a poorly performing O365 infrastructure.
Checking this is easy, and again involves a quick network capture on a test client.
It is always advisable to flush the DNS cache by running ipconfig /flushdns, so that the client has to query the DNS server again and the lookups appear in the trace. The steps are:
- Install Netmon or Wireshark on a test client
- Start tracing
- Run ipconfig /flushdns to clear the DNS cache
- Start Outlook or connect to your SharePoint site
- Once connected, stop the trace
- Use the filter 'DNS' to show all DNS traffic in the capture tool
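As a quick cross-check alongside the trace, the lookup can also be timed from the test client itself. The following is only a minimal sketch, not part of the capture method above: it assumes Python is available on the client, the helper name time_lookup is my own, and it times the operating system's resolver (via socket.getaddrinfo), so a cached answer will look artificially fast unless ipconfig /flushdns has been run first.

```python
import socket
import time


def time_lookup(hostname, resolve=socket.getaddrinfo):
    """Time one name lookup through the OS resolver.

    Returns (elapsed_ms, error). `resolve` is injectable so the helper
    can be exercised without touching the network (hypothetical design
    choice for this sketch).
    """
    start = time.perf_counter()
    try:
        # Port 443 is only a placeholder; getaddrinfo needs some service value.
        resolve(hostname, 443)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, error
```

After flushing the cache, calling time_lookup("login.microsoftonline.com") returns the elapsed time in milliseconds and any resolver error. The number includes local resolver overhead, so treat it as a rough indicator and rely on the trace for the real picture.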
Netmon handily gives each DNS call and response (and any other protocol for that matter) a unique conversation ID number which we can use to filter if we wish. In this example it's DNS conversation ID 124, so I'd write conversation.DNS.ID==124 in the Netmon filter to see just this DNS call and its response. Alternatively, you can right-click a frame of interest in a saved trace and click Find Conversations > DNS. Or we could filter on the DNS query ID, which in this case would be 'DNS.QueryIdentifier == 0x5b9f'.
In the following example we can see the Contoso DNS servers taking up to 3.7 seconds to respond to a DNS call. This would undoubtedly manifest itself as slowness in an initial connection to Office 365.
13:52:52 16/04/2013 31.2765664 0.0000000 10.200.30.40 10.214.2.129 DNS:QueryId = 0xE41, QUERY (Standard query), Query for Contosoemeamicrosoftonlinecom-3.sharepoint.emea.microsoftonline.com of type A on class Internet
13:52:56 16/04/2013 35.0579179 3.7813515 10.214.2.129 10.200.30.40 DNS:QueryId = 0xE41, QUERY (Standard query), Response - Success, 10.123.123.124 ...
Another DNS server can be seen here responding slowly:
13:52:54 16/04/2013 33.3042446 0.0000000 10.200.30.40 uk1.headoffdom.uk.Contoso.com DNS:QueryId = 0xE41, QUERY (Standard query), Query for Contosoemeamicrosoftonlinecom-3.sharepoint.emea.microsoftonline.com of type A on class Internet
13:52:56 16/04/2013 35.0583415 1.7540969 uk1.headoffdom.uk.Contoso.com 10.200.30.40 DNS:QueryId = 0xE41, QUERY (Standard query), Response - Success, 10.123.123.124
However, some other queries can be seen answered by the DNS server in a much faster manner:
13:52:57 16/04/2013 35.6045648 0.0000000 10.200.30.40 10.214.2.129 DNS:QueryId = 0xE77C, QUERY (Standard query), Query for login.microsoftonline.com of type A on class Internet
13:52:57 16/04/2013 35.6049028 0.0003380 10.214.2.129 10.200.30.40 DNS:QueryId = 0xE77C, QUERY (Standard query), Response - Success, 49, 0
Under optimal conditions I would expect a response to a DNS call in less than 100 ms, ideally much less. Any delay in this phase would manifest itself as poor initial performance when loading a page. In theory (presuming we don't need to resolve any further addresses) the connectivity should be quicker once the initial page is loaded.
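If there are many DNS frames to go through, pairing each query with its response by hand gets tedious. Here is a rough sketch of how the frame summaries shown above could be paired up and checked against the 100 ms guideline. It assumes the trace has been copied out as text in the same column layout as the examples (time, date, time offset in seconds, time delta, source, destination, description); that layout, the regular expression and the function name dns_latencies are my assumptions, not anything Netmon provides.

```python
import re

SLOW_MS = 100.0  # guideline from the text: a DNS response should take < 100 ms

# Assumed column layout: time, date, offset (s), delta, source, destination, description
LINE = re.compile(
    r"^\S+\s+\S+\s+(?P<offset>\d+\.\d+)\s+\S+\s+(?P<src>\S+)\s+(?P<dst>\S+)\s+"
    r"DNS\s*:QueryId = (?P<qid>0x[0-9A-Fa-f]+), QUERY \(Standard query\), (?P<rest>.*)$"
)


def dns_latencies(lines):
    """Pair each DNS query with its response; return (name, server, latency_ms)."""
    pending = {}  # (query id, client, server) -> (offset of query, queried name)
    results = []
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # not a DNS frame summary in the assumed format
        offset = float(m["offset"])
        qid = m["qid"].lower()
        if m["rest"].startswith("Query for"):
            name = m["rest"].split()[2]
            pending[(qid, m["src"], m["dst"])] = (offset, name)
        elif m["rest"].startswith("Response"):
            # A response travels server -> client, so swap source and destination.
            hit = pending.pop((qid, m["dst"], m["src"]), None)
            if hit is not None:
                sent, name = hit
                results.append((name, m["src"], (offset - sent) * 1000.0))
    return results


SAMPLE = [
    "13:52:52 16/04/2013 31.2765664 0.0000000 10.200.30.40 10.214.2.129 DNS:QueryId = 0xE41, QUERY (Standard query), Query for Contosoemeamicrosoftonlinecom-3.sharepoint.emea.microsoftonline.com of type A on class Internet",
    "13:52:56 16/04/2013 35.0579179 3.7813515 10.214.2.129 10.200.30.40 DNS:QueryId = 0xE41, QUERY (Standard query), Response - Success, 10.123.123.124 ...",
    "13:52:57 16/04/2013 35.6045648 0.0000000 10.200.30.40 10.214.2.129 DNS:QueryId = 0xE77C, QUERY (Standard query), Query for login.microsoftonline.com of type A on class Internet",
    "13:52:57 16/04/2013 35.6049028 0.0003380 10.214.2.129 10.200.30.40 DNS:QueryId = 0xE77C, QUERY (Standard query), Response - Success, 49, 0",
]

for name, server, ms in dns_latencies(SAMPLE):
    flag = "SLOW" if ms > SLOW_MS else "ok"
    print(f"{flag:4} {ms:10.3f} ms  {server}  {name}")
```

Run against the sample frames, this flags the SharePoint lookup as slow and the login.microsoftonline.com lookup as fine, which matches what we read from the trace by eye.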
If you see a slow response like the one above, it's worth first checking what the psping times are to the DNS server on TCP port 53 (most calls will be over UDP, but the server should be listening on TCP 53 too). The method to do this is outlined here. If the psping time is similar to that seen in the DNS response, then it's possibly a network delay between you and the server. If it's consistently much quicker, it's more likely an application (DNS) level issue which you should investigate on the server and on any forwarders in use.
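The comparison described above can be sketched in a few lines. This is not a replacement for psping, just a rough stand-in that times a TCP connect to port 53 and applies the same reasoning. The function names and the 50% "similar" threshold are arbitrary assumptions of mine, so adjust them to taste.

```python
import socket
import time


def tcp_connect_ms(host, port=53, timeout=3.0):
    """Time a single TCP connect to the DNS server, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close it straight away
    return (time.perf_counter() - start) * 1000.0


def likely_cause(dns_ms, rtt_ms, similar_ratio=0.5):
    """Rough triage of a slow DNS response.

    If the network round trip accounts for at least `similar_ratio` of the
    DNS response time (an assumed threshold), suspect the network path;
    otherwise suspect the DNS service or its forwarders.
    """
    if rtt_ms >= dns_ms * similar_ratio:
        return "network"
    return "dns-server"
```

For example, taking the 3.7 second response seen earlier, likely_cause(3781.0, tcp_connect_ms("10.214.2.129")) would point at the DNS service if the connect completes in a few milliseconds. Take several measurements before drawing a conclusion, since a single connect time can be misleading.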