Additional troubleshooting step when encountering PIC issues with AOL
Summary:
America Online ("AOL") uses Global Server Load Balancing ("GSLB") to maintain their SIP gateway farm. Their configuration uses sub VIPs, one for each of their two datacenters. Whenever AOL performs maintenance (new code, hardware, etc.), they use GSLB to take one datacenter offline while they update it and point all of the SIP traffic to the secondary datacenter.
As with all GSLB systems, DNS caching on the client side can be an issue. For example, this past week (July 16th), AOL replaced some hardware at one of their sites. When doing so, they directed all traffic to their other site around 12am EDT. It takes a maximum of 30 seconds for these changes to be reflected in DNS.
Several of our Office Communications Server customers had issues connecting via PIC to AOL (through AOL's SIP gateways) as late as 6:30am that morning, but the issue resolved itself within about an hour. This sounds very much like DNS caching at work: the users began using the system at 6:30am and continued to have issues until their cached DNS entries were refreshed.
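To make the caching effect concrete, here is a toy model of a caching resolver. It is only a sketch: the function cached_lookup, the one-hour cache lifetime, and the two addresses (taken from the reserved documentation ranges) are all hypothetical and are not AOL's real values.

```python
def cached_lookup(cache, name, now, ttl, authoritative):
    """Return the address a client behind this cache sees at time `now` (seconds)."""
    entry = cache.get(name)
    if entry is not None:
        address, cached_at = entry
        if now - cached_at < ttl:
            # Entry has not expired yet, so the old answer is served
            # even if the authoritative record has changed since.
            return address
    # Cache miss or expired entry: ask the authoritative source and re-cache.
    address = authoritative[name]
    cache[name] = (address, now)
    return address


# Hypothetical addresses from the documentation ranges, not AOL's real ones.
OLD_SITE = "192.0.2.10"
NEW_SITE = "198.51.100.10"
NAME = "sip.oscar.aol.com"
TTL = 3600  # hypothetical cache lifetime, in seconds

cache = {}
authoritative = {NAME: OLD_SITE}

# The client resolves the name before the maintenance window.
first = cached_lookup(cache, NAME, now=0, ttl=TTL, authoritative=authoritative)

# GSLB redirects all traffic to the other datacenter.
authoritative[NAME] = NEW_SITE

# Half an hour later the cache still hands out the offline site.
stale = cached_lookup(cache, NAME, now=1800, ttl=TTL, authoritative=authoritative)

# Once the entry expires, the client finally sees the new site.
fresh = cached_lookup(cache, NAME, now=3600, ttl=TTL, authoritative=authoritative)
```

In this model the client keeps trying the offline site until the cached entry expires, which matches the pattern of an outage that clears up on its own.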
Action Item:
While it is unclear where the caching took place (local host, ISP DNS, local network, or within the OCS topology itself), a good rule of thumb is this: whenever an OCS customer is unable to reach AOL's SIP gateways via PIC, the first troubleshooting step should be to flush DNS. It is also best to use the sip.oscar.aol.com FQDN rather than a specific IP address when connecting to AOL, so that GSLB redirects are picked up through DNS.
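As a rough illustration of that first step, the sketch below flushes the local resolver cache on a Windows host and then re-resolves the FQDN. The helper names (flush_command, resolve_ipv4, flush_and_resolve) are hypothetical, port 5061 is an assumption, and the flush only clears the cache on the machine where it runs; caches at an ISP or on an internal DNS server have to be cleared separately.

```python
import platform
import socket
import subprocess

AOL_SIP_FQDN = "sip.oscar.aol.com"  # FQDN recommended in the post


def flush_command(system=None):
    """Return the command that flushes the local DNS resolver cache."""
    system = system or platform.system()
    if system == "Windows":
        return ["ipconfig", "/flushdns"]
    # Other platforms are deliberately left out of this sketch.
    raise ValueError(f"no flush command configured for {system!r}")


def resolve_ipv4(hostname, port=5061, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses for hostname.

    Port 5061 is an assumption; it does not affect the DNS lookup itself.
    """
    infos = resolver(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})


def flush_and_resolve(hostname=AOL_SIP_FQDN, run=subprocess.run,
                      resolver=socket.getaddrinfo, system=None):
    """Flush the local DNS cache, then look the hostname up again."""
    run(flush_command(system), check=True)
    return resolve_ipv4(hostname, resolver=resolver)
```

Calling flush_and_resolve() on an affected Windows host returns the addresses the FQDN resolves to after the flush. If they still point at the site that was taken offline, the stale entry most likely lives further upstream than the local machine.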
Comments
Anonymous
July 31, 2009
Is MSN performing maintenance as well? Our MSN PIC contacts went down yesterday and still do not work. AOL and Yahoo are fine.