Configuration of WATM (Windows Azure Traffic Manager) for Web Portals hosted on Azure VMs


Introduction

Microsoft Azure is a widely used platform for hosting Web portals, either on VMs (IaaS) or as PaaS services. For IaaS implementations, an efficient traffic manager for load balancing and/or failover is one of the keys to success. Microsoft Azure provides WATM (Windows Azure Traffic Manager) for this purpose, and this Wiki article describes how it can be used.

How does WATM work?

WATM manages traffic for Web portals by providing load balancing and failover capabilities.

The way WATM works can be divided into two parts:

  • How WATM redirects incoming requests to Web servers
  • How WATM monitors the health status of Web servers

How does WATM redirect incoming requests to Web servers?

WATM is a load balancer that can distribute traffic using one of three methods:

  • Performance: This method connects each client to the “closest” Web server, meaning the one with the lowest latency. The choice is based on a network performance table that records the round-trip time between various IP addresses and each Azure datacenter. This method improves the user experience by reducing latency when the Azure Web servers for the portal are hosted in different datacenters.
  • Round Robin: This method balances the load across all healthy Web servers. WATM directs each client to one of the healthy Web servers, chosen at random.
  • Failover: This method uses a failover priority list. The first healthy server in the list receives all the traffic; if it fails, the next healthy server in the list is used instead. The load is not balanced across the Web servers, because a primary/backup principle is used.
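
The three methods above can be sketched as simple selection functions. This is only an illustrative model, not WATM's actual implementation: the `Endpoint` class, its `latency_ms` field and the function names are hypothetical, and the random choice in `pick_round_robin` simply follows this article's description of the Round Robin method.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    healthy: bool
    latency_ms: float  # hypothetical round-trip time seen from the client's network


def pick_performance(endpoints):
    """Performance: the healthy endpoint with the lowest latency, or None."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.latency_ms) if healthy else None


def pick_round_robin(endpoints, rng=random):
    """Round Robin (as described in this article): a random healthy endpoint, or None."""
    healthy = [e for e in endpoints if e.healthy]
    return rng.choice(healthy) if healthy else None


def pick_failover(endpoints):
    """Failover: the first healthy endpoint in priority (list) order, or None."""
    return next((e for e in endpoints if e.healthy), None)
```

With a list ordered by priority, `pick_failover` keeps returning the same server until it becomes unhealthy, which is why the load is not balanced in that mode.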

When a new WATM profile is provisioned, the Microsoft Azure administrator specifies a DNS name that will be used to query it (example: contosolb01.trafficmanager.net). Clients then query this DNS record (directly or via a DNS alias) to access the Web portal.

Below is what happens when a client browses a Web portal via WATM (we assume that contosolb01.trafficmanager.net is the DNS name of our WATM profile):

  • The client asks for DNS resolution of contosolb01.trafficmanager.net

  • WATM receives the incoming request. It identifies the healthy Web server to be used and answers the client with the public IP address of this server

  • The client then queries the Web server directly, using the IP address it received

An important point to remember is that WATM does not act as a reverse proxy: it only gives the client the IP address of the Web server, and the client then queries that server directly. Once the DNS resolution is done, the client keeps the record cached locally for a specific period (the TTL). When the cached record has expired and the client needs to access the Web portal again, it queries WATM for a new DNS resolution. Note that even if the client can no longer reach the Web server, it waits until the cached DNS record expires before asking for a new resolution.
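
The caching behavior described above can be illustrated with a small client-side cache. This is a minimal sketch under stated assumptions: `CachingResolver` and its `lookup` callable (which returns an IP address and a TTL in seconds) are hypothetical names, and real operating systems and browsers implement their own DNS caches.

```python
import time


class CachingResolver:
    """Client-side cache that reuses a DNS answer until its TTL expires."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup  # callable: name -> (ip_address, ttl_seconds)
        self._clock = clock    # injectable clock, useful for testing
        self._cache = {}       # name -> (ip_address, expiry_time)

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            # Record still valid: the resolver behind WATM is not queried again,
            # even if the cached server has stopped responding.
            return cached[0]
        ip_address, ttl_seconds = self._lookup(name)
        self._cache[name] = (ip_address, now + ttl_seconds)
        return ip_address
```

A shorter TTL lets clients pick up a failover sooner, at the cost of more DNS queries.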

How does WATM monitor the health status of Web servers?

WATM checks the health status of each Web server every 30 seconds by sending it a GET request. If WATM does not receive a response within 10 seconds, it performs three more tries at 30-second intervals. If none of them is answered, the Web server is marked as degraded. It therefore takes around 1.5 minutes for WATM to detect that a Web server is no longer available.

Using this monitoring method, WATM can identify the health state of the servers and dynamically update their status.
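
The 1.5-minute figure follows from the probe timing: a first failed probe followed by three failed retries, 30 seconds apart, spans 3 × 30 = 90 seconds. The sketch below models this with the values given in the text. The function name and the list-of-booleans input are hypothetical, and the 10-second timeout of each probe is deliberately ignored to keep the model simple.

```python
PROBE_INTERVAL_S = 30        # from the text: one probe every 30 seconds
RETRIES_BEFORE_DEGRADED = 3  # from the text: three more tries after a failure


def first_degraded_time(probe_results,
                        interval_s=PROBE_INTERVAL_S,
                        retries=RETRIES_BEFORE_DEGRADED):
    """Return the time (in seconds) at which a server is marked degraded, or None.

    probe_results[i] is True when the probe sent at i * interval_s succeeded.
    A server is degraded after one failed probe plus `retries` failed retries
    in a row.
    """
    consecutive_failures = 0
    for i, succeeded in enumerate(probe_results):
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures == retries + 1:
            return i * interval_s
    return None
```

For a server that fails from the very first probe, `first_degraded_time([False, False, False, False])` evaluates to 90 seconds, which matches the figure above.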

Configuration of WATM

To configure WATM, proceed as follows:

  • Go to the Azure management Website, click on TRAFFIC MANAGER and then click on CREATE A TRAFFIC MANAGER PROFILE

  • Specify the DNS name of the WATM profile, select the load balancing method, then continue

  • Click on the traffic manager profile

  • Click on ENDPOINTS

  • Click on ADD ENDPOINTS

  • Select Cloud Service as the service type, select the Web servers, then continue

  • Click on Configure and then configure the following:
    • DNS TIME TO LIVE (TTL): This is the TTL of the DNS record that WATM sets when it responds to clients. Clients keep the DNS record cached for the TTL period and ask for a new DNS resolution once it has expired and they need to reach the Web portal
    • LOAD BALANCING METHOD: This is the load balancing method that WATM uses. The method you selected previously is already set, but you can change it at any time
    • PROTOCOL: This is the protocol used to monitor the Web portal on the servers. You can choose HTTP or HTTPS
    • PORT: This is the port to use when querying and monitoring the Web servers
    • RELATIVE PATH AND FILE NAME: This is the relative path and file name used to monitor the Web portal on the servers (example: /iisstart.htm)

Remark: The Web servers must be listening and responding on the monitoring port configured above; otherwise they will be reported as degraded.
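
To make the relationship between these settings concrete, the sketch below combines the protocol, port and relative path into the URL that a monitoring probe would request on a given server. The `MonitorSettings` class and its default values are hypothetical and only serve as an illustration; they are not part of any Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorSettings:
    protocol: str = "HTTP"                # HTTP or HTTPS
    port: int = 80                        # illustrative default
    relative_path: str = "/iisstart.htm"  # example path from this article

    def probe_url(self, host):
        """Build the URL a health probe would request on `host`."""
        if self.protocol.upper() not in ("HTTP", "HTTPS"):
            raise ValueError("protocol must be HTTP or HTTPS")
        if not self.relative_path.startswith("/"):
            raise ValueError("relative path must start with '/'")
        return f"{self.protocol.lower()}://{host}:{self.port}{self.relative_path}"
```

If the Web server does not answer on the resulting URL, for instance because it listens on a different port, it will be seen as degraded.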

WATM Load Balancing behavior testing

We tested the behavior of WATM when this Wiki article was written.
The test environment was the following:

  • Two Windows Server 2012 R2 IIS servers hosted in the Microsoft Azure North Europe datacenter
  • One Windows Server 2012 R2 IIS server hosted in the Microsoft Azure East Asia datacenter

Below are the details:

Server                       Location        Public IP address
Contosovm001.cloudapp.net    North Europe    23.100.62.245
Contosovm002.cloudapp.net    North Europe    23.100.62.246
Contosovm003.cloudapp.net    East Asia       23.100.92.6

We tested the different load balancing methods, with the following results:

  • Performance method:

We configured Performance as the load balancing method and used the https://www.whatsmydns.net Website to check the DNS resolution from multiple locations around the globe.
Below is a screen capture of the results we received:

As you can see, different IP addresses were returned depending on the location. This is because this load balancing method provides the “closest” server to the client.

  • Round Robin method:

We configured Round Robin as the load balancing method and used the https://www.whatsmydns.net Website to check the DNS resolution from multiple locations around the globe.
Below is a screen capture of the results we received:

As you can see, the addresses were returned randomly by WATM.

  • Failover method:

We configured Failover as the load balancing method and used the https://www.whatsmydns.net Website to check the DNS resolution from multiple locations around the globe.
Below is a screen capture of the results we received:

As you can see, WATM returned a single IP address for all the locations.

We also observed that, whichever load balancing method is in use, making one of the active Web servers unavailable causes WATM to mark it as degraded within around 1.5 minutes.
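
A similar check can be run from a single machine by resolving the WATM DNS name several times and counting the answers. The sketch below is only an approximation of the tests above: the `tally_answers` helper is a hypothetical name, it sees the result from one location only, and local DNS caching (see the TTL discussion earlier) may cause the same address to be returned on every attempt.

```python
import socket
from collections import Counter


def tally_answers(name, attempts=10, resolve=socket.gethostbyname):
    """Resolve `name` several times and count each returned IPv4 address.

    `resolve` is injectable so the function can be exercised without a network.
    """
    return Counter(resolve(name) for _ in range(attempts))
```

For example, `tally_answers("contosolb01.trafficmanager.net")` would be expected to show a single address with the Failover method, and possibly several addresses with Round Robin once cached records expire.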

WATM and sticky sessions

WATM does not manage sticky sessions, which can be problematic for Web applications that require this feature. As a workaround, the Failover method can be used so that all clients go to the same server. However, this may not be the best solution for Web portals, because the load is then not balanced across the available servers.

WATM status types

The following are the status types for WATM:

Status type    Description
Online         WATM is enabled and all endpoints are responding as expected
Degraded       One or more endpoints configured on WATM are not responding as expected
Inactive       WATM is enabled but with no configured endpoints
Disabled       WATM is disabled
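
The status types above can be read as a simple decision rule. The sketch below is one possible way to express it, based only on the descriptions given in this article; the `profile_status` function and its parameters are hypothetical.

```python
def profile_status(enabled, endpoint_health):
    """Derive the WATM status from the descriptions in this article.

    enabled: True when WATM is enabled.
    endpoint_health: list of booleans, True when an endpoint responds as expected.
    """
    if not enabled:
        return "Disabled"
    if not endpoint_health:
        return "Inactive"  # enabled, but no endpoints configured
    if all(endpoint_health):
        return "Online"
    return "Degraded"      # at least one endpoint is not responding as expected
```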

Troubleshooting

To find out why Web servers are in a degraded state, you can download and use the wget tool (http://gnuwin32.sourceforge.net/packages/wget.htm). It allows you to send GET requests and see the responses returned by the Web portals.
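
If wget is not available, the same kind of check can be scripted. The following is a minimal sketch that uses only the Python standard library to send a single GET request and report the HTTP status; the `http_get_status` name is hypothetical, and the 10-second timeout mirrors the probe timeout mentioned earlier in this article.

```python
import http.client
from urllib.parse import urlsplit


def http_get_status(url, timeout=10.0):
    """Send one GET request to `url` and return (status_code, reason)."""
    parts = urlsplit(url)
    connection_class = (http.client.HTTPSConnection if parts.scheme == "https"
                        else http.client.HTTPConnection)
    # When the URL has no explicit port, the default port of the scheme is used.
    connection = connection_class(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("GET", path)
        response = connection.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status, response.reason
    finally:
        connection.close()
```

For example, `http_get_status("http://Contosovm001.cloudapp.net/iisstart.htm")` requests the monitoring path on one of the test servers. A timeout, a connection error or an unexpected status code points to the reason the server is reported as degraded.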

Conclusion

WATM (Windows Azure Traffic Manager) provides advanced features such as directing end users to the closest Azure datacenter, which reduces latency when browsing Web portals. This Wiki article described the available configuration options and the load balancing methods that can be used. It also shared the results of a test scenario, giving a clear view of the behavior to expect from each load balancing method.