Getting the most out of the redundancy native to AD when making applications "AD Aware"
Many customers ask how they can best configure applications to take full advantage of the fault tolerance built into Active Directory (AD). While there is no one right answer to this question, several strategies are in common use. Each has its own shortcomings, which are discussed below.
To set the context: whichever strategy is employed, the application developer (yes, we are talking about the other guy/gal, and not the AD guy) must handle the following scenarios in some fashion within their code:
- Server inaccessible - The server may not be online at all, or it may go down at some point after the connection was established.
- Concurrency - Since AD is loosely convergent, it may take anywhere from several seconds to several hours (depending on the replication interval) for data to replicate from one DC to another. If data must be read immediately after it is written, or must stay consistent between multiple applications for any reason, all sensitive operations should occur on one DC.
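As a rough illustration of both scenarios, here is a minimal Python sketch. It assumes the application supplies its own `connect` callable (a hypothetical stand-in for whatever LDAP library is in use) that raises `OSError` when a server cannot be reached. The `StickySession` and `NoServerAvailable` names are made up for this example.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


class NoServerAvailable(Exception):
    """Raised when every candidate server refused the connection."""


class StickySession:
    """Connects to the first reachable DC and keeps using it.

    `connect` is a hypothetical callable provided by the application. It
    takes a host name and returns a connection object, or raises OSError.
    """

    def __init__(self, servers: Iterable[str], connect: Callable[[str], T]):
        self._servers = list(servers)
        self._connect = connect
        self.host: Optional[str] = None
        self._conn: Optional[T] = None

    def connection(self) -> T:
        # Reuse the current DC so a read that follows a write hits the
        # same box and does not depend on replication.
        if self._conn is None:
            self._failover()
        assert self._conn is not None
        return self._conn

    def report_failure(self) -> None:
        # The application calls this when an operation fails mid-session.
        # The next connection() call picks another DC; data written to the
        # old DC may not have replicated to the new one yet.
        self._conn = None

    def _failover(self) -> None:
        failed = self.host
        # Try the other servers first and the one that just failed last.
        candidates = [s for s in self._servers if s != failed]
        if failed is not None:
            candidates.append(failed)
        for host in candidates:
            try:
                self._conn = self._connect(host)
            except OSError:
                continue
            self.host = host
            return
        raise NoServerAvailable(candidates)
```

The point of the design is that failover only happens when the current server is actually gone. As long as it stays up, every operation goes to the same DC.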
Pointing all LDAP-enabled applications to a DNS alias (e.g. "activedirectory.contoso.com"):
- Pros:
  - Easy for developers to grasp and use, and very low cost from an infrastructure perspective.
- Cons:
  - Breaks Kerberos. To authenticate to LDAP with Kerberos, the client looks up the Service Principal Name "LDAP/requestedserver.contoso.com". In this case, "LDAP/activedirectory.contoso.com" would be searched for and would not be found. Kerberos authentication thus fails and the application then tries NTLM. While NTLM will work, it is well known to be less secure than Kerberos and should be avoided unless absolutely necessary.
    - Enabling Kerberos by registering the ServicePrincipalNames "LDAP/activedirectory.contoso.com" and "LDAP/activedirectory" on all DCs is not the best way to fix this. The reasons why are a Kerberos discussion and are out of scope for this conversation.
  - Costly to set up and maintain from a labor perspective. Every time a DC is added to or removed from the environment, the alias must be updated. Also, if a DC is taken down for an extended period, its DNS record should be cleaned up.
  - Breaks concurrency, since there is no guarantee that any two applications that require consistent data will communicate with the same DC.
  - Not site aware. Depending on how the administrators configure the alias, LDAP searches may traverse a WAN link.
  - Does not distinguish between Global Catalog and non-Global Catalog Domain Controllers.
  - Unpredictable selection of DCs.
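To make the Kerberos point concrete, here is a toy Python model of the SPN lookup described above. It does not talk to a real KDC. The `REGISTERED_SPNS` table and the DC names in it are made up for this example, and the only behavior modeled is that the SPN is built from the name the client asked for.

```python
from typing import Mapping


def ldap_spn(requested_host: str) -> str:
    """Build the SPN from the host name the client was told to connect to."""
    return "ldap/" + requested_host.lower()


# Hypothetical SPNs: each DC has one for its own name, none for the alias.
REGISTERED_SPNS = {
    "ldap/dc1.contoso.com": "DC1$",
    "ldap/dc2.contoso.com": "DC2$",
}


def pick_auth_protocol(
    requested_host: str, registered: Mapping[str, str] = REGISTERED_SPNS
) -> str:
    """Return which protocol this toy model ends up using."""
    if ldap_spn(requested_host) in registered:
        return "Kerberos"
    # SPN not found: Kerberos fails and the client falls back to NTLM.
    return "NTLM"
```

In this model, `pick_auth_protocol("dc1.contoso.com")` returns "Kerberos", while `pick_auth_protocol("activedirectory.contoso.com")` returns "NTLM" because nothing is registered under the alias.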
Using the FQDN of the domain (e.g. contoso.com):
- Pros:
  - Easy for developers to grasp and use, and very low cost from an infrastructure perspective.
  - DNS 'A' records are registered and maintained automatically by the NETLOGON service on the Domain Controllers.
- Cons:
  - Not site aware. All DCs register here unless otherwise tuned (reference: https://support.microsoft.com/kb/258213).
  - Does not distinguish between Global Catalog and non-Global Catalog Domain Controllers.
Using the FQDN of the domain to locate Global Catalogs (e.g. gc._msdcs.contoso.com):
All the concerns that apply to using the FQDN of the domain apply here as well, except that this record returns only Global Catalogs.
Using site specific SRV records:
_ldap._tcp.SITENAME._sites.dc._msdcs.contoso.com
_ldap._tcp.SITENAME._sites.gc._msdcs.contoso.com
- Pros:
  - Ensures a DC or GC is located near the calling application.
  - DNS 'SRV' records are registered and maintained automatically by the NETLOGON service on the Domain Controllers.
- Cons:
  - Requires more code. Since the query returns SRV records, the host name in each record must be resolved separately, and each record returned must be attempted individually to accommodate a system that might not be online at any point in time.
  - Accuracy depends on the quality of the AD site design. However, a poor site design will affect clients above and beyond the current application.
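The extra code mentioned above might look something like the following Python sketch. It assumes the SRV answers have already been fetched by some resolver library (not shown, since the standard library has no SRV lookup) and handed over as `SrvRecord` tuples. The `SrvRecord`, `site_srv_name`, `order_srv` and `first_reachable` names are made up for this example. The ordering follows the usual SRV convention: lowest priority first, weighted random choice within a priority.

```python
import random
import socket
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def site_srv_name(domain: str, site: str, gc: bool = False) -> str:
    """Build the site-specific SRV name for DCs (or GCs when gc=True)."""
    role = "gc" if gc else "dc"
    return f"_ldap._tcp.{site}._sites.{role}._msdcs.{domain}"


def order_srv(records: Sequence[SrvRecord]) -> List[SrvRecord]:
    """Lowest priority value first; weighted random order within a priority."""
    ordered: List[SrvRecord] = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                pick = random.choice(group)
            else:
                pick = random.choices(group, weights=weights, k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered


def tcp_is_up(host: str, port: int, timeout: float = 3.0) -> bool:
    """Resolve the host name and try a TCP connection to it."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def first_reachable(
    records: Sequence[SrvRecord],
    is_up: Callable[[str, int], bool] = tcp_is_up,
) -> Optional[Tuple[str, int]]:
    """Try each record in turn and return the first (host, port) that answers."""
    for rec in order_srv(records):
        host = rec.target.rstrip(".")
        if is_up(host, rec.port):
            return host, rec.port
    return None
```

An application would query `site_srv_name("contoso.com", "SITENAME")`, pass the answers to `first_reachable`, and treat a `None` result as "no DC in this site is available right now".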
Using non-site specific SRV records:
_ldap._tcp.dc._msdcs.contoso.com
_ldap._tcp.gc._msdcs.contoso.com
- Pros:
  - DNS 'SRV' records are registered and maintained automatically by the NETLOGON service on the Domain Controllers.
- Cons:
  - Requires more code, for the same reasons given under "Using site specific SRV records".
  - Not site specific.
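One way to combine the two kinds of SRV record is to ask for the site-specific name first and only fall back to the domain-wide name when the site has nothing to offer. The Python sketch below assumes a hypothetical `resolve_srv` callable, supplied by the application, that returns a list of host names for an SRV name (an empty list when there are none). The `srv_query_names` and `locate` names are also made up for this example.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def srv_query_names(domain: str, site: Optional[str] = None, gc: bool = False) -> List[str]:
    """SRV names to try, most specific first."""
    role = "gc" if gc else "dc"
    names = []
    if site:
        names.append(f"_ldap._tcp.{site}._sites.{role}._msdcs.{domain}")
    names.append(f"_ldap._tcp.{role}._msdcs.{domain}")
    return names


def locate(
    domain: str,
    resolve_srv: Callable[[str], Sequence[str]],
    site: Optional[str] = None,
    gc: bool = False,
) -> Optional[Tuple[str, List[str]]]:
    """Return (name that answered, hosts), or None if nothing answered.

    `resolve_srv` is a hypothetical resolver hook; a real one would issue
    a DNS SRV query and return the target host names.
    """
    for name in srv_query_names(domain, site, gc):
        hosts = list(resolve_srv(name))
        if hosts:
            return name, hosts
    return None
```

Returning the name that answered lets the caller log whether it ended up on a local DC or on one picked from the whole domain, which may be across a WAN link.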
Using DsGetDomainControllerInfo:
- Pros:
  - Provides extensive detail about each Domain Controller.
- Cons:
  - Requires more code. Since this returns a list of servers, name resolution must be done separately and each server returned must be attempted individually to accommodate a system that might not be online at any point in time.
  - Site awareness is possible, since the site each DC is in is returned. However, it must be implemented in code.
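The site awareness that has to be implemented in code can be as simple as sorting the returned list. The Python sketch below does not call DsGetDomainControllerInfo itself. It assumes the results have already been copied into a hypothetical `DcInfo` record whose fields are loosely modeled on the host name, site name and GC flag that the call returns. The `prefer_local_site` name is made up for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class DcInfo:
    """Hypothetical stand-in for one entry returned by the API."""
    dns_host_name: str
    site_name: str
    is_gc: bool = False


def prefer_local_site(
    dcs: Sequence[DcInfo], local_site: str, need_gc: bool = False
) -> List[DcInfo]:
    """Order DCs so that those in the caller's site are tried first.

    DCs in other sites are kept at the end of the list as a fallback
    rather than dropped, so the application still works when the local
    site has no DC available.
    """
    wanted = [d for d in dcs if d.is_gc or not need_gc]
    site = local_site.lower()
    local = [d for d in wanted if d.site_name.lower() == site]
    remote = [d for d in wanted if d.site_name.lower() != site]
    return local + remote
```

The ordered list would then be walked one server at a time, exactly as with the SRV-based strategies, until one of them accepts a connection.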
Hard coding to a specific DC:
- Pros:
  - Predictable.
- Cons:
  - Requires specific knowledge of the AD environment.
  - The DC name should be a configuration option of the application. We all know how many problems we can run into when values are hard coded inside an application and have to be changed later.
  - A strategy is still needed to keep the application online when the server goes down.
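To illustrate the last two points, here is a minimal Python sketch that reads the DC from a configuration file instead of hard coding it, and keeps a list of fallback DCs alongside it. The `[directory]` section, the `preferred_dc` and `fallback_dcs` option names, and the `load_dc_list` function are all made up for this example.

```python
import configparser
from typing import List

# Example configuration text; in practice this would live in a file
# that an administrator can edit without touching the application.
SAMPLE_CONFIG = """
[directory]
preferred_dc = dc1.contoso.com
fallback_dcs = dc2.contoso.com, dc3.contoso.com
"""


def load_dc_list(config_text: str) -> List[str]:
    """Return the DCs to try, preferred DC first, then the fallbacks."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["directory"]
    preferred = section.get("preferred_dc", fallback="").strip()
    fallbacks = [
        host.strip()
        for host in section.get("fallback_dcs", fallback="").split(",")
        if host.strip()
    ]
    ordered = [preferred] if preferred else []
    # Skip the preferred DC if it was also listed as a fallback.
    ordered += [host for host in fallbacks if host != preferred]
    return ordered
```

The application stays predictable because it always starts with the preferred DC, and it only moves down the list when that server cannot be reached.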