How DNS Scavenging and the DHCP Lease Duration Relate

Hello everyone, Sean Ivey here from the US PFE – Carolinas team. I’m what we refer to as a platforms-AD PFE. Basically I focus on Active Directory and related networking technologies. Recently, and on three separate occasions, I worked with SCCM administrators having issues deploying the SCCM client. Specifically, they were seeing the error “Failed to get token for current process (5)” in ccm.log. We discovered the problem was related to DNS and DHCP rather than SCCM. As a matter of fact, other services were suffering from the same issue but either didn’t experience symptoms or showed slightly different symptoms. Let’s talk about the problem and discuss how it can be PREVENTED!

The Scenario

Consider the following simplified scenario.

DHCP

  • A DHCP scope has its lease duration set to the default 8 days.
  • The DHCP scope is low on available IP addresses.
  • Client-A has NOT renewed its IP address lease in 8 days, so it has expired.
  • Client-B is requesting a new IP address.
  • The DHCP server assigns Client-B the address that was leased to Client-A.

 So far so good. This is a very typical scenario and everything works as we would expect. Now let’s add DNS into this story.

DNS

  • An Active Directory integrated DNS zone is set to scavenge stale resource records.
  • The defaults are used: “No-refresh = 7 days” and “Refresh = 7 days”.
    • The server defaults are used as well: “Scavenging Period = 7 days”.
  • Client-A renewed its DNS record 8 days ago (the last time its DHCP lease was updated as well).
  • Client-A is the owner of its DNS record so it cannot be deleted by the DHCP server (by default).
  • Client-B registers its DNS record with the new IP address it received from the DHCP server.
    • This IP address is the same one that is registered to Client-A!
  • The DNS server will not be able to scavenge Client-A’s DNS record for another 6 days!

(NOTE: if you’re unsure what all of this “scavenging”, “refresh/no refresh” stuff is check out Josh Jones’ blog, it’s great!)

Uh-oh, not so good. This happens more than you’d think. Now Client-A and Client-B have the same IP address registered in DNS!

 

 

Figure 1

Ugh, now we’ve got two different names associated with the same IP address in DNS. And it will likely stay this way for at least 6 days using the defaults for DNS scavenging outlined above. What kind of problems can we expect to see? Let’s take a look.
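
To make the "at least 6 days" figure concrete, here is a minimal sketch of the arithmetic in PowerShell. The function name Get-StaleRecordWindow and its parameters are hypothetical and not part of any module. It assumes the record's timestamp was last refreshed when the lease was granted (as in the scenario above) and that the address is handed out again as soon as the lease expires.

```powershell
# Hypothetical helper (not a built-in cmdlet): estimates how long a stale
# DNS record can outlive an expired DHCP lease. All values are in days.
function Get-StaleRecordWindow {
    param(
        [int]$LeaseDays = 8,            # DHCP lease duration
        [int]$NoRefreshDays = 7,        # zone no-refresh interval
        [int]$RefreshDays = 7,          # zone refresh interval
        [int]$ScavengingPeriodDays = 7  # how often the server scavenges
    )

    # A record becomes eligible for scavenging only after both intervals pass.
    $eligibleAfter = $NoRefreshDays + $RefreshDays

    # Best case: a scavenging pass runs the moment the record becomes eligible.
    $best = [Math]::Max(0, $eligibleAfter - $LeaseDays)

    # Worst case: the record became eligible just after a pass finished.
    $worst = [Math]::Max(0, $eligibleAfter + $ScavengingPeriodDays - $LeaseDays)

    [pscustomobject]@{
        EligibleAfterDays  = $eligibleAfter
        BestCaseStaleDays  = $best
        WorstCaseStaleDays = $worst
    }
}

# Defaults from the scenario above.
Get-StaleRecordWindow

# A 14-day lease combined with a 1-day scavenging period.
Get-StaleRecordWindow -LeaseDays 14 -ScavengingPeriodDays 1
```

With the defaults, the stale record can linger anywhere from 6 to 13 days after the lease expires. With a 14-day lease and daily scavenging, the window shrinks to a day at most.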

The Problem

I mentioned this issue manifesting itself as a problem installing the SCCM client, but in reality we can demonstrate it with a much simpler example: accessing a shared folder. Ultimately the problem is the same.

So now let’s say I need access to a share on Client-A. Let’s use the administrative share as an example. Maybe a deployment requires this share to be accessible.

 

Figure 2

Well that’s interesting. First, Client-A isn’t even turned on…but we get a response. And of all things it’s a logon failure! Some of you may already realize this is what happens when we send a Kerberos ticket intended for one computer to another computer, but let’s quickly walk through this process.

We see that our computer (Infra-App1) does a DNS query for client-a.corp.contoso.com. It then gets a response back saying the IP address is 10.0.0.100.

 

Figure 3

As far as DNS is concerned, this is true. Client-A does have 10.0.0.100 listed as its IP address, but so does Client-B.
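
If you want to confirm this from the command line, a quick check like the following can help. It is only a sketch: it assumes the Resolve-DnsName cmdlet is available (Windows 8 / Windows Server 2012 or later) and that Client-B's FQDN is client-b.corp.contoso.com, which is an assumed name for illustration.

```powershell
# Resolve both names and compare the A records that DNS returns.
# client-b.corp.contoso.com is an assumed name for illustration.
'client-a.corp.contoso.com', 'client-b.corp.contoso.com' |
    ForEach-Object { Resolve-DnsName -Name $_ -Type A } |
    Select-Object -Property Name, IPAddress
```

If both rows come back with the same address, you are looking at the situation shown in Figure 1.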

Great, so now let’s go get a Kerberos ticket. Our DNS query was for Client-A, so our TGS request will be for Client-A as well. 

 

Figure 4

 

Figure 5

We can see the request in Figure 4, and the domain controller’s response in Figure 5.

Now that we have our ticket, we try to connect to what we think is Client-A. 

 

Figure 6

We see the Kerberos ticket is being presented in this frame.

And finally, we get an error returned from Client-A. Why? Because Client-A isn’t Client-A, it’s Client-B!

 

Figure 7

All of this just reiterates what you might have guessed. For Kerberos to work you have to present the right ticket to the right account.
(NOTE: for more information on Kerberos, go read Rob Greene’s blog, “Kerberos for the Busy Admin”)

Realize this works just fine if the IP address is used instead of the FQDN. Why? Because NTLM authentication will be used instead of Kerberos authentication. With the IP address, we make no assumption about which client we’re connecting to (which is why we have to negotiate NTLM in the first place). We’re simply connecting to an IP address. Some of you might be thinking that it should work when using the FQDN as well. After all, if Kerberos fails we try NTLM, right? Not quite. I won’t go into the details here, but we only fall back to NTLM if we fail to negotiate Kerberos. You can read more about it here. Either way, in our scenario Kerberos negotiation didn’t fail: the domain controller returned a valid ticket for Client-A. The failure came afterward, when that ticket was presented to Client-B.

Prevention

So now that we understand that this issue is related to stale DNS records, let’s discuss how we can prevent the problem from happening in the first place. There are a few different approaches, so let’s talk about each.

NOTE: For each of these I recommend lowering the scavenging interval (the server’s “Scavenging Period”) to 1-3 days. The 7-day default prolongs the time that invalid records remain in DNS.
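
On Windows Server 2012 or later, the scavenging interval can be set from PowerShell. This is a sketch, assuming the DnsServer module is installed and the commands are run on the DNS server that performs scavenging. The 1-day value is just one point in the suggested range.

```powershell
# Enable scavenging on this DNS server and run a scavenging pass every day.
Set-DnsServerScavenging -ScavengingState $true -ScavengingInterval (New-TimeSpan -Days 1)

# Review the server-level scavenging settings afterward.
Get-DnsServerScavenging
```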

  1. Increase the DHCP lease duration to match the “no-refresh + refresh” interval. In our example we would increase the DHCP lease to 14 days.
    1. Pros:
      1.  DHCP leases will remain until the DNS record is scavenged which means no other client will receive the address and register it in DNS
      2.  It’s easy.
    2. Cons:
      1.  If the DHCP scope is already low on addresses, you’ll likely run out
      2.  A small percentage of records may not be scavenged before the lease expires because of small time differences. Setting the scavenging interval to 1 day will ensure the defunct records are removed the next day.
  2. Decrease the “no-refresh + refresh” interval to match the DHCP lease. In our example we would decrease both “no-refresh” and “refresh” to 4 days.
    1. Pros:
      1.  The existing DNS record will be scavenged sooner, effectively achieving the same results as in the first solution
      2.  It’s easy.
    2. Cons:
      1.  Active Directory replication will increase (if these are AD integrated DNS zones). This is because the DNS records will be refreshed by the clients more often (every 4 days instead of every 7)
      2.  A small percentage of records may not be scavenged before the lease expires because of small time differences. Setting the scavenging interval to 1 day will ensure the defunct records are removed the next day
  3.  Allow the DHCP server to register the addresses on behalf of the clients.
    1. Pros:
      1.  The DHCP server will be able to remove the DNS record as soon as the lease expires
      2.  If set up correctly, no duplicate records should exist.
    2.  Cons:
      1.  The setup is more involved.
      2.  A dedicated account will need to be set up as the DHCP servers’ DNS dynamic update credentials, or all the DHCP servers will need to be added to the DnsUpdateProxy group (less secure), adding complexity.

For steps on doing this, read this KB article (around the Use the DnsUpdateProxy security group section).
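
As a rough sketch of what the third option looks like with the DhcpServer PowerShell module (Windows Server 2012 or later), the commands below have the DHCP server register records for every client and remove them when the lease expires. Treat it as a starting point rather than a replacement for the KB article. The account supplied at the credential prompt is assumed to be a dedicated, low-privilege account created for this purpose.

```powershell
# Always register A and PTR records on behalf of clients, and delete
# them from DNS when the lease expires (server-wide IPv4 setting).
Set-DhcpServerv4DnsSetting -DynamicUpdates Always -DeleteDnsRROnLeaseExpiry $true

# Use a dedicated account for DNS registrations instead of relying on the
# DnsUpdateProxy group. You will be prompted for the account.
Set-DhcpServerDnsCredential -Credential (Get-Credential)

# Confirm the server-level DNS settings.
Get-DhcpServerv4DnsSetting
```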

Experiment with the DHCP lease duration, and “no-refresh/refresh” intervals. You may find a need to depart completely from the defaults. Low DHCP lease durations (in the hours) are sometimes used for wireless subnets. Be mindful of the performance of your servers though, especially if you have a DNS server set to scavenge every few hours on very large DNS zones.
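
For experimenting, the first two options come down to one setting each. The sketch below assumes Windows Server 2012 or later with the DhcpServer and DnsServer modules, a scope ID of 10.0.0.0 (hypothetical; the example only gives one address, 10.0.0.100), and a zone named corp.contoso.com (inferred from the FQDN used earlier). Pick one option, not both.

```powershell
# Option 1: raise the lease to 14 days to match no-refresh (7) + refresh (7).
# The scope ID 10.0.0.0 is an assumed value for illustration.
Set-DhcpServerv4Scope -ScopeId 10.0.0.0 -LeaseDuration (New-TimeSpan -Days 14)

# Option 2 (use instead of option 1): keep the 8-day lease and lower the
# zone's intervals to 4 + 4 days. The zone name is inferred, not confirmed.
# Set-DnsServerZoneAging -Name 'corp.contoso.com' -Aging $true `
#     -NoRefreshInterval (New-TimeSpan -Days 4) -RefreshInterval (New-TimeSpan -Days 4)
```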

Identifying Records with Duplicate IPs

Almost there! Now, we understand the problem, when the problem happens, and how to prevent it. But how can we easily identify these duplicate records? You could search through DNS easily enough, but why not use PowerShell? 


#Import the Active Directory Module
import-module activedirectory

#Define an empty array to store computers with duplicate IP address registrations in DNS
$duplicate_comp = @()

#Get all computers in the current Active Directory domain along with the IPv4 address
#The IPv4 address is not a property on the computer account so a DNS lookup is performed
#Computers whose name does not resolve have no IPv4 address and are skipped
#The list of computers is sorted based on IPv4 address and assigned to the variable $comp
$comp = get-adcomputer -filter * -properties ipv4address | where-object {$_.ipv4address} | sort-object -property ipv4address

#For each computer object returned, assign just a sorted list of all
#of the IPv4 addresses for each computer to $sorted_ipv4
$sorted_ipv4 = $comp | foreach {$_.ipv4address} | sort-object

#For each computer object returned, assign just a sorted, unique list
#of all of the IPv4 addresses for each computer to $unique_ipv4
$unique_ipv4 = $comp | foreach {$_.ipv4address} | sort-object | get-unique

#compare $unique_ipv4 to $sorted_ipv4 and assign just the additional
#IPv4 addresses in $sorted_ipv4 to $duplicate_ipv4
#Each duplicated address is kept only once, even if three or more computers share it
$duplicate_ipv4 = Compare-object -referenceobject $unique_ipv4 -differenceobject $sorted_ipv4 | foreach {$_.inputobject} | sort-object -unique

#For each instance in $duplicate_ipv4 and for each instance
#in $comp, compare $duplicate_ipv4 to $comp. If they are equal, assign
#the computer object to array $duplicate_comp
foreach ($duplicate_inst in $duplicate_ipv4)
{
    foreach ($comp_inst in $comp)
    {
        if ($duplicate_inst -eq $comp_inst.ipv4address)
        {
            $duplicate_comp += $comp_inst
        }
    }
}

#Pipe all of the duplicate computers to a formatted table
$duplicate_comp | ft name,ipv4address -a


Here’s a sample of the output:

 

Figure 8

This is a pretty straightforward PowerShell script. Consider it a sample. This will only return duplicate IP addresses registered to actual computer accounts in Active Directory. Keep in mind it will query every computer in an Active Directory domain and then it will do a DNS query to get the IP address. If you have many computers, use the -SearchBase parameter with get-adcomputer to limit the number of computers returned each time. If the computer is not joined to AD it will never be returned from get-adcomputer. This is really aimed at finding records in DNS that contain duplicate IP addresses because of the scenario listed above.
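
If you also care about machines that aren't joined to the domain, one alternative is to group the zone's A records by address. The following is a minimal sketch, not a tested tool: Find-DuplicateAddress is a hypothetical helper, and the sample data stands in for real records so the logic can be tried anywhere.

```powershell
# Hypothetical helper: returns every record whose IPAddress is shared
# with at least one other record.
function Find-DuplicateAddress {
    param([Parameter(Mandatory)][object[]]$Record)

    $Record |
        Group-Object -Property IPAddress |
        Where-Object { $_.Count -gt 1 } |
        ForEach-Object { $_.Group } |
        Sort-Object -Property IPAddress, HostName
}

# Sample data for illustration only.
$sample = @(
    [pscustomobject]@{ HostName = 'client-a'; IPAddress = '10.0.0.100' }
    [pscustomobject]@{ HostName = 'client-b'; IPAddress = '10.0.0.100' }
    [pscustomobject]@{ HostName = 'client-c'; IPAddress = '10.0.0.101' }
)

Find-DuplicateAddress -Record $sample | Format-Table -AutoSize
```

Against a real zone on Windows Server 2012 or later, the input could come from Get-DnsServerResourceRecord with -RRType A, projecting each record's HostName and its RecordData.IPv4Address (converted to a string) into the two properties the helper expects.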

Summary

There are a number of articles and blogs that discuss this issue in some shape or form. My goal was to tie all of these separate pieces together to make the big picture a little clearer. To recap:

  • The Scenario
    • Default “no-refresh/refresh” interval coupled with the default DHCP lease = stale DNS records.
  • The Symptoms
    • SCCM “Failed to get token for current process (5)”
    • File Shares “Logon Failure: The target account name is incorrect”
    • Many, many others (potentially anything using Kerberos)
  • The Problem
    • Kerberos authentication requires the ticket be specific to a computer. Stale DNS records mean we could be sending the Kerberos ticket to the wrong computer.
  • The Fix
    • A combination of changing the “no-refresh/refresh” intervals and the DHCP lease period.
    • Configuring DHCP to register records for the clients.
    • Identifying and removing duplicate records (either waiting for scavenging or using the provided script to find them).

I hope you find this information useful! Tuning these DHCP and DNS settings will leave your environment in a much healthier state!

-Sean Ivey

Comments

  • Anonymous
    January 01, 2003
    Great Post.

  • Anonymous
    December 28, 2011
    Good write up, with the advent of mobile computing this was a big issue for several clients.  

  • Anonymous
    February 24, 2012
    Finally an article about scavenging that is clear and detailed. This helped me get a grip on the subject. Thanks!

  • Anonymous
    March 05, 2012
    Excelent article !! I was researching this issue, had few ideas my own, but this put all together nicely

  • Anonymous
    September 28, 2012
    Very clear and detailed! Excellent information! We're experiencing the same issue for years. We knew where it came from but didn't analyse all remediation options as you did here beautifully. This makes the choices to make much more easier.

  • Anonymous
    February 07, 2013
    THank you very much.  It helps me a lot to solve my issue

  • Anonymous
    March 13, 2013
    Hopefully this blog is still active but here goes. We are having an issue where when computers lose their connection to the network and then come back up and their machine renews to a new IP address but DHCP still holds onto the old IP address almost exhausting the DHCP leases for that subnet? The machines are not obtaining the same IP address it had before. We are seeing some schools have an issue where the school drops power then when power is restored new IP addresses are given out and DHCP holds onto the old IP’s and the scope for that subnet’s IP scope is almost depleted of addresses. We set the renew date down to a day and restart the DHCP service and that seems to fix the issue for now. We have all our DHCP scopes housed on one 2008 server. Thank you

  • Anonymous
    October 30, 2013
    I am a bit confused by your first two prevention options.  In option 1, you suggest increasing DHCP lease duration to 14 days.  Is that leaving the no-refresh/refresh at the default 7 days?  I ask because doesn't DHCP leases attempt to renew half way through the lease duration?  So if the duration was 14 days, at 7 days, the client would attempt to renew.  Same logic in option 2, decreasing the no-refresh/refresh to 4 days due to the default DHCP lease time being 8 days.  I apologize if I am over thinking this or incorrect.  Just trying to clarify.  Thanks for your time and this article as there aren't many out there that go into such detail.  

  • Anonymous
    October 31, 2013
    Awesome! Thanks for taking to the time to reply.  I appreciate it along with the article!

  • Anonymous
    December 11, 2013
    Thanks for your efforts writing this article.

  • Anonymous
    December 16, 2013
    How about a dhcp leasetime at 8 hours, what vould you suggest the scavenging to be 5 hours??

  • Anonymous
    January 28, 2014
    Perfect!!!

  • Anonymous
    July 17, 2014
    As a Microsoft Premier Field Engineer I frequently get asked for more information on Active Directory topics. Most of the time I end up passing along one or more of the links in today's post. This list will be extremely valuable for anyone who wants

  • Anonymous
    July 25, 2014
    Wow thanks a lot Sean for this article, it is so clearly explained....

  • Anonymous
    September 19, 2014
    It was suggested we put our wired clients on a 365 day lease, for security reasons, but our guest clients are on a 3 day lease. I am struggling with how to set up the scavenging. Please help me understand.

  • Anonymous
    November 05, 2014
    Excellent Article.

  • Anonymous
    November 07, 2014
    WHAT IS SCAVAGING

  • Anonymous
    December 14, 2014
    Great article....

  • Anonymous
    May 09, 2015
    Great Article helped lot.

  • Anonymous
    July 01, 2015
    Bravo Sean,

    Very well put together article that is going to help clear up an issue we have been encountering.

  • Anonymous
    November 19, 2015
    Awesome Article. Thanks for this:-)

  • Anonymous
    December 02, 2015
    Adam is correct I am seeing the same result with the powershell script. solution?

  • Anonymous
    January 18, 2016
    This is a full solution on the scenarios described above!! Thanks so much for sharing this detailed step-by-step process!!

  • Anonymous
    January 19, 2016
    Top quality article, really well explained and gives clarity on the direction i need to go in order to save myself some more headaches in the future. My network is now operating on all cylinders again!
    Thanks!