Disaster Recovery of Active Directory Components
Active Directory Forest Recovery
Active Directory Components
Active Directory Domain or Domains
Global Catalog
FSMO Roles
SYSVOL
Schema Partition
Configuration Partition
Application Partition
Best Practices:
Back up the Active Directory database every week, or at the very least once within the tombstone lifetime. A backup older than the tombstone lifetime cannot be used for a restore.
Create an isolated AD site that is assigned to a subnet not associated with any user, workstation or server subnet. Place a domain controller from each domain in this site and set the replication interval to 7 days.
This Active Directory site can be used to authoritatively restore an accidentally deleted object without restoring from backup.
Because the replication interval is long, you are likely to notice an accidental deletion before it replicates to the isolated Active Directory site.
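The backup rule above comes down to a date comparison. Here is a minimal sketch of that check; the function names are hypothetical and the 180-day value is only an example, so read the real tombstone lifetime from your own forest.

```python
from datetime import datetime, timedelta


def backup_is_restorable(backup_time, now, tombstone_days):
    """Return True while the backup is younger than the tombstone lifetime."""
    return now - backup_time < timedelta(days=tombstone_days)


def days_until_backup_expires(backup_time, now, tombstone_days):
    """Whole days left before the backup outlives the tombstone lifetime (0 if expired)."""
    remaining = backup_time + timedelta(days=tombstone_days) - now
    return max(remaining.days, 0)


if __name__ == "__main__":
    taken = datetime(2024, 1, 1)   # example backup date
    today = datetime(2024, 3, 1)   # example "now"
    # 180 is an assumed tombstone lifetime, not a value read from a forest.
    print(backup_is_restorable(taken, today, tombstone_days=180))
    print(days_until_backup_expires(taken, today, tombstone_days=180))
```

A scheduled job could run such a check against the timestamp of the latest System State backup and raise an alert well before the backup becomes unusable.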
Possible Active Directory Disasters:
Deletion of Users, Groups, Computers or Organizational Units
Corruption of Global Catalog
Lingering Objects
Corruption of SYSVOL
Corruption of Configuration / Application Partition
Corruption in DNS
Lost Domain Controller with FSMO role
Corruption while updating Schema
Consider the following Active Directory design:
| Title | Details |
| --- | --- |
| Forest Root Domain | SARVESH.LOCAL |
| Child Domains | CHILDA.SARVESH.LOCAL & CHILDB.SARVESH.LOCAL |
| Root Domain Controllers | DC01.SARVESH.LOCAL |
| ChildA Domain Controllers | DC02.ChildA.Sarvesh.local & DC03.ChildA.Sarvesh.local |
| ChildB Domain Controllers | DC04.ChildB.Sarvesh.local & DC05.ChildB.Sarvesh.local |
Deletion of Users, Groups, Computers and Organizational Units:
Points to note before attempting recovery:
Is there a writable domain controller that has not yet received the deletion through replication?
Is the deleted item still within the tombstone lifetime, or has that period already passed?
When was the most recent Active Directory backup taken?
If there is a Domain Controller that has not received deletion updates then:
Stop the Inbound Replication immediately and perform Authoritative Restore.
First Execute:
Repadmin /options <DC_Name> +DISABLE_INBOUND_REPL
Perform Authoritative Restore:
User Restoration:
Let us assume the user Joe was deleted and needs to be authoritatively restored from a domain controller that has not yet received the deletion through replication.
Stop the AD DS service (for Windows Server 2008 and later domain controllers):
Net stop NTDS
[Screenshots: USN before restore, USN after restore]
**Benefits of this approach**: This is the easiest way to update all partner servers and recreate the object on them. No attributes or group memberships are lost.
Always perform recovery on a Global Catalog. Why?
We suggest restoring users and groups on a Global Catalog server because it holds information about Universal and Global group membership across the forest and about the Domain Local groups of its own domain. This helps in recovering group membership and in creating the LDIF file for further recovery.
Reanimation of Deleted Object from Tombstone:
Explanation: When an object is deleted, Active Directory renames it, strips most of its attributes and sets the isDeleted attribute to TRUE. The resulting tombstone stays in the Deleted Objects container of the domain. Tombstones are permanently removed once they are older than the tombstone lifetime and the garbage collection process has run. Garbage collection runs every 12 hours on each domain controller.
How to find out the tombstone lifetime of the forest?
Execute
dsquery * "CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,<DN of your forest root domain>" -scope base -attr tombstoneLifetime
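The distinguished name in that query is easy to mistype. The sketch below builds the command line from the DNS name of the forest root domain. The helper names are hypothetical, and I add `-scope base` on the assumption that you want only the Directory Service object itself rather than its child objects.

```python
def domain_to_dn(dns_name):
    """Convert a DNS domain name such as 'sarvesh.local' to 'DC=sarvesh,DC=local'."""
    labels = [label for label in dns_name.strip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    return ",".join(f"DC={label}" for label in labels)


def tombstone_query(forest_root_dns_name):
    """Build the dsquery command line that reads tombstoneLifetime."""
    base = ("CN=Directory Service,CN=Windows NT,CN=Services,"
            "CN=Configuration," + domain_to_dn(forest_root_dns_name))
    # -scope base limits the search to the named object (assumed to be what is wanted).
    return f'dsquery * "{base}" -scope base -attr tombstoneLifetime'


if __name__ == "__main__":
    print(tombstone_query("SARVESH.LOCAL"))
```

The printed line can be pasted into a command prompt on any domain-joined machine that has the directory service command-line tools installed.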
Restore Object from Tombstone:
In the example below, I have set some attributes for Joe (Title, Description and Manager) and added the account to three groups: G_DomainLocal, G_Global and G_Universal. We will now delete Joe, restore the account from its tombstone and confirm that these attributes and group memberships are NOT restored.
So, we have noted down some of the attributes and deleted the user from Active Directory.
Reanimate the object from its tombstone.
Have a look at the attributes. Description, Manager, Title and group membership have not been restored. However, the SID of the object after recovery is the same.
Please note that the restored object will be in a disabled state; you need to enable it before attempting to log in.
Suggestion: It is always preferable to restore the object from backup, or to use DSAMAIN to mount an earlier backup and inspect the older state of the object being restored. You can note down the group membership and add the user to the required groups once the object is restored.
Authoritative Restore of Organizational Unit:
Scenario: We have a Sales OU with three groups (Domain Local, Global and Universal) and a user account.
Let us delete the OU and restore it from backup.
Perform a non-authoritative restore of Active Directory.
Authoritatively restore the OU:
Restore subtree "OU=Sales,DC=Sarvesh,DC=local"
You will notice that the USNs and version numbers of the OU and of every object in it have increased. This makes them authoritative in the domain, so the other domain controllers accept them through replication.
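Why does the restored OU win over the deletion? Active Directory settles competing updates to an attribute by comparing the version number first, then the originating timestamp, then the GUID of the originating domain controller. An authoritative restore raises the version number so that the restored value outranks anything written since the backup. The sketch below models that comparison. The field names are illustrative, and the increment of 100,000 per day since the backup is what I understand NTDSUTIL's default to be, so treat it as an assumption.

```python
from collections import namedtuple

# Simplified per-attribute replication metadata (field names are illustrative).
Stamp = namedtuple("Stamp", ["version", "timestamp", "dsa_guid"])


def bump_for_authoritative_restore(stamp, days_since_backup, per_day=100_000):
    """Raise the version number so the restored value outranks later changes.

    per_day is an assumed default; the real tool decides the increment.
    """
    return stamp._replace(version=stamp.version + per_day * days_since_backup)


def winner(a, b):
    """Pick the stamp replication would keep: version, then timestamp, then GUID."""
    return max(a, b)  # namedtuples compare field by field, in declaration order


if __name__ == "__main__":
    restored = Stamp(version=3, timestamp=100, dsa_guid="a")   # value from backup
    deletion = Stamp(version=5, timestamp=200, dsa_guid="b")   # newer change in the domain
    print(winner(restored, deletion))
    print(winner(bump_for_authoritative_restore(restored, 2), deletion))
```

Without the bump the deletion wins because its version is higher; with the bump the restored value wins on every domain controller that receives it.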
Perform Non-Authoritative Restore, Then Mark the Object Authoritative:
Assume that an Active Directory object has been deleted and the deletion has replicated to all domain controllers in the domain. Either the tombstone has already been removed, or you need to restore all attributes from backup. Perform the steps below:
Restart the domain controller in Directory Services Restore Mode and log in with the DSRM password
Restore the System State to its original location
Do not restart the domain controller at this point
Open a command prompt and run NTDSUTIL
Type: activate instance NTDS to activate the current NTDS database
Type: authoritative restore
Type: restore object <DN> to restore the object
Type: quit twice to leave NTDSUTIL, then restart the domain controller in normal mode so that the restored object replicates to the other domain controllers
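The NTDSUTIL part of these steps is a fixed sequence of prompt inputs in which only the distinguished name changes. The sketch below generates that sequence for a single object or for a whole subtree. The function name is hypothetical; it only produces the text to type and does not run NTDSUTIL.

```python
def authoritative_restore_inputs(dn, subtree=False):
    """Return the lines to type at the NTDSUTIL prompts, in order."""
    if not dn or '"' in dn:
        raise ValueError("DN must be non-empty and must not contain double quotes")
    verb = "restore subtree" if subtree else "restore object"
    return [
        "activate instance ntds",
        "authoritative restore",
        f'{verb} "{dn}"',   # quoted so DNs containing spaces stay in one argument
        "quit",             # leave the authoritative restore prompt
        "quit",             # leave NTDSUTIL
    ]


if __name__ == "__main__":
    for line in authoritative_restore_inputs("OU=Sales,DC=Sarvesh,DC=local", subtree=True):
        print(line)
```

Printing the sequence into the change record before the restore makes it easier to review the DN with a second person while the domain controller is still in Directory Services Restore Mode.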
Recovery of Global Catalog:
A Global Catalog server holds a writable copy of its own domain and a read-only copy of every other domain in the forest. The read-only copies contain only a small set of commonly used attributes, referred to as the Partial Attribute Set. Applications such as Exchange use the Global Catalog to find users in the forest and to resolve group membership across the forest.
Think of a scenario where you have resolved a lingering objects issue across domains, but the Global Catalog information for other domains is still corrupted. Active Directory and Exchange Server then return wrong results, and a corrupted Global Catalog may cause email delivery failures for recipients in the local forest.
Scenario: ChildA.Sarvesh.local has two domain controllers, DC02 and DC03, and ChildB.Sarvesh.local has two domain controllers, DC04 and DC05. All of them are Global Catalog servers.
DC02 goes offline and comes back online after the tombstone lifetime has passed. It will therefore show deleted objects of ChildA as active and enabled, and its Global Catalog will show ChildB's deleted objects as active. This results in Active Directory inconsistency across domain controllers.
If strict replication consistency is enabled, this produces a good number of Active Directory replication errors; if it is disabled, it produces lingering objects.
We now need to rebuild the Global Catalog copy of the ChildB domain on DC02, which is a domain controller for ChildA.
Recovery:
First:
Enable Strict Replication Consistency across Forest
Repadmin /regkey * +strict
Disable outbound replication, so that nothing replicates from DC02 to other domain controllers in the forest while we delete and rebuild its Global Catalog information.
Note: It is extremely critical to disable outbound replication before rebuilding the Global Catalog.
Repadmin /options DC02 +DISABLE_OUTBOUND_REPL
Now, let us rebuild the Global Catalog:
Repadmin /rehost DC02 DC=ChildB,DC=Sarvesh,DC=local DC04.CHILDB.SARVESH.LOCAL
Repadmin /rehost <DC Name> <Naming Context of domain> <Good DC of domain containing writable copy>
Rehosting drops the read-only copy of another domain's partition and rebuilds it from a domain controller that holds a writable copy of that partition. The naming context is given as a distinguished name.
Once rehosting is done, enable outbound replication:
Repadmin /options DC02 -DISABLE_OUTBOUND_REPL
It is then recommended to use LDP.exe to query both LDAP and the Global Catalog and verify that the lingering objects are gone from the Global Catalog.
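The order of the repadmin commands matters: outbound replication must be off before the rehost and back on afterwards. The sketch below produces the commands in that order for one partition. The function name is hypothetical, and passing the naming context as a distinguished name is my assumption about what repadmin expects.

```python
def gc_rehost_plan(target_dc, naming_context_dn, good_source_dc):
    """Ordered repadmin commands for rebuilding one read-only partition on a GC."""
    for value in (target_dc, naming_context_dn, good_source_dc):
        if not value or " " in value:
            raise ValueError(f"unexpected value: {value!r}")
    return [
        "repadmin /regkey * +strict",                                   # strict consistency forest-wide
        f"repadmin /options {target_dc} +DISABLE_OUTBOUND_REPL",        # stop the DC sending changes
        f"repadmin /rehost {target_dc} {naming_context_dn} {good_source_dc}",
        f"repadmin /options {target_dc} -DISABLE_OUTBOUND_REPL",        # resume outbound replication
    ]


if __name__ == "__main__":
    for command in gc_rehost_plan("DC02", "DC=ChildB,DC=Sarvesh,DC=local",
                                  "DC04.ChildB.Sarvesh.local"):
        print(command)
```

Generating the plan per partition also helps when a Global Catalog server hosts read-only copies of several domains and each one has to be rehosted from a different source.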
Seizing of FSMO Roles:
If the domain controller holding any FSMO role is down for an extended period and you need to seize the role on another domain controller in the domain or forest, follow the steps below on the domain controller that will take over the role.
Run NTDSUTIL
Type: Roles
Type: Connections
Type: Connect to server localhost
Type: Quit (to return to the fsmo maintenance prompt)
Type: Seize PDC (to seize PDC Emulator role)
Type: Seize naming master (to seize domain naming master)
Type: Seize Infrastructure master (to seize Infrastructure master role)
Type: Seize Schema master (to seize Schema master role)
Type: Seize RID master (to seize RID Master role)
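Only the roles that were on the failed domain controller should be seized, so it helps to generate the prompt inputs for just those roles. The sketch below does that; the short role keys and the function name are hypothetical, and it includes a quit after the connection step on the assumption that NTDSUTIL must return to the fsmo maintenance prompt before a seize command is accepted.

```python
# Short keys (hypothetical) mapped to the NTDSUTIL seize commands listed above.
SEIZE_COMMANDS = {
    "pdc": "seize pdc",
    "naming": "seize naming master",
    "infrastructure": "seize infrastructure master",
    "schema": "seize schema master",
    "rid": "seize rid master",
}


def seize_inputs(server, roles):
    """Return the NTDSUTIL prompt inputs that seize the given roles on `server`."""
    unknown = [role for role in roles if role not in SEIZE_COMMANDS]
    if unknown:
        raise ValueError(f"unknown role(s): {', '.join(unknown)}")
    lines = ["roles", "connections", f"connect to server {server}", "quit"]
    lines += [SEIZE_COMMANDS[role] for role in roles]
    lines += ["quit", "quit"]  # leave fsmo maintenance, then NTDSUTIL
    return lines


if __name__ == "__main__":
    for line in seize_inputs("localhost", ["pdc", "rid"]):
        print(line)
```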
What happens when the original FSMO role holder comes back online?
If the domain controller that previously held any of the roles comes back online, please ensure that you:
Disable outbound replication on it
Perform inbound replication from another domain controller in the forest, so that it receives the changes made in Active Directory and learns about the change in FSMO role ownership
Enable outbound replication
SYSVOL Recovery:
FRS vs. DFSR
Active Directory domains below the Windows Server 2008 domain functional level use FRS to replicate SYSVOL. Domains that were upgraded from Windows Server 2003 to Windows Server 2008 and have raised the domain functional level to 2008 need to go through the FRS to DFSR migration for the SYSVOL contents.
Domains that are installed at the Windows Server 2008 functional level use DFSR-based SYSVOL replication natively.
Recovery of DFSR based SYSVOL:
How to perform a non-authoritative restore of SYSVOL from other replication partners:
Log in to the domain controller that needs to replicate a fresh copy of the SYSVOL contents
Launch ADSIEDIT.msc
Connect to the Default Naming Context
Double-click CN=SYSVOL Subscription,CN=Domain System Volume,CN=DFSR-LocalSettings,CN=<DC_Name>,OU=Domain Controllers,DC=<your Domain DN>
Look for the attribute msDFSR-Enabled and change the value to FALSE
Replicate Active Directory throughout the domain and wait for the change to reach all replication partners in the domain
Run the command DFSRDIAG /POLLAD
You should now see event ID 4114, stating that SYSVOL is no longer being replicated
Change the value of msDFSR-Enabled back to TRUE in ADSIEDIT for the same domain controller
Force Active Directory replication and run the command
DFSRDIAG /POLLAD
SYSVOL should now start replicating from the other replication partners
Recovery of FRS based SYSVOL:
1. Stop the NTFRS service. Type below in command prompt
Net stop NTFRS
2. Open registry and browse following path:
HKLM\System\CurrentControlSet\Services\NTFRS\Parameters\Backup/Restore\Process at Startup
3. In the right pane, double-click BurFlags and set the value to D2 (hexadecimal). Click OK
4. Quit Registry and start NTFRS service
net start NTFRS
This triggers replication of SYSVOL from other partners using FRS.
Note: Please ensure that Active Directory replication across the forest is healthy before fixing a SYSVOL replication issue.
Review the event logs, replication logs and directory services logs before attempting SYSVOL recovery.
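A common slip in the BurFlags step is entering D2 as a decimal number. D2 is hexadecimal, which is 210 in decimal. The sketch below builds an equivalent `reg add` command with the decimal value, as an alternative to editing the registry by hand. The function name is hypothetical, and the key path (with a backslash before "Process at Startup") is my reading of the FRS registry layout, so verify it on your system first.

```python
# Registry key that holds BurFlags (path as I understand it; verify before use).
FRS_KEY = (r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs"
           r"\Parameters\Backup/Restore\Process at Startup")

# D2 (hex) requests a non-authoritative SYSVOL restore, as in step 3 above.
BURFLAGS_NON_AUTHORITATIVE = 0xD2


def burflags_command(value=BURFLAGS_NON_AUTHORITATIVE):
    """Build a reg add command that sets BurFlags; the value is written in decimal."""
    return f'reg add "{FRS_KEY}" /v BurFlags /t REG_DWORD /d {value} /f'


if __name__ == "__main__":
    print(BURFLAGS_NON_AUTHORITATIVE)   # decimal form of D2
    print(burflags_command())
```

As in the manual procedure, the NTFRS service has to be stopped before the value is set and started again afterwards.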
How to make logons and SYSVOL work?
Stop FRS replication service on all DCs.
There are two GPOs required to process logons:
Default Domain Controllers Policy {6AC1786C-016F-11D2-945F-00C04fB984F9}
Default Domain Policy {31B2F340-016D-11D2-945F-00C04FB984F9}
Copy these policies to the SYSVOL share of every domain controller
How to recreate Default Group Policies?
Use the DCGPOFIX utility to recreate the Default Domain Policy, the Default Domain Controllers Policy or both
Dcgpofix /target:both
Recovery of DNS
Active Directory integrated zones can be stored in four places:
Domain partition: sometimes referred to as the legacy partition; replicated to all domain controllers in the domain
DomainDnsZones partition: replicated to all domain controllers in the domain that have the DNS role installed
ForestDnsZones partition: replicated to all domain controllers in the forest that have the DNS role installed
Custom application partition: replicated to the DNS servers enlisted in the scope of that directory partition
Run dnscmd /enumzones to see which Active Directory partition stores each DNS zone.
Use the command below to change the directory partition of a zone:
Dnscmd /zonechangedirectorypartition <zone-name> /forest | /domain | /legacy
DNS export and recovery:
Log in to a domain controller that has the DNS role installed
Dnscmd <server_name> /zoneexport <zone-name> <export-file-name>
The export file is written to the %SystemRoot%\System32\dns folder of the DNS server. Once the zone is exported, you can import it as a standalone DNS zone and later store it in Active Directory if required.
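Because the export refuses to overwrite an existing file, a date-stamped file name keeps repeated exports from failing. The sketch below builds the dnscmd arguments for such an export. The function name and the file-name pattern are hypothetical.

```python
from datetime import date


def zone_export_command(server, zone, day):
    """Return the dnscmd argument list for a date-stamped zone export."""
    if not server or not zone:
        raise ValueError("server and zone are required")
    # Example pattern only: <zone>.<YYYYMMDD>.export.dns
    filename = f"{zone}.{day:%Y%m%d}.export.dns"
    return ["dnscmd", server, "/zoneexport", zone, filename]


if __name__ == "__main__":
    print(" ".join(zone_export_command("DC01", "sarvesh.local", date.today())))
```

The argument list form can be handed to a scheduler or to `subprocess.run` on the DNS server; joining it with spaces gives the command line shown above.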
DNS Best Practices:
If the site has multiple DNS servers for Active Directory, point each one at another DNS server as its primary DNS server and at itself as secondary.
If there is no other DNS server in the local site, use a remote-site DNS server as the secondary DNS server on the domain controller.
If there are multiple domains in the forest, it is advisable to push the DNS suffix search list to all workstations instead of relying on WINS. This reduces the load on WINS.
DNS zones stored in any Active Directory partition are restored along with Active Directory by whichever recovery method you choose. However, we recommend also keeping an export of the Active Directory domain DNS zone to help during a critical recovery.
Recover Application Directory Partition
Restart the server in Directory Services Restore Mode or stop the NTDS service
Net stop NTDS
Run NTDSUTIL
Type activate instance NTDS
Type Authoritative Restore
Type: List NC CRs
To restore an application partition, make a note of the partition and its corresponding cross-reference (crossRef) object
Type: restore subtree <Partition DN>
Restore subtree "DC=ForestDNSZones,DC=Sarvesh,DC=local"
Type: restore object <cross-ref DN>
Schema and Forest Recovery:
Changes made to the schema can't be reversed. The schema partition can't be authoritatively restored. Attributes that have been added to the schema can be deactivated but can't be removed.
If a custom application has made unwanted schema changes, or a schema update has malfunctioned, the only option left is to restore from backup and re-promote all DCs.
Best practice for Schema Updates:
Before you begin updating the schema, even for well-known applications like Exchange or Lync, it is recommended to:
Disable outbound replication
- repadmin /options <DC_Name> +DISABLE_OUTBOUND_REPL
Run the schema update and verify that all changes are as expected
Enable outbound replication
- repadmin /options <DC_Name> -DISABLE_OUTBOUND_REPL
Let us assume the schema changes were not successful on the DC that had outbound replication disabled. Do the following:
Keep outbound replication disabled.
Log in to another DC in the same domain.
Seize the Schema Master FSMO role
Perform metadata cleanup for the DC that had the failed schema update
Format and rebuild the DC that had the failed schema update
We suggest keeping the Schema Admins group empty and adding the service or administrator account to it only when necessary. This reduces the chance of schema modification, even by mistake.
Recover forest from unrecoverable state:
Backups, backups and backups are extremely critical for any organization, in particular if schema changes have replicated across the forest or there is other damage, such as corruption, that cannot be reversed.
In this case the only option left is to restore every domain in the forest from backup and re-promote all other DCs in the forest. This needs to be done with extreme care, as any changes made after the backup will be lost.
Please ensure that an Active Directory forest backup (i.e. a backup of each domain) is taken before performing any major activity, and that rollback steps are clearly defined and tested.