Where’s my file? Root cause analysis of FRS and DFSR data deletion
Hi, Ned here. In the Directory Services support space here at Microsoft, we are often contacted by customers for disaster recovery scenarios. We’re also brought in for deeper forensic analysis of what lead to a problem. Today we’re going to talk about a situation that covers both:
- A customer has seen some critical data go missing.
- That data was replicated via the File Replication Service (FRS) or the Distributed File System Replication (DFSR) Service.
- Before they restore the data with their backup copy, they want to have root cause on who deleted what and where it started. We can’t do this after restoring data because our whole audit trail will of course be destroyed within the respective JET databases.
FRS Deletion Forensics – The Where and When
You need to start by determining the name of some folder or file that has been deleted. It's important that this be exact as we will be using it to search. You will need the full original path since it is possible just the name could be duplicated throughout the content set.
- For this example we have three servers called 2003SRV13, 2003SRV16, and 2003SRV17.
- We have a folder called c:\frstestlink\importantfolder13 that has been deleted.
- It contained a file called c:\frstestlink\importantfolder13\importantfile13.doc which was deleted (naturally). Our folder could contain thousands of files but we just need to know one. That’s easy, someone is screaming at you that it’s missing. :)
Install FRSDIAG on any server that participated in the FRS content set where data was deleted.
Open a CMD prompt and navigate to the FRSDIAG directory. This will default to:
c:\program files\windows resource kit\tools\frsdiag
You’ll see that we have a very useful utility called NTFRSUTL.EXE. Running it with /? will show you its options:
ntfrsutl [idtable| configtable | inlog | outlog] [computer] = enumerate the service's idtable/configtable/inlog/outlog
computer = talk to the NtFrs service on this machine.
ntfrsutl [memory|threads|stage] [computer]
= list the service's memory usage
computer = talk to the NtFrs service on this machine.
ntfrsutl ds [computer]
= list the service's view of the DS
computer = talk to the NtFrs service on this machine.
ntfrsutl sets [computer]
= list the active replica sets
computer = talk to the NtFrs service on this machine.
ntfrsutl version [computer]
= list the api and service versions
computer = talk to the NtFrs service on this machine.
ntfrsutl forcerepl [computer] /r SetName /p DnsName
= Force FRS to start a replication cycle ignoring the schedule.
= Specify the SetName and DnsName.
computer = talk to the NtFrs service on this machine.
SetName = Name of the replica set.
DnsName = DNS name of the inbound partner to force repl from.
ntfrsutl poll [/quickly[=[N]]] [/slowly[=[N]]] [/now] [computer]
= list the current polling intervals.
now = Poll now.
quickly = Poll quickly until stable configuration retrieved.
quickly= = Poll quickly every default minutes.
quickly=N = Poll quickly every N minutes.
slowly = Poll slowly until stable configuration retrieved.
slowly= = Poll slowly every default minutes.
slowly=N = Poll slowly every N minutes.
computer = talk to the NtFrs service on this machine.
Cool stuff – you can force replication, list out various FRS configuration info, or get performance stats. It’s great for scripting.
In order to figure what happened, we will dump out some tables from the FRS JET database by executing:
NTFRSUTL OUTLOG > outlog.txt
NTFRSUTL IDTABLE > idtable.txt
(Note: the output from NTFRSUTL IDTABLE is not the same as collecting IDTABLE information with FRSDIAG’s GUI console).
We then start FRSDIAG and click the 'Browse' button. Drop down the Replica Set and select the one that contained deleted data. Click 'Add All' to add the members. Then click ok.
Click 'Tools' then select 'Build GUID2Name for Target Server(s) '.
This will create us a text file that lists out GUID and its associated SERVER NAME, like so:
======================================================
Replica Set GUID : 6f83352f-f404-4eda-a714ae1691e3e9d8
Replica Set Name : FRSTEST|FRSTESTLINK
======================================================
GUID MEMBER NAME SERVER NAME
---- ----------- -----------
30409f5d-8493-41ad-a98ab03fc1b795e5 {6AEC89F5-24B5-4C1B-B15E-6EFE5A60B75C} 2003srv16 e8219dee-532a-4dff-83f09f036e331daa {6F617C11-2997-4134-952B-5B3572D4AF70} 2003srv17
e8feaedc-6bce-41f4-94c39cade3932da8 {88600E1E-AAF8-4C70-A1C3-A36EB471E2B3} 2003srv13
We open the IDTABLE.TXT file and search for our file we know was deleted:
Table Type: ID Table for FRSTEST|FRSTESTLINK (1)
FileGuid : 3647d318-502f-11dc-a1070003ff6813c5
FileID : 00130000 00003e45
ParentGuid : 2d7a8327-7308-4464-a63e367e39c27690 << Folder it was in
ParentFileID : 000d0000 00003e37
VersionNumber : 00000001
EventTime : Wed Aug 22, 2007 11:57:26 << when deleted
OriginatorGuid : 30409f5d-8493-41ad-a98ab03fc1b795e5 << where deleted
OriginatorVSN : 01c7e4c3 ea6dd496
CurrentFileUsn : 00000000 001be408
FileCreateTime :
FileWriteTime :
FileSize : 00000000 000000a0
FileObjID : 3647d318-502f-11dc-a1070003ff6813c5
FileName : importantfile13.doc << here's our file
FileIsDir : 00000000
FileAttributes : 00000020 Flags [ARCHIVE ]
Flags : 00000001 Flags [DELETED ] << Proof of deletion
ReplEnabled : 00000001
TombStoneGC : Sun Oct 21, 2007 11:57:26
OutLogSeqNum : 00000000 00000000
Spare1Ull : 00000000 00000001
MD5CheckSum : MD5: a41eea20 979f04e9 dff7592a e8dc3e8b
RetryCount : 0
FirstTryTime :
We then look in the OUTLOG.TXT to confirm the folder matches up using ParentGuid above:
Table Type: Outbound Log Table for FRSTEST|FRSTESTLINK (1)
SequenceNumber : 0000008d
Flags : 01000024 Flags [Content LclCo CmpresStage ]
IFlags : 00000001 Flags [IFlagVVRetireExec ]
State : 00000014 CO STATE: IBCO_OUTBOUND_REQUEST
ContentCmd : 00002000 Flags [RenNew ]
Lcmd : 0000000f D/F 1 NoCmd
FileAttributes : 00000010 Flags [DIRECTORY ]
FileVersionNumber : 00000001
PartnerAckSeqNumber : 00000000
FileSize : 00000000 00000000
FileOffset : 00000000 00000000
FrsVsn : 01c7e44b df699803
FileUsn : 00000000 001bd690
JrnlUsn : 00000000 001bd690
JrnlFirstUsn : 00000000 001bd690
OriginalReplica : 1 [???]
NewReplica : 1 [???]
ChangeOrderGuid : cf3bc76b-b3e1-4e72-ae8962371bb48501
OriginatorGuid : e8feaedc-6bce-41f4-94c39cade3932da8
FileGuid : 2d7a8327-7308-4464-a63e367e39c27690 << there’s our GUID
OldParentGuid : 53605485-4dd4-4b9a-bc5a022760515559
NewParentGuid : 53605485-4dd4-4b9a-bc5a022760515559
CxtionGuid : 92f5d906-cd45-4639-973021461e454c8b
Spare1Ull :
MD5CheckSum : MD5: b68d5ccf 21f8b5dd e7eb48f1 f45b01d9
RetryCount : 0
FirstTryTime : Wed Aug 22, 2007 11:55:21
EventTime : Wed Aug 22, 2007 11:55:18
FileNameLength : 34
FileName : ImportantFolder13 << definitely our folder that was deleted
Cxtion Name : <Jrnl Cxtion> <- <Jrnl Cxtion>\<Jrnl Cxtion>
Cxtion State : Joined
Then we can verify back in the IDTABLE.TXT what the time and source were:
Table Type: ID Table for FRSTEST|FRSTESTLINK (1)
FileGuid : 2d7a8327-7308-4464-a63e367e39c27690
FileID : 000d0000 00003e37
ParentGuid : 53605485-4dd4-4b9a-bc5a022760515559
ParentFileID : 00030000 000039bc
VersionNumber : 00000002
EventTime : Wed Aug 22, 2007 11:57:23 << there's the delete time
OriginatorGuid : 30409f5d-8493-41ad-a98ab03fc1b795e5 << here's the source of the delete
OriginatorVSN : 01c7e4c3 ea6dd495
CurrentFileUsn : 00000000 001beba8
FileCreateTime :
FileWriteTime :
FileSize : 00000000 00000000
FileObjID : 2d7a8327-7308-4464-a63e367e39c27690
FileName : Dc9
FileIsDir : 00000001
FileAttributes : 00000010 Flags [DIRECTORY ]
Flags : 00000001 Flags [DELETED ] << confirmed that it's been deleted
ReplEnabled : 00000001
TombStoneGC : Sun Oct 21, 2007 11:57:23
OutLogSeqNum : 00000000 00000000
Spare1Ull : 00000000 00000000
MD5CheckSum : MD5: b68d5ccf 21f8b5dd e7eb48f1 f45b01d9
RetryCount : 0
FirstTryTime :
===
Now we look back at the GUID2Name table we generated earlier, we can see that:
30409f5d-8493-41ad-a98ab03fc1b795e5 = 2003SRV16.
So we know that at Wed Aug 22, 2007 11:57:23 AM on server 2003SRV16, something or someone deleted all this data. Wasn’t that easy? :) If we had object access auditing enabled on that server at the time and the folder configured for auditing, we can even see who did it. More on this later…
DFSR Deletion Forensics – The Where and When
Since DFSR exposes nearly all its interfaces through WMI, we can use a powerful command-line utility called WMIC that can be used to return useful info from the databases. This way we don’t need to rely on add-on tools and debug logs and such. For my example below I am intentionally not using VBScript as I want everyone to understand exactly what it is we’re doing – but feel free to script it up, all the WMI classes are well documented on MSDN.
So here we go again:
- We have our three servers 2003SRV13, 2003SRV16, and 2003SRV17.
- All are in a Replication Group called ImportantData.
- They have a Replicated Folder called… wait for it… ReplicatedFolder.
- That folder contains various files and folders, including a folder called ImportantSubFolder. It contains some files, including one called critical.doc. Naturally, someone has deleted critical.doc… let’s figure out where and when.
First we open a CMD prompt as an admin and dump the Replicated Folder info like so:
C:\>wmic /namespace:\\root\microsoftdfs path DfsrReplicatedFolderInfo get ReplicatedFolderGuid,ReplicatedFolderName,ReplicationGroupName > rfinfo.txt
This returns the following into our rfinfo.txt file:
ReplicatedFolderGuid ReplicatedFolderName ReplicationGroupName
8722EF11-6466-4472-888F-11B8A57B68A4 replicatedfolder importantdata
Now we have enough info to confirm we're looking at the right data. Let's get the status of the deleted file by running this command and providing it the file name and the ReplicatedFolderGuid from above:
C:\>wmic /namespace:\\root\microsoftdfs path DfsrIdRecordInfo WHERE (filename= 'critical.doc' and replicatedfolderguid='8722EF11-6466-4472-888F-11B8A57B68A4' ) get filename,flags,updatetime,GVsn > file.txt
Our output file.txt contains:
FileName Flags GVsn UpdateTime
critical.doc 4 {BAA4E6D9-BF1A-4C83-ADF4-FDFD481AE2FC} -v113867 20070823233010.774625-000
We can see that it was deleted on Aug 23, 2007 at 23:30:10 (11:30 PM) GMT time. For us this means 7:30PM EDT.
The Flags value of 4 tells us it's been deleted; examine the chart below. An ordinary replicated file will have a Flags value of 5 (meaning 0x1 && 0x4 for Present and Replicated). If 4 the Present flag has been removed meaning the file is tombstoned, i.e. removed from the replica.
Value | Meaning |
PRESENT_FLAG0x1 | The resource is not a tombstone; it is available on the computer. |
NAME_CONFLICT_FLAG0x2 | The tombstone was generated because of a name conflict. This flag is meaningful only for tombstones. |
UID_VISIBLE_FLAG0x4 | The ID record has already been sent out to other partners; therefore, other partners are aware of this resource. |
JOURNAL_WRAP_FLAG0x10 | The volume has had a journal wrap and the resource has not been checked to determine if there is any change by the journal wrap recovery process. |
PENDING_TOMBSTONE_FLAG0x20 | The ID record is in the process of being tombstoned (or deleted.) |
The GVsn value is important in that the GUID inside those curly brackets will always contain the unique database GUID of the server where the file was last changed. Since deleting the file counts as a change, now we just need to figure out who owns that GUID. So let's use the DFSRDIAG command to find the culprit:
C:\>dfsrdiag guid2name /guid:BAA4E6D9-BF1A-4C83-ADF4-FDFD481AE2FC /rgname:importantdata
Which returns:
Object Type : DfsrVolumeInfo
Computer : 2003SRV16.fabrikam.com << The Server where the delete occurred
Volume Guid : 346CA491-54BA-11DB-91ED-806E6F6E6963
Volume Path : C:
Volume SN : 1826913329
DB Guid : BAA4E6D9-BF1A-4C83-ADF4-FDFD481AE2FC
Badda-bing! That BAA4E6D9-BF1A-4C83-ADF4-FDFD481AE2FC GUID matches our GVsn above. There's our guy. What is it with people deleting files and folders off this 2003SRV16 server? We should have a chat with their site admin...
The same exact steps will work for folders. Let's do it real fast this time and figure out where someone deleted the whole ‘importantsubfolder’:
C:\>wmic /namespace:\\root\microsoftdfs path DfsrReplicatedFolderInfo get ReplicatedFolderGuid,ReplicatedFolderName,ReplicationGroupName > rfinfo.txt
ReplicatedFolderGuid ReplicatedFolderName ReplicationGroupName
8722EF11-6466-4472-888F-11B8A57B68A4 replicatedfolder importantdata
C:\>wmic /namespace:\\root\microsoftdfs path DfsrIdRecordInfo WHERE (filename='importantsubfolder' and replicatedfolderguid= '8722EF11-6466-4472-888F-11B8A57B68A4' ) get filename,flags,updatetime,GVsn,clock > file.txt
FileName Flags GVsn UpdateTime
importantsubfolder 4 {97DA0CC3-DBB4-437F-BB6F-BE8A970FE318}-v31 20070823235332.792930-000
C:\>dfsrdiag guid2name /guid:97DA0CC3-DBB4-437F-BB6F-BE8A970FE318 /rgname:importantdata
Object Type : DfsrVolumeInfo
Computer : 2003SRV13.fabrikam.com
Volume Guid : 929F0871-54B9-11DB-B293-806E6F6E6963
Volume Path : C:
Volume SN : 1826913329
DB Guid : 97DA0CC3-DBB4-437F-BB6F-BE8A970FE318
Done! Looks a lot easier the second time around, doesn’t it?
Note that when a deletion is replicated in to a DFSR server, the file by default is moved to \DfsrPrivate\ConflictAndDeleted under the root of the replicated folder. If the delete was not replicated in, but instead was the result of a local deletion, the file is moved to the Windows Recycle Bin (unless you held down the SHIFT key while deleting, in which case the file is deleted for good). By default the quota on ConflictAndDeleted is 660 MB but that is configurable on the Advanced Tab in the replicated folder properties. In the same location un-checking the “Move deleted files to Conflict And Deleted folder” box will make it so deletions that are replicated in are actually deleted for good without being moved to ConflictAndDeleted.
The information about all the data residing in ConflictAndDeleted is contained in the \DfsrPrivate\ConflictAndDeletedManifest.xml file. When the quota is reached, files are purged from ConflictAndDeleted folder and the ConflictAndDeletedManifest.xml in the order that they were put there. This means you have a limited amount of time to catch a deletion and be able to restore it from ConflictAndDeleted.
There is a sample script for restoring data from ConflictAndDeleted. This is needed because the folder structure of deleted data is flattened and all data resides directory off the root of ConflictAndDeleted, and the filename is appended with the GVSN. The script reads the ConflictAndDeletedManifest.xml so it knows the original file names and folder structure. But you can also determine that using the DfsrConflictInfo WMI class. For example you can check for the presence of a file in a given server’s ConflictAndDeleted folder by running:
C:\>wmic /namespace:\\root\microsoftdfs /node:2003srv13 path DfsrConflictInfo where "filename like 'critical.doc% '" get * /format:textvaluelist
ConflictFileCount=1
ConflictPath=\\.\C:\replicatedfolder\importantsubfolder\critical.doc
ConflictSizeInBytes=881211
ConflictTime=20070823233010.000000-000
ConflictType=5
FileAttributes=32
FileName=critical.doc-{97DA0CC3-DBB4-437F-BB6F-BE8A970FE318}-v27
GVsn={97DA0CC3-DBB4-437F-BB6F-BE8A970FE318}-v27
MemberGuid=2C50672F-32A2-4D7D-AF44-88E1812F6E08
ReplicatedFolderGuid=8722EF11-6466-4472-888F-11B8A57B68A4
ReplicationGroupGuid=6143BD54-C9CC-42E1-A1FA-03BB34BF87F2
Uid={97DA0CC3-DBB4-437F-BB6F-BE8A970FE318}-v27
The trailing % wildcard is needed because the FileName property has the GVSN of the file appended to it. The ConflictPath property contains the original path and file name for the file before it was deleted.
Auditing – The Who
Now that we’ve covered when and where, the Windows Auditing subsystem can be used to tell you who via the Object Access Auditing setting. The important take away is that you need to have it in place and the audit descriptors set on your files and folders before you need them. Setting it up after the files have gone missing isn’t going to buy you anything. I can tell your eyes have started to glaze over on this post so I’m going to wrap it up here. J
To set up Object Access Auditing you can follow this checklist and set your critical replicated folders to audit EVERYONE for DELETE and process that with inheritance on down. We really don’t usually care about files being changed and certainly not added, but deletions drive end users nuts.
An important thing to understand about Auditing in Windows 2000 and 2003 is that it’s bound by some legacy limitations in the Event Log system (no longer true in Windows Server 2008 or Vista). Basically, you want to keep the total size of all your event logs at around 300MB total or they will become unstable. You’ll find that enabling Object Access Auditing is going to make your Security event logs wrap pretty often, so if it was only 256MB you wouldn’t have much time for forensics. You can get around this by using KB312571 to configure AutoBackupLogFiles and save off the logs when they wrap automagically. Then they can be backed up and deleted periodically with a scheduled task or whatever you like.
Auditing is not free – it costs CPU time, disk I/O, and can increase memory usage within LSASS. Please be sure to test this for your environment. Really.
For excellent tips and deep explanation of Windows Auditing, check out the Windows Security Logging and Other Esoterica MSDN blog by Eric Fitzgerald (who ran Auditing for many years as a Program Manager).
For everything Auditing, check out the Windows Server 2003 Auditing web portal
- Ned Pyle
Comments
Anonymous
September 05, 2007
A very technical but very informative article on finding out who deleted a file or folder in a FRS or DFS-R environment. Three things I took away from this article #1: Audit information is very important for tracking this kind of activity. It is poAnonymous
October 05, 2007
The comment has been removedAnonymous
June 13, 2008
Hello, Very good stuff! :) I had an issue where the Policies and Scripts folders are morphed as this: Policies. Policies_NTFRS_XXXXXX. Scripts. Scripts_NTFRS_XXXXX. These morphed folders appeared at the same time an administrator that did an authoritative restore of an OU. We wantsto have proofs. Is there a wayto know the "Where, When" those morphed folders appeared, and if possible who did it. Is there a Tag, in the idtable or elsewhere, that corresponds to a restoration of those morphed folders ? Thx for your input. YannAnonymous
June 13, 2008
Hi, The steps are pretty much identical as above except that you don't care about the FLAGS being set to deleted. There's nothing marking them as morphed bu the name, and you have that piece. An auth restore of an OU would not be able to cause this issue though - it would have no effect on FRS. But if someone was using GPMC incorrectly (KB929266) or if they set a D4 burflag as part of their steps without setting D2 downstream (Kb315457) it would be very possible that the issue happened at roughly the same time.Anonymous
June 15, 2008
The comment has been removedAnonymous
April 23, 2009
The comment has been removedAnonymous
April 24, 2009
Can you post that entry for me? I was reviewing source code to see, but FLAGS is used a lot to mean a lot of things.Anonymous
February 02, 2010
Hi Ned. We are seeing this flag also when querying DfsrIdRecordInfo. Can't seem to find any info on these flags other than what is in the table above. Very strange to see a 6 ! FileName Flags GVsn UpdateTime 110939.1 4 {7AF1E7E2-C4D6-4B16-BB54-A0D666D7B90D}-v519043 20100125035850.078084-000 110939.1 4 {7AF1E7E2-C4D6-4B16-BB54-A0D666D7B90D}-v519047 20100125040417.654113-000 110939.1 4 {7AF1E7E2-C4D6-4B16-BB54-A0D666D7B90D}-v581927 20100125093641.123998-000 110939.1 6 {7AF1E7E2-C4D6-4B16-BB54-A0D666D7B90D}-v605062 20100125124701.901408-000 110939.1 5 {7AF1E7E2-C4D6-4B16-BB54-A0D666D7B90D}-v606675 20100128005546.546834-000Anonymous
February 02, 2010
The comment has been removedAnonymous
February 04, 2014
This is a collection of the top Microsoft Support solutions to the most common issues experienced usingAnonymous
February 21, 2014
These are the top Microsoft Support solutions to the most common issues experienced using Microsoft WindowsAnonymous
February 24, 2014
We’ve gathered together the top Microsoft Support solutions for the most common issues experiencedAnonymous
May 09, 2014
These are the top Microsoft Support solutions for the most common issues experienced when using WindowsAnonymous
May 09, 2014
These are the top Microsoft Support solutions for the most common issues experienced when using WindowsAnonymous
May 09, 2014
This is a collection of the top Microsoft Support solutions for the most common issues experienced whenAnonymous
August 11, 2014
Top Microsoft Support solutions for the most common issues experienced when you use Windows Server 2008Anonymous
August 12, 2014
Top Microsoft Support solutions for the most common issues experienced when you use Windows Server 2008Anonymous
August 16, 2014
Top Microsoft Support solutions for the most common issues experienced when you use Windows Server 2008Anonymous
April 28, 2015
Top Microsoft Support solutions for the most common issues experienced when you use Windows Server 2008Anonymous
April 28, 2015
This is a collection of the top Microsoft Support solutions for the most common issues experienced whenAnonymous
May 15, 2015
This is a collection of the top Microsoft Support solutions for the most common issues experienced whenAnonymous
May 15, 2015
This is a collection of the top Microsoft Support solutions for the most common issues experienced when