DFSR no longer replicates files after restoring a virtualized server's snapshot

This article discusses an issue where the Distributed File System Replication (DFSR) service fails to replicate files after restoring a virtualized server's snapshot.

Original KB number:   2517913

Symptoms

Using any virtualization product, you create a guest snapshot of a server replicating files with DFSR. You later restore that snapshot, returning the server to an earlier point in time.

You notice the following behaviors on the restored server:

  • No files replicate inbound or outbound for several minutes, then DFSR events 5014 and 5004 are logged indicating replication is resuming.

  • Any files created, deleted, or modified after the snapshot was taken but prior to restoration replicate inbound.

  • Any files created, deleted, or modified after the restoration don't replicate outbound.

  • Any changes to files on partner servers will replicate inbound, regardless of up-to-dateness, overwriting all changes made locally and potentially deleting newer data.

  • After a period of time, the DFSR databases will write errors and warnings in the event log and rebuild automatically. After the rebuild completes successfully, DFSR will again log internal errors and rebuild the database. This will continue infinitely.

    Log Name: DFS Replication
    Source: DFSR
    Date: <DateTime>
    Event ID: 2212
    Task Category: None
    Level: Warning
    Keywords: Classic
    User: N/A
    Computer: 2008r2-06-f.contoso.com
    Description:
    The DFS Replication service has detected an unexpected shutdown on volume C:. This can occur if the service terminated abnormally (due to a power loss, for example) or an error occurred on the volume. The service has automatically initiated a recovery process. The service will rebuild the database if it determines it cannot reliably recover. No user action is required.

    Additional Information:
    Volume: C:
    GUID: <GUID>
    Log Name: DFS Replication
    Source: DFSR
    Date: <DateTime>
    Event ID: 2104
    Task Category: None
    Level: Error
    Keywords: Classic
    User: N/A
    Computer: 2008r2-06-f.contoso.com
    Description:
    The DFS Replication service failed to recover from an internal database error on volume C:. Replication has been stopped for all replicated folders on this volume.

    Additional Information:
    Error: 9214 (Internal database error (-1605))
    Volume: 92404560-E6C8-11DF-BCA2-806E6F6E6963
    Database: C:\System Volume Information\DFSR
    Log Name: DFS Replication
    Source: DFSR
    Date: <DateTime>
    Event ID: 2004
    Task Category: None
    Level: Error
    Keywords: Classic
    User: N/A
    Computer: 2008r2-06-f.contoso.com
    Description:
    The DFS Replication service stopped replication on volume C:. This failure can occur because the disk is full, the disk is failing, or a quota limit has been reached. This can also occur if the DFS Replication service encountered errors while attempting to stage files for a replicated folder on this volume.

    Additional Information:
    Error: 9014 (Database failure)
    Volume: 92404560-E6C8-11DF-BCA2-806E6F6E6963
    Log Name: DFS Replication
    Source: DFSR
    Date: <DateTime>
    Event ID: 2106
    Task Category: None
    Level: Information
    Keywords: Classic
    User: N/A Computer: 2008r2-06-f.contoso.com
    Description:
    The DFS Replication service successfully recovered from an internal database error on volume C:. Replication has resumed on replicated folders on this volume.

    Additional Information:
    Volume: 92404560-E6C8-11DF-BCA2-806E6F6E6963
    Database: C:\System Volume Information\DFSR

Any servers that replicate with the restored computer will repeatedly show in their %systemroot%\debug\dfsr*.log files:

20110302 11:05:26.068 1192 INCO 7487 InConnection::RestartSession Retrying establish contentset session. connId:{1B7F0404-6B47-4575-97CE-B107D9DEE1FE} csId:{E027985A-B48E-4B96-9F65-23D3EAADE871} csName:snaprf
20110302 11:05:26.068 1192 INCO 1042 [WARN] SessionTask::Step (Ignored) Failed, should have already been processed. Error:
+ [Error:9027(0x2343) InConnection::EstablishSession inconnection.cpp:6172 1192 C A failure was reported by the remote partner]
+ [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4200 1192 C A failure was reported by the remote partner]
+ [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 1192 C A failure was reported by the remote partner*]
+ [Error:9028(0x2344) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 1192 C The content set was not found]
20110302 11:07:26.080 1192 DOWN 4186 [ERROR] DownstreamTransport::EstablishSession Failed on connId:{1B7F0404-6B47-4575-97CE-B107D9DEE1FE} csId:{E027985A-B48E-4B96-9F65-23D3EAADE871} rgName:snapshotrg Error:
+ [Error:9027(0x2343) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 1192 C A failure was reported by the remote partner]
+ [Error:9028(0x2344) DownstreamTransport::EstablishSession downstreamtransport.cpp:4179 1192 C The content set was not found]

Cause

Snapshots aren't supported by the DFSR database or any other Windows multi-master databases. This lack of snapshot support includes all virtualization vendors and products. DFSR doesn't implement USN rollback quarantine protection like Active Directory Domain Services.

Under no circumstances should you create or restore snapshots of computers running DFSR on read-write members in a production environment.

Snapshot restore is only supported for read-only members as their version vector isn't tracked on partners and a USN rollback can't happen.

Resolution

To resolve this issue, contact Microsoft Support. The resolution involves special database recovery steps that can be used to fix the affected server without impacting other computers.

Recreating the replication group or replicated folder will not fix the issue on the restored server and shouldn't be used as a troubleshooting step.

More information

For more information around snapshots and USN rollback protection, review: