MSExchangeRepl 2147 / MSExchangeRepl 2104 / MSExchangeRepl 2127 occurring on Windows 2008 or Windows 2008 R2 with Exchange 2007 Cluster Continuous Replication (CCR)

When Exchange 2007 CCR is installed on Windows 2008 or Windows 2008 R2 the following error may be noted in the application log of the passive node:

Log Name: Application
Source: MSExchangeRepl
Event ID: 2104
Task Category: Service
Level: Error
Keywords: Classic
User: N/A
Computer: MACHINE
Description:
Log file action LogCopy failed for storage group EXCLUST01\SG2. Reason:
CreateFile(\\Server\StorageGroupGUID$\LogFile.log) = 2

If the CCR cluster is not utilizing continuous replication host names the following event series may also be noted:

Event ID : 2147
Raw Event ID : 2147
Source : MSExchangeRepl
Type : Error
Machine : SERVER
Message : There was a problem with 'ActiveNode', which is an alternate name for 'ActiveNode'. The list of aliases is now 'ActiveNode', and the alias 'was' removed from the list. The specific problem is 'CreateFile(\\ActiveNode\StorageGroupGuid$\LogFile.log) = 2'.

ID: 2127
Level: Information
Provider: MSExchangeRepl
Machine: SERVER
Message: The system has detected a change in the available replication networks. The system is now using network 'ActiveNode' instead of network 'ActiveNode' for log copying from node ActiveNode.

In this situation, if the solution is closely monitored, you may note that replication temporarily reports a failed state and then automatically returns to healthy. This happens because the replication service pauses replication when it detects the error condition, attempts to find other replication paths, and then automatically retries the same copy operation.

If the CCR cluster is utilizing continuous replication host names the following event series may also be noted:

Event ID : 2147
Raw Event ID : 2147
Source : MSExchangeRepl
Type : Error
Machine : SERVER
Message : There was a problem with ‘ReplicationHostName’, which is an alternate name for 'ActiveNode'. The list of aliases is now 'ActiveNode', and the alias 'was' removed from the list. The specific problem is 'CreateFile(\\ReplicationHostName\StorageGroupGUID$\LogFile.log) = 2'.

ID: 2127
Level: Information
Provider: MSExchangeRepl
Machine: SERVER
Message: The system has detected a change in the available replication networks. The system is now using network 'ActiveNode' instead of network ‘ReplicationHostName’ for log copying from node ActiveNode.

Error 2 is ERROR_FILE_NOT_FOUND

In this situation the error is detected on the replication host name. The replication service temporarily pauses replication while other network paths are enumerated. If other continuous replication host names are in use, the replication service selects an alternate replication host name and automatically resumes log copying. If the only valid path is the “public” path, the replication service begins copying log files over the “public” network. Eventually the same error occurs on the public network, which forces network re-enumeration and causes replication to switch back to the replication network automatically. If the solution is closely monitored, the replication status may show as failed during this switch, but it returns to healthy on its own.

In almost all instances these errors are benign to the operation of the Exchange server.
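When reviewing many of these events it can help to pull the path and the numeric code out of the `CreateFile(...) = N` fragment. The following is a minimal sketch, not part of Exchange: the function name `parse_createfile_error` and the lookup table are hypothetical, and the table holds only a handful of well-known Win32 system error codes.

```python
import re
from typing import Optional, Tuple

# Small, deliberately incomplete subset of Win32 system error codes.
WIN32_ERRORS = {
    2: "ERROR_FILE_NOT_FOUND",
    3: "ERROR_PATH_NOT_FOUND",
    5: "ERROR_ACCESS_DENIED",
    32: "ERROR_SHARING_VIOLATION",
    53: "ERROR_BAD_NETPATH",
}

# Matches the "CreateFile(<path>) = <code>" fragment seen in events 2104 and 2147.
_CREATEFILE = re.compile(r"CreateFile\((?P<path>.+?)\)\s*=\s*(?P<code>\d+)")


def parse_createfile_error(message: str) -> Optional[Tuple[str, int, str]]:
    """Return (path, code, symbolic name) or None if the fragment is absent."""
    match = _CREATEFILE.search(message)
    if match is None:
        return None
    code = int(match.group("code"))
    return match.group("path"), code, WIN32_ERRORS.get(code, "UNKNOWN")


if __name__ == "__main__":
    sample = (r"Log file action LogCopy failed for storage group EXCLUST01\SG2. "
              r"Reason: CreateFile(\\Server\StorageGroupGUID$\LogFile.log) = 2")
    print(parse_createfile_error(sample))
```

For codes that are not in the table, running `net helpmsg <code>` on a Windows host prints the system message text for that code.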

The replication service is extremely aggressive in its attempts to copy log files. The replication service is always aware of the next log file in the series that requires copying to the passive node. As part of normal processes the replication service may query multiple times for the presence of this file and make copy attempts. These attempts may result in the replication service querying for a log file that is not fully available. Under Windows 2003 this was not necessarily an issue. Windows 2008 introduces a component into SMBv2 that may cause this to be a problem.

SMBv2 introduces status caching into the LanManWorkstation service. When an application requests information from a file share, the workstation service caches the response from the server hosting the share. Subsequent requests for the same information are returned from cache rather than by re-contacting the server hosting the share. Eventually this cache expires (in our case it has expired by the time replication is failed and resumed, or a switch between replication host names occurs). The replication service has received feedback that the log file in question should now be available for copy, attempts to copy it, and receives an older cached status that the file is not found (even though the file does exist on the source at the time the attempt is made). The replication service in turn treats this as an error condition and takes action.

From a Windows 2008 / Windows 2008 R2 perspective this is by design.
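The effect of a cached "not found" status can be illustrated with a toy model. This is only a sketch of the general idea and not how the SMB redirector is actually implemented; the classes `FakeShare` and `CachingClient`, the `simulate` helper, and the 5-second lifetime are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, Set, Tuple


class FakeShare:
    """Stands in for the log file share on the active node."""

    def __init__(self) -> None:
        self.files: Set[str] = set()

    def exists(self, name: str) -> bool:
        return name in self.files


class CachingClient:
    """Toy client that remembers 'not found' answers for a fixed lifetime."""

    def __init__(self, share: FakeShare, not_found_lifetime: float,
                 clock: Callable[[], float]) -> None:
        self.share = share
        self.lifetime = not_found_lifetime
        self.clock = clock
        self._not_found: Dict[str, float] = {}  # file name -> cache expiry time

    def exists(self, name: str) -> bool:
        now = self.clock()
        expiry = self._not_found.get(name)
        if expiry is not None and now < expiry:
            return False  # answered from cache; the share is not contacted
        found = self.share.exists(name)
        if not found and self.lifetime > 0:
            self._not_found[name] = now + self.lifetime
        else:
            self._not_found.pop(name, None)
        return found


def simulate(lifetime: float) -> Tuple[bool, bool, bool]:
    """Query before the file exists, just after it appears, and after expiry."""
    now = [0.0]
    share = FakeShare()
    client = CachingClient(share, lifetime, lambda: now[0])
    early = client.exists("LogFile.log")   # file not yet on the share
    share.files.add("LogFile.log")         # file becomes available
    now[0] = 1.0
    retry = client.exists("LogFile.log")   # may be served from a stale cache
    now[0] = lifetime + 1.0
    later = client.exists("LogFile.log")   # any cache entry has expired
    return early, retry, later


if __name__ == "__main__":
    print("lifetime 5:", simulate(5.0))
    print("lifetime 0:", simulate(0.0))
```

With a non-zero lifetime the retry still reports the file as missing even though it is present on the share; with a lifetime of zero nothing is cached and the retry sees the file immediately.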

To correct these errors on Exchange 2007 running on Windows 2008 or Windows 2008 R2, set the following registry values to zero (0) under the key below and reboot the nodes:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Lanmanworkstation\Parameters

FileInfoCacheLifetime [DWORD]

FileNotFoundCacheLifetime [DWORD]

DirectoryCacheLifetime [DWORD]

If the values are not present, create them as 32-bit DWORD values. The recommended value is 0 (the same in hex and decimal).
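For applying the same change to several nodes, the values above can be written into a `.reg` file for import. The sketch below only generates the file text; the helper names `build_reg_file` and `apply_with_winreg` are hypothetical, and the second function is an untested alternative that assumes a Windows host, administrative rights, and Python's standard `winreg` module. The reboot described above is still required either way.

```python
from typing import Iterable

KEY_PATH = (r"HKEY_LOCAL_MACHINE\System\CurrentControlSet"
            r"\Services\Lanmanworkstation\Parameters")
VALUE_NAMES = (
    "FileInfoCacheLifetime",
    "FileNotFoundCacheLifetime",
    "DirectoryCacheLifetime",
)


def build_reg_file(names: Iterable[str] = VALUE_NAMES, data: int = 0) -> str:
    """Return .reg file text that sets each named DWORD value to `data`."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY_PATH}]"]
    lines += [f'"{name}"=dword:{data:08x}' for name in names]
    return "\n".join(lines) + "\n"


def apply_with_winreg(names: Iterable[str] = VALUE_NAMES, data: int = 0) -> None:
    """Windows-only alternative: write the values directly to the registry."""
    import winreg  # imported lazily so the module still loads elsewhere

    subkey = KEY_PATH.split("\\", 1)[1]  # drop the HKEY_LOCAL_MACHINE prefix
    with winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, subkey, 0,
                            winreg.KEY_SET_VALUE) as key:
        for name in names:
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, data)


if __name__ == "__main__":
    print(build_reg_file())
```

Review the generated text before importing it, and export the existing `Parameters` key first so the change can be reverted.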

More information on these values can be found here: https://technet.microsoft.com/en-us/library/ff686200(WS.10).aspx (Note that the registry path in that article is missing the Services key; the correct path is the one given in this post.)

Comments

  • Anonymous
    January 01, 2003
    @Davis: Can you be more specific? The only error code to interpret in this blog post is error 2. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Stef: If the error is the same as referenced here it will not be fixed in an Exchange rollup.  Editing the keys is the only way to disable the functionality in Windows that causes this condition. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Anonymous: Regarding your question about whether this can apply to SCR: it can, as SCR is monitoring for similar notifications. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Harry: A 64 bit entry is a QWORD.  We need to use 32 bit or DWORD. Tim

  • Anonymous
    October 13, 2010
    Hello Tim, Thanks for the info. Can you please let me know whether we should create FileInfoCacheLifetime and the other values as DWORD (32-bit) or 64-bit? Thanks in advance!

  • Anonymous
    November 30, 2010
    Is this error likely to occur with SCR replication as well?

  • Anonymous
    December 01, 2010
    Hello TIMMCMIC, You have mentioned that Error 2 is "ERROR_FILE_NOT_FOUND" Do you have any list that has the other error codes. can you tell me what are all the error codes & what do they correspond to. Daivs

  • Anonymous
    March 08, 2011
    Those errors seem to be solved by exchange 2007 SP2 rollup 2 (support.microsoft.com/.../en-us)

  • Anonymous
    June 16, 2011
    Are these errors known to cause any issues with backup and log file truncation? Log files are not truncating and we only see these errors once backups are finished.

  • Anonymous
    July 13, 2011
    Getting 2026 events along with 2147. After adding the above registry key 2147 events stopped. What is this event and how to resolve it.  Exchange 2007 SP2 RU4. The Exchange writer status goes to failed state as well. Time:     13-07-2011 15:34:42 ID:       2026 Level:    Error Source: MSExchangeRepl Machine:   Message:  The Microsoft Exchange Replication Service VSS writer (instance 5c2e6cec-f200-4714-ad20-37d095e09473) failed with error code C7FF07D7 when preparing for snapshot.

  • Anonymous
    July 30, 2012
    Tim, this saved my life today. Thanks.