Exchange 2010: Log Truncation and Checkpoint At Log Creation in a Database Availability Group

In previous versions of Exchange, when a backup was completed, almost all log files prior to the current log file were truncated from the system.  Administrators monitoring the directory would originally see many logs, and post backup note that only a few logs remained.  In Exchange 2010 Service Pack 1 and later administrators note that multiple log files remain on the disk post backup or the appearance that no log files have truncated at all.  In many cases this leads to a belief that logs are actually not truncating successfully or that there is an issue with backups.

 

Why do we see logs remaining on disk for longer?  Exchange 2010 SP1 and newer introduces a change in the behavior of log truncation.  The changes were taken to ensure that replicated copies of databases within a database availability group always had the appropriate log files on the source server to complete an incremental resynchronization. 

 

The change to log truncation is the tracking of Checkpoint At Log Creation.  Remember that in a database availability group we can expect the checkpoint to be approximately 100 logs (or slightly more) off the current log file – this is known as checkpoint depth.  As Exchange creates new log files we stamp into the header of the new log files what log file the checkpoint was pointing at when the current log was created.  For example, let us say that log file 0xA679 (42617) was just created as the current ENN.log.  We can expect that the checkpoint at log creation value stamped within the header of this log file would be approximately 0xA16 (42517).  You can see the checkpoint at log creation value by using eseutil /ml <logfilename> to dump the header of a log file.

 

[PS] P:\DAG\DAG-DB0\DAG-DB0-Logs>eseutil /ml .\E020000A67E.log

Extensible Storage Engine Utilities for Microsoft(R) Exchange Server
Version 14.02
Copyright (C) Microsoft Corporation. All Rights Reserved.

Initiating FILE DUMP mode...

      Base name: E02
Log file: .\E020000A67E.log
lGeneration: 42622 (0xA67E)
Checkpoint: (0xA679,8,0)
creation time: 03/11/2012 06:00:48
prev gen time: 03/11/2012 04:01:17
Format LGVersion: (7.3704.16.2)
Engine LGVersion: (7.3704.16.2)
Signature: Create time:05/02/2010 18:04:08 Rand:399094376 Computer:
Env SystemPath: d:\DAG\DAG-DB0\DAG-DB0-Logs\
Env LogFilePath: d:\DAG\DAG-DB0\DAG-DB0-Logs\
Env Log Sec size: 512 (matches)
Env (CircLog,Session,Opentbl,VerPage,Cursors,LogBufs,LogFile,Buffers)
( off, 1027, 51350, 16384, 51350, 2048, 2048, 29487)
Using Reserved Log File: false
Circular Logging Flag (current file): off
Circular Logging Flag (past files): off
Checkpoint at log creation time: (0xA679,8,0)
1 d:\DAG\DAG-DB0\DAG-DB0-Database\DAG-DB0.edb
dbtime: 18078306 (0-18078306)
objidLast: 2957
Signature: Create time:05/02/2010 18:04:08 Rand:399127765 Computer:
MaxDbSize: 0 pages
Last Attach: (0xA348,9,86)
Last Consistent: (0xA346,9,B5)

      Last Lgpos: (0xa67e,252,0)

Number of database page references: 770

Integrity check passed for log file: .\E020000A67E.log

Operation completed successfully in 0.265 seconds.

 

In the previous example the checkpoint at log creation is 0xA679.

 

Within a DAG all servers that contain a replicated copy of a database report the maximum log file that is eligible for truncation.  These values are reported to the active node which subsequently calculates the maximum log file for truncation.  In simplest terms the following process occurs:

 

  • Passive copy on Node-2 reports OK to truncate log 0xA679 (42617).
  • Passive copy on Node-3 reports OK to truncate log file 0xA678 (42616)
  • Passive copy on Node-4 reports ok to truncate log 0xA679 (42617).
  • The active node determines that the best log file eligible for truncation based on the passive copies is 0xA678 (42616).  [This is essentially the minimum of all reported OK logs to truncate.]
  • The active node then looks at the checkpoint at log creation of 0xA678 (42616) and determines that value is 0xA614 (42516).  In this example that would be 100 logs off the best log reported for truncation of the passive copies.
  • The active node sets the truncation point to be log 0xA614 (42516).
  • Therefore after a successful backup logs prior to 0x614 (42516). would truncate.

 

This essentially means that 100 additional logs that would have previously truncated prior to this change do not truncate.

 

Taking into account checkpoint at log creation administrators can better understand how log files are truncated and why log files remain on disk after a backup that might have in prior versions been truncated.

 

============================

Update 5/16/2012

Corrected hex conversions in example.

============================

Comments

  • Anonymous
    January 01, 2003
    srs

  • Anonymous
    January 01, 2003
    @Andreas: Thanks - problem between keyboard in chair. I fixed it up and updated it online. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Jeff:
    Yes

    TiMMCMIC

  • Anonymous
    January 01, 2003
    Hello Michael, First of all, thank you for such detailed and interesting article! I have a question. Is there any way to manipulate with Checkpoint At Log Creation time? Is it possible to influence on its value? Some of our customers have problems after backup, because there copies sporadically become FailedAndSuspended with "Log file 1 missing" error. Since full backup is intended to truncate old log files, there should be such ancient "log file 1", because it was truncated long time ago. The behavior repeats sporadically, not always, on different databases. MS Partner Support does not what to do with it, and the backup software vendors also give up finding a solution for the problem. Fortunately, it is possible to resume the FailedAndSuspended copy manually in EMC. But we need to find a solution to get rid of the problem and routine to manually resume the copy every time it fails after a backup.  

  • Anonymous
    January 01, 2003
    Hey Tim, can we expect the same behavior with Exchange 2013?

  • Anonymous
    January 01, 2003
    If the backup interrupted ,After a successful snapshot, It truncating the logs, If we set backupSucceeded as false. What may be the problem ?.

  • Anonymous
    January 01, 2003
    @MB...
    Yes - this sounds about right. When I first encountered this issue was in a customers test environment where they were generating a handful of log files but saw no significant truncation.

    TIMMCMIC

  • Anonymous
    January 01, 2003
    @Sunil...

    NP

    TIMMCMIC

  • Anonymous
    January 01, 2003
    @Michael: To my knowledge there is no difference between the versions documented here and SP3.  Also - as far as I know - there is no correlation between the number of items that the index is backlogged and how we truncate log files.   There should be no expectation that the backup solution has any impact on how logs are truncated, as the information store / replication service is what's responsible for determining log file truncation. TIMMCMIC

  • Anonymous
    January 01, 2003
    @Anonymous...I'm not sure what you are asking. If a backup is interrupted for any reason, log file truncation is not allowed to occur.TIMMCMIC

  • Anonymous
    January 01, 2003
    d

  • Anonymous
    May 15, 2012
    Dec 42516 is Hex A614 and not A15 - at least using windows 7 calculator :)

  • Anonymous
    October 24, 2012
    We can expect that the checkpoint at log creation value stamped within the header of this log file would be approximately 0xA16 (42517). 42517 - 0xA615 in Hex.

  • Anonymous
    August 05, 2013
    With Exchange Exchange 2010 SP3 we have seen the amount of Logfiles that stay after successfull back on A DAG around 200+. Prior to SP3 this may have been 100+. This is also confusing and seeme to be undocumented on MS side which clearly makes a lot of trouble on the market. There are hundres of threads and blogs who describe this and tend to go in direction of search indexer/Catalog files blocking. What Software you finally use Windows Backup, Netapp Snap Manager for Exchange or any Exchange Aware Backup seems to have no influence on the amount of logfiles the DAG keeps even after backup. Make sure you have enough Test data in your Test mailbox (Attachments).

  • Anonymous
    October 17, 2013
    Hello, Tim . Do you know if CACL and GTP still on Exchange 2013 ?

  • Anonymous
    October 22, 2013
    Yes. TIMMCMIC

  • Anonymous
    December 16, 2013
    Pingback from Exchange Server 2010 DAG??????????????????????????????????????????????????????????????????????????? | Exchange Server Blog

  • Anonymous
    March 18, 2014
    The comment has been removed

  • Anonymous
    April 21, 2014
    thanks...

  • Anonymous
    May 12, 2014
    hi, for someone new to Exchange 2010/DAG, how can we check and prevent too much db size growth? my log disk size is acceptable but not the disk where db resides.

  • Anonymous
    September 09, 2014
    @Valery D,

    I am having a similar situation to what you explained above about Sporadic FailedAndSuspended with "Log file 1 missing" error. Can you please tell me if you found a solution to this problem?

    Thank......