Compartir a través de


Revealing the secrets - Truncating a Transaction Log File

In my earlier blog post on Shrinking the Transaction Log files in SQL Server, I had mentioned that a VLF can be truncated only when all transactions, that have at least one command written into the specified VLF have either been committed or rolled back.

While this is true, I understand that it might create a bit of confusion around how the truncation actually happens behind the covers. In his comment, David writes:
Also, I have read in the documentation (Transaction Log Physical Architecture) that truncation of the log marks any VLFs as inactive "whose [log] records all appear in front of the ... MinLSN". Shouldn't that continue on and say "or after the last-written log record"? Or does "in front of" include the tail-end VLFs? Please excuse me, I am having a hard time "wrapping my mind around" the logical log.

This post is essentially to help David, and many more of my readers understand the logic and the under-the-covers steps that SQL Server does while truncating a Transaction Log File. Hence, I will try and demonstrate, with examples, a Log File Truncation.

For demonstration purposes, let us create a new database, and make sure that the database is in FULL recovery model. Why Full Recovery Model – because in Full Recovery Model, truncation will happen only after we take a log backup, and hence, will be completely under our control. Lets first create a blank new database using the script below:

 CREATE DATABASE TLogTruncationDemo
ON PRIMARY (
    NAME = 'TLogTruncationDemo_Data',
    FILENAME = 'F:\TLogTruncationDemo\TLogTruncationDemo_Data.mdf',
    SIZE = 10 MB,
    MAXSIZE = 100 MB )
LOG ON (
    NAME = 'TLogTruncationDemo_Log',
    FILENAME = 'F:\TLogTruncationDemo\TLogTruncationDemo_Log.ldf',
    SIZE = 2 MB,
    /* Set the MAXSIZE = SIZE so that the T-Log does not grow. */
    MAXSIZE = 2 MB )
GO

Now, before we proceed any further, let us change the Recovery Model of the database to Full, and then lets take a Full Backup, so that the Recovery Model change takes effect.

 ALTER DATABASE TLogTruncationDemo SET RECOVERY FULL
GO
BACKUP DATABASE TLogTruncationDemo 
    TO DISK = 'F:\TLogTruncationDemo\TLogTruncationDemo_Backup_01.bak'
GO

Lets now examine the Transaction Log File. We will execute the following command:

 

 DBCC LOGINFO(TLogTruncationDemo)
GO

 

We get the following result:

 

TLog01

Now, execute the following command:

 DBCC LOG(TLogTruncationDemo)
GO

We get the following results:

 

TLog02

These two commands show us that the Log Sequence Number that we are currently on is 00000014:00000051:0003. Moreover, since VLF 1 is the only VLF currently in use, all the LSNs are contained in VLF1.

 

Now, lets create a new table and insert a few rows in the table. We will also issue a manual CHECKPOINT, and then examine the contents of the Log File:

 

 CREATE TABLE TblTLogDEMO (
    C1 INT IDENTITY(1, 1) NOT NULL,
    C2 INT NOT NULL,
    C3 VARCHAR(10) NOT NULL )
GO
INSERT INTO TblTLogDEMO (C2, C3) VALUES (1, 'A')
INSERT INTO TblTLogDEMO (C2, C3) VALUES (2, 'A')
INSERT INTO TblTLogDEMO (C2, C3) VALUES (3, 'A')
GO
CHECKPOINT
GO

At this point, DBCC LOGINFO still shows us that we are working in VLF 1. However, DBCC LOG shows us many more entries after the last recorded LSN (00000014:00000051:0003). Also, notice that the last two entries show the following:

 

 Current LSN             Operation       Context   Transaction ID
----------------------- --------------- --------- --------------
00000014:0000006e:009e  LOP_BEGIN_CKPT  LCX_NULL  0000:00000000
00000014:000000af:0001  LOP_END_CKPT    LCX_NULL  0000:00000000

00000014:0000006e:009e is the LSN when the CHECKPOINT started and it ended at LSN 00000014:000000af:0001. Behind the covers, at LSN 00000014:000000af:0001, i.e., at LOP_END_CKPT, SQL Server also records another LSN, which is called the Minimum Recovery LSN (MinLSN).

The Minimum Recovery LSN (MinLSN) is the LSN of the first log record that must be present for a successful database-wide rollback. The MinLSN is the minimum of the:

  • LSN of the start of the checkpoint.
  • LSN of the start of the oldest active transaction.
  • LSN of the start of the oldest replication transaction that has not yet been delivered to the distribution database.

At this point, since we do not have replication configured, and we do not have any active transaction either, the MinLSN will be equal to the LSN of the start of the CHECKPOINT process, i.e., 00000014:0000006e:009e.

EXPERIMENT 1 :: COMMIT, CHECKPOINT, BACKUP LOG

We will perform the following activities:

 BEGIN TRAN Demo_Tran_01
GO
INSERT INTO TblTLogDEMO (C2, C3) VALUES (1, 'A')
GO 3050
COMMIT TRAN Demo_Tran_01
GO
CHECKPOINT
GO

 

Now, we will find, on execution of the DBCC LOGINFO command, that the LSNs have completely utilized VLF 1 and have moved into VLF 2. Again, since we have committed the transaction before the CHECKPOINT, the MinLSN corresponding to LOP_END_CKPT will be the LSN of LOP_BEGIN_CKPT, which in my case is 00000015:000000eb:0007.

 

 Current LSN             Operation       Context   Transaction ID
----------------------- --------------- --------- --------------
00000015:000000eb:0007  LOP_BEGIN_CKPT  LCX_NULL  0000:00000000
00000015:000000ee:0001  LOP_END_CKPT    LCX_NULL  0000:00000000

At this point, we can clearly understand that at this stage, all the LSNs in VLF 1 are less than the MinLSN, while VLF 2 has at least 2 LSN that are either equal to or greater than the MinLSN.

When a Transaction Log File is truncated, all entries having “Current LSN” less than the MinLSN are deleted. Later, any VLF, that becomes completely empty is marked as Inactive. In other words, VLFs that consist only of LSNs less than the MinLSN are marked as Inactive and Truncated.

In our case, VLF 1 consists only of LSNs less than the MinLSN, and hence only VLF 1 should be truncated. To verify this, lets take a Log Backup of the database using the command below:

 BACKUP LOG TLogTruncationDemo
    TO DISK = 'F:\TLogTruncationDemo\TLogTruncationDemo_LogBackup_01.trn'
GO

Now, when we execute the DBCC LOGINFO command, we see that VLF 1 has been marked as Inactive:

 

TLog03

EXPERIMENT 2 :: CHECKPOINT, BACKUP LOG, COMMIT

In the second part of the exercise, let us begin a transaction and then continue inserting records, so that the commands completely fill up VLF 2 and moves on into VLF 3. Later, we will fire a manual CHECKPOINT. We use the script below:

 BEGIN TRAN Demo_Tran_02
GO
INSERT INTO TblTLogDEMO (C2, C3) VALUES (1, 'A')
GO 3000

CHECKPOINT
GO

Now, when we fire the CHECKPOINT, there is an active transaction, that started at LSN 00000015:000000ef:0002 (LOP_BEGIN_XACT), while the LSN corresponding to the last LOP_BEGIN_CKPT is 00000016:00000100:00d7. Since, 00000015:000000ef:0002 (LOP_BEGIN_XACT) is less than 00000016:00000100:00d7 (LOP_BEGIN_CKPT), the MinLSN will be equal to 00000015:000000ef:0002, which is the LSN of the start of the active transaction.

 Current LSN             Operation        Context   Transaction ID
----------------------- ---------------- --------- --------------
00000015:000000ef:0002  LOP_BEGIN_XACT   LCX_NULL  0000:00000239
00000016:00000100:00d7  LOP_BEGIN_CKPT   LCX_NULL  0000:00000000
00000016:00000123:0001  LOP_XACT_CKPT    LCX_NULL  0000:00000000
00000016:00000123:0002  LOP_END_CKPT     LCX_NULL  0000:00000000

Also, 00000015:000000ef:0002 (LOP_BEGIN_XACT) is recorded in VLF 2 and the LSN of the CHECKPOINT end (LOP_END_CKPT), 00000016:00000123:0002, is recorded in VLF 3. So, now, both VLF 2 and VLF 3 have LSN entries greater than the MinLSN; and hence a Log Backup should not be able to truncate the VLFs. Lets verify:

 BACKUP LOG TLogTruncationDemo
    TO DISK = 'F:\TLogTruncationDemo\TLogTruncationDemo_LogBackup_02.trn'
GO

TLog04

Now, commit the transaction and then manually fire a CHECKPOINT. Take a Log Backup, and check the status of the log once again:

 COMMIT TRAN Demo_Tran_02
GO
CHECKPOINT
GO


BACKUP LOG TLogTruncationDemo
    TO DISK = 'F:\TLogTruncationDemo\TLogTruncationDemo_LogBackup_03.trn'
GO
DBCC LOGINFO(TLogTruncationDemo)
GO

TLog05

EXPERIMENT 3 :: CHECKPOINT, COMMIT, BACKUP LOG

In this experiment, lets begin a new transaction and insert some rows, so that the entries start from VLF 3 and continue into VLF 4. The commands and the result is as below:

TLog06

At this point, the MinLSN is the again the begin LSN of the start of the transaction Demo_Tran_03. This is because, it is the minimum of the starting LSN of the active transaction and the starting LSN of the last CHEKCPOINT process. Now, if we commit the transaction, and then take a Log Backup, what do we expect?

According to our theory, since both VLF 3 & 4 contain entries that are greater than the MinLSN, both should remain active, isn’t it? Lets check…

TLog07 
What’s going on? Actually behind the covers, whenever you take a backup (Full, Differential or Log Backup), SQL Server fires a CHECKPOINT again. This CHECKPOINT, obviously, comes after the Commit of the Transaction, and hence, the MinLSN for this CHECKPOINT is equal to the starting LSN of the CHECKPOINT. Now, things fall in place, right?

EXPERIMENT 4 :: WRAPPING AROUND THE LOG FILE

Now, let us execute the following queries:

 BEGIN TRAN Demo_Tran_04
GO
INSERT INTO TblTLogDEMO (C2, C3) VALUES (1, 'A')
GO 5000
COMMIT TRAN Demo_Tran_04
GO
DBCC LOGINFO(TLogTruncationDemo)
GO
BACKUP LOG TLogTruncationDemo
    TO DISK = 'F:\TLogTruncationDemo\TLogTruncationDemo_LogBackup_05.trn'
GO
DBCC LOGINFO(TLogTruncationDemo)
GO

TLog08

We can see here that the Log File has wrapped around. How do we know? We know that the transactions have wrapped around the T-Log File by examining the FSeqNo column in the DBCC LOGINFO output. Till the end of Experiment 3, the FSeqNo column for VLF 1 had a value of 20, and now, it has been changed to 24, i.e., FSeqNo for VLF 4 + 1. This shows us that this VLF is being used after VLF 4 was full.

CONCLUSION:

Hence, when in the article Transaction Log Physical Architecture, we say “Log truncation frees any virtual logs whose records all appear in front of the minimum recovery log sequence number (MinLSN)”, what we essentially mean is:

While truncating the Transaction Log, we delete all records with “Current LSN” less than the MinLSN. At this point, any VLF that becomes completely empty, i.e., VLFs that consists only of LSN entries less than the MinLSN, is marked as free (inactive).

Disclaimer: All information provided here is my personal opinion and is neither verified nor approved by Microsoft before it is published. All information, and code samples, if any, is provided "AS IS" with no warranties and confers no rights.

Comments

  • Anonymous
    July 20, 2009
    Suhas, Thank you for taking the time to explain this here and in the last post.  In the past I was never sure why my efforts to shrink a log file on a full recovery model database did not work.  Sometimes it just seemed that I needed to repeat a log backup a few times first.  Although I would like to step through this second post for myself, I believe that I understand what is going on now. I am also in a better position to question the many recommendations available on the web, such as “switch to the simple recovery model, shrink, switch back to the full model and then take a database backup”. I have a follow-up question on best practice for shrinking.  With a full recovery model database, I can see that a log backup and then SHRINKFILE is a minimum requirement.  But should I do a CHECKPOINT beforehand to reduce the size of the active log?  Also is a log backup with TRUNCATE_ONLY acceptable in replacement of a real log backup? In any case I hope other DBAs discover these posts on shrinking the transaction log.  Thank you! David

  • Anonymous
    July 20, 2009
    Hi David, Thanks for the comments. To start off, I would like to mention that SHRINK-ing the T-Log is not a recommended practise, specially on production systems, unless the T-Log has grown suddenly after a heavy operation. Growing a database file is always a costly operation and should be avoided as much as possible. To answer your questions, manual CHECKPOINT is not necessary. SQL Server should automate the CHECKPOINT process based on the recovery interval mentioned in sp_configure (unless you are hitting the bug as mentioned in KB909369 [http://support.microsoft.com/kb/909369]). Moreover, we recommend that you never use NO_LOG or TRUNCATE_ONLY to manually truncate the transaction log, because this breaks the log chain. Until the next full or differential database backup, the database is not protected from media failure. Use manual log truncation in only very special circumstances, and create backups of the data immediately. Also, this option has been removed in SQL Server 2008. Avoid using it in new development work, and plan to modify applications that currently use it.
    Ref: http://msdn.microsoft.com/en-us/library/ms186865(SQL.90).aspx Thanks,
    Suhas

  • Anonymous
    July 21, 2009
    Suhas, Yes, shrinking the log should be a rare operation.  A DBA should either 1) find out what the normal size should be and leave it that size, 2) find out why the log file grew too large to begin with and fix the real problem, 3) buy a larger disk, or add another disk.  But sometimes -- you are pushed into doing something you would rather not do. TRUNCATE_ONLY was another one of those recommendations I had seen on the web and thought I had better ask about.  But as you pointed out, it is clearly discussed in the documentation. Also, I appreciate your engineering/experimental approach to answering questions, such as this post. Thanks, David

  • Anonymous
    September 21, 2009
    Suhas, Thank you for your posts on TLOG Truncation.  They have been very helpful. I have a situation where a DB's TLOG is in the SIMPLE recovery model, yet will not truncate.  I ran DBCC OpenTran which showed that there were no Active transactions. Since the TLOG was not truncating automatically, I ran the CHECKPOINT statement as well as BACKUP LOG WITH TRUNCATE_ONLY, but the log would not truncate.  Replication is not set up on this system.  Also at this point, the TLOG had filled up the drive it was located on. Q: Is the DBCC OpenTran command the correct indicator of whether it is possible for the VLF's to be truncated?  I believe that I met all of the criteria you described in your previous post for a SIMPLE log to be truncated. Thank you for any help in advance.

  • Anonymous
    September 22, 2009
    The comment has been removed

  • Anonymous
    October 15, 2009
    Suhas, We have not had our situation return (TLOG not truncating) since your post, so I have not been able to see the reason. The 'log_reuse_wait_desc' column will be very helpful for this situation.   Thank you again for your reply and the insightful article. Phil

  • Anonymous
    August 12, 2010
    Excellent information!! You made this so easy to understand and implement. Especially thanks for clearing the popular DBA myth of 'Truncating T-log file before shrinking' Keep the good work Cheers!! Varun

  • Anonymous
    August 15, 2010
    Hi Suhas, If we use dbcc log(dbname, 3), we can find a column named "Minimum LSN"  in LOP_END_CKPT row. The value of "Minimum LSN" is always equal to  LOP_BEGIN_CKPT's Current LSN. Is "Minimum LSN" just same as MinLSN in this artical?If so, why does "Minimum LSN" not equal to the oldest transaction's LSN? Thanks!

  • Anonymous
    August 16, 2010
    Hi Misterliwei The Minimum Recovery LSN (MinLSN) that I have discussed about in the article is not the same as "Minimum LSN" as you have indicated. The "Minimum LSN" indicated by you will always be equal to the "Current LSN" of LOP_BEGIN_CKPT. The Minimum Recovery LSN (MinLSN) is the LSN of the first log record that must be present for a successful database-wide rollback. If log records above LOP_BEGIN_CKPT are not available, recovery of this database will fail and the database will move into SUSPECT mode. For your information, if you notice the output of the command, if you have open transactions when CHECKPOINT gets invoked, an additinal row is written into the Transaction Log (apart from LOP_BEGIN_CKPT and LOP_END_CKPT). The Operation is called "LOP_XACT_CKPT". If you look at the description of this row, it will list the Transaction IDs of all active transactions when CHECKPOINT was invoked. In a sample test I did, the Description said: "XdesIDs: 0000:00397329,0000:0039732a"; this means that there are 2 open transactions with Transaction IDs 0000:00397329 and 0000:0039732a. Obviously, 0000:00397329 is less than 0000:0039732a and hence 0000:00397329 is the Oldest Active Transaction. During recovery, SQL Server will look for a record with "Operation" = "LOP_BEGIN_XACT"  and "Transaction ID" = "0000:00397329" and will start recovery from this LSN. The "Current LSN" of this record is what I have referred to MinLSN or Minimum Recovery LSN. Hope this helps!! Suhas

  • Anonymous
    August 17, 2010
    I see,thanks!