Jaa


Database Corruption Part 5 :: Dealing with Log File Corruption

Hi Friends,

I am back again after a long time. Last month was the end of yet another financial year, and all of us were very busy with the processes involved with each year-end. Now that all those are completed, I could find some time for my next blog post on database corruption.

Here, I will be targeting issues related to corruption in the database log files. As we all know, the Transaction Log File(s) of a database (*.ldf files) are essential for maintaining transactional consistency in the databases.

In my earlier blog post on Shrinking the Transaction Log files in SQL Server, I have already outlined the usage of the Transaction Log File(s) during normal operation of the database. Again, when a database starts up, it has to go through the recovery process. The recovery process consists of 3 distinct phases – analysis, redo and undo.

During the analysis phase, a background thread reads the Transaction Log File(s) of the database, from the last marked CheckPoint till the end of the file, and analyzes the records in the Transaction Log File(s). If this phase completes successfully, the recovery process moves to the next phase called the redo phase, and later into the undo phase.

A failure in any of these phases can cause the recovery process to fail; there by causing the database to be marked suspect (a database is marked suspect when the database cannot be recovered). More often than not, in case of Log File corruption, the recovery process fails during the analysis phase. Any failure during recovery is recorded in the SQL Server ErrorLog file, and the information in this file can give deep in site into the problem on hand.

Note: Log File corruption is just one of the reasons why recovery might fail for a database. Recovery might also fail when the mdf or the ndf files are corrupt. In these cases, recovery mostly fails during the redo/undo phases.

Next steps when a Log File is found to be corrupt:: Step 1

The first step that you should consider when you find the Log File to be corrupt, is to analyze the Windows Application and System Event Logs for any hardware issues. If you can spot any hardware issue, including any disk issue, please engage your hardware vendor immediately. If a database has become corrupt once due to a hardware issue, it will mostly likely become corrupt again if the hardware issue is not resolved.

Next steps when a Log File is found to be corrupt:: Step 2

The obvious next step is to restore the database from the last good known backup. Restore the last Full Backup, the last Differential Backup taken after the applied Full Backup, and all Transaction log Backups taken after the applied Differential Backup, in order. This will ensure that the transactional consistency of the database is maintained and the database is recovered to a known point in time.

However, if restoring the database from the last good known backup is not an option, then the only other option is to try and rebuild the Log File(s). Since there is no way in which we can repair a corrupt Transaction Log File, our best shot is to rebuild the Log File(s).

Before you rebuild the Transaction Log File, please go through the article When should you rebuild the transaction log? by the SQL Server Storage Engine Team.

To rebuild the log files, follow the following steps:

  1. Detach the database using the sp_detach_db command. You might need to alter the database and change it to emergency mode before you can detach the database. For details on how to change the database state to Emergency, please see the following MSDN article: https://msdn.microsoft.com/en-us/library/ms174269(SQL.90).aspx

    Syntax:

    ALTER DATABASE database_name
    {
    <set_database_options>
    }
    [;]

    <set_database_options>::=
    SET
    {
    { <optionspec> [ ,...n ] [ WITH <termination> ] }
    }

    <optionspec>::=
    {
    <db_state_option>
    }

    <db_state_option> ::=
    { ONLINE | OFFLINE | EMERGENCY }

  2. Now, delete or rename the corrupt Transaction Log File(s). You can, now, use the Create Database with the FOR ATTACH_REBUILD_LOG option to attach the database while rebuilding the Log File(s):

    Syntax:

    CREATE DATABASE database_name
    ON <filespec> [ ,...n ]
    FOR { ATTACH_REBUILD_LOG }
    [;]

    According to the Books OnLine (https://msdn.microsoft.com/en-us/library/ms176061(SQL.90).aspx):

    FOR ATTACH_REBUILD_LOG
    Specifies that the database is created by attaching an existing set of operating system files. This option is limited to read/write databases. If one or more transaction log files are missing, the log file is rebuilt. There must be a <filespec> entry specifying the primary file.

    Note:
    If the log files are available, the Database Engine will use those files instead of rebuilding the log files.

    FOR ATTACH_REBUILD_LOG requires the following:
    => A clean shutdown of the database.
    => All data files (MDF and NDF) must be available.

    Important:
    This operation breaks the log backup chain. We recommend that a full database backup be performed after the operation is completed.

Next steps when a Log File is found to be corrupt:: Step 3

If Step 2 fails, there is one more option that you can give a try. 

  1. You need to alter the database and change it to emergency mode. For details on how to change the database state to Emergency, please see the following MSDN article: https://msdn.microsoft.com/en-us/library/ms174269(SQL.90).aspx

    Syntax:

    ALTER DATABASE database_name
    {
    <set_database_options>
    }
    [;]

    <set_database_options>::=
    SET
    {
    { <optionspec> [ ,...n ] [ WITH <termination> ] }
    }

    <optionspec>::=
    {
    <db_state_option>
    }

    <db_state_option> ::=
    { ONLINE | OFFLINE | EMERGENCY }

  2. Now, execute DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS.

    Syntax:

     DBCC CHECKDB 
    [
        [  (  database_name | database_id |  0
            [  ,  NOINDEX 
            |  ,  { REPAIR_ALLOW_DATA_LOSS | REPAIR_FAST | REPAIR_REBUILD } ]
             )  ]
        [ WITH 
            {
                [ ALL_ERRORMSGS ]
                [  ,  EXTENDED_LOGICAL_CHECKS ] 
                [  ,  NO_INFOMSGS ]
                [  ,  TABLOCK ]
                [  ,  ESTIMATEONLY ]
                [  ,  { PHYSICAL_ONLY | DATA_PURITY } ]
            }
        ]
    ]
    

When a database has been set to emergency mode, DBCC CHECKDB can perform some special repairs on the database if the REPAIR_ALLOW_DATA_LOSS option is specified. This is a fully documented and supported way of repairing your transaction log. For details, please refer to the following:

From Books Online: DBCC CHECKDB (Transact-SQL)

From the SQL Server Storage Engine Blog Link: Ta da! Emergency mode repair

Disclaimer: All information provided here is my personal opinion and is neither verified nor approved by Microsoft before it is published. All information, and code samples, if any, is provided "AS IS" with no warranties and confers no rights.

Comments

  • Anonymous
    February 01, 2010
    Can you explain how you got these steps to work on SQL 2005? I removed the DBCC REBUILD_LOG command when I wrote the emergency mode repair functionality during SQL 2005 development. Also, doing an ATTACH_REBUILD_LOG only works if the database was cleanly shut down - not usually the case in corruption situations. Thanks

  • Anonymous
    February 20, 2010
    Hi Paul, You are correct about the command. It was my mistake and I have modified that part to reflect correct content. Also, regarding ATTACH_REBUILD_LOG, you are correct that it works only if the database was cleanly shutdown. I had mentioned that already in the post, and this time, I have it in BOLD, so that it is more clearly visible. Please let me know if you have any further suggestions.

  • Anonymous
    February 16, 2014
    Informative,please also include that repir may cause data loss ,you did it by giving Paul's link but if you add it in bold there it would be immediate caution for user

  • Anonymous
    May 08, 2015
    Awesome Article - Thank you !!

  • Anonymous
    February 08, 2016
    Hi We have a serieus database issue: The log scan number (1746899:8901:0) passed to log scan in database 'DesignLogistics_GreenLog' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup. We cannot make any backups and the disk is running out of space. We don't have an usefull backup because the problem is lasting for weeks now. Please can someone tell us what are our options are? Regards Ronald