Udostępnij za pośrednictwem


Database Corruption Part 1 :: Introduction to Database Corruption in SQL Server

Hello Everybody,

I think you will all agree with me that database corruption is one of the nightmares in the life of a DBA. We store precious data in databases and each day the dataset grows bigger and bigger. A database corruption issue not only puts the data at risk, it also threatens to take a toll on businesses and revenues.

Throughout my career in Microsoft, I have seen many corruption issues; and while working with the DBAs for troubleshooting such corruption issues, I have felt the apprehension and anxiety within their minds. Hence, thought of putting down some posts related to database corruption. In my next few posts, I will be discussing a few common corruption scenarios and how to repair the databases in each of these scenarios or how to salvage maximum data off corrupted databases. However, in this post, I will mainly target the most common causes of database corruption and how to prevent database corruption in the first place. Additionally, I would be discussing various best practices to follow, so that we have a fallback plan just in case data cannot be recovered after a corruption.

Before I begin my discussion, I would like all my readers to understand that although Microsoft has written the SQL Server product, Microsoft does not guarantee that databases will never become corrupt or in case of corruption, 100% data can be recovered. However, I would like to assure you that throughout the code of SQL Server, Microsoft has taken utmost care to safeguard your data as much as possible.

Moreover, I would also like my readers to also understand that Microsoft is not a Data Recovery Company. Meaning, Microsoft Product Support Services (PSS) does not guarantee that if you call in with a database corruption issue, PSS would recover all your data. All support that PSS provides in corruption cases is on "best efforts basis", meaning that PSS will provide commercially reasonable efforts to recover your database or data off your corrupted database using documented and undocumented commands and procedures. However, 100% data recovery is not guaranteed.

Before I go any further, I would request my readers to go through the following articles:
=> Overview of the Microsoft third-party storage software solutions support policy
=> Microsoft SQL Server Database Engine Input/Output Requirements
=> Microsoft does not certify that third-party products will work with Microsoft SQL Server

The most common cause of database corruption (more than 95% of all corruption cases) that we in PSS encounter turn out to be caused by a platform issue, which is a layer below the SQL Server. The most common individual cause is a 3rd party driver or firmware bug. The next most common cause is an actual hardware fault and are typically either of Disk, Controller, CPU, or Memory Module(s). As of all Database Management Systems, SQL Server heavily relies on the Disk Subsystem for storing and retrieving data. Any issues in the Disk Subsystem, the Disk Controllers or the Communication Channels or even the Disk Drivers can cause databases to become corrupt.

If you encounter any corruption issue, please engage your hardware vendor(s) immediately. The hardware vendor(s) should primarily check the driver(s), firmware(s), and BIOS versions to ensure that they are running the most current and recommended versions. Additionally, they should run available hardware diagnostic tools to identify potential hardware issues. Each vendor typically has their own in-house utilities designed to detect obvious hardware problems. While OEM diagnostic utilities may be able to detect an obvious hardware failure, they are far from conclusive. In our experience it is common for these diagnostic utilities to produce a clean bill of health for a system that is later found to have a hardware problem.

The Windows Event Logs should also be constantly monitored for any Disk / Hardware related errors. It is not necessary for a corruption to be introduced or detected as soon as a Hardware Error is reported. A corruption case can be detected even months after a Hardware Error is reported. Consider a scenario when a disk suddenly became unresponsive when SQL Server was writing into it. This will obviously cause the data to become corrupt; however, this corruption will be detected only when we try to access the pages that had become corrupt; and this can be months after the corruption was actually introduced. Similarly, a bit flip while writing data to the disk subsystem, caused by a disk driver, can cause corruption, and this might be actually detected at a much later point in time.

Kernel Drivers and softwares like Antivirus, that have the ability to take Kernel Level Handle on files, are also known to cause database corruption issues. In case you have Antivirus Installed on the system, please ensure that you exclude the SQL Server Database Files from Antivirus scans. Please refer to Guidelines for choosing antivirus software to run on the computers that are running SQL Server.

Although very rare, we have seen a few people keep database files in compressed volumes and folders so as to conserve disk space. Having database files in compressed volumes and folders are not supported and can cause database corruption as well. Please avoid storing database files in compressed volumes and folders. In case you are concerned about disk space and would like to compress database files, please make use of data compression, introduced in SQL Server 2008. Additionally, please do not store your backups in compressed volumes and folders. Please utilize Backup Compression introduced in SQL Server 2008.

One last addition to the list of probable causes of database corruption includes Bugs with Microsoft products itself. PSS is always on the lookout and will always be on the lookout for evidence of Microsoft Bugs that can cause databases to become corrupt; and all such identified Bugs are fixed as soon as possible, so that our larger customer base does not get affected by those Bugs. However, it is always recommended to be on the latest builds of SQL Server and Windows to avoid database corruption caused by Microsoft Bugs. Specifically speaking, for each of the products listed below, please upgrade to the mentioned builds:
=> SQL Server 2000: SP4 and QFE build 2245
=> SQL Server 2005: SP2 or SP3 (SP3 recommended)
=> Windows XP: SP2
=> Windows 2000: SP4 + QFE KB 838647
=> Windows 2003: SP1 + QFE KB 940467
=> MSXML with SQL 2000: If MSXML is being used, remove /3GB in BOOT.INI or disable full MSXML garbage collection (see KB 321621)

If you would like to contact Microsoft PSS for a Root Cause analysis for your corruption issues, please document the answers to the following questions:
=> When did corruption first occur?
=> Has the corruption reappeared?
=> If it has, what is the interval in which it has reappeared.
=>  Do you see any pattern in the reoccurrences?
=> Are multiple databases affected?
=> Are multiple tables within the same database affected?

Additionally, please share the following data:
=> The Windows Eventlogs exported in TEXT format.
=> The SQL Server Errorlogs.
=> The MSINFO32 output exported in TEXT format.
=> Complete hardware specifications. This should include SAN details in addition to the server’s hardware.
=> DBCC CHECKDB output showing the corruption (if available).
=> Get the output from the msdb..suspect_pages table.
For more details on suspect_pages please refer https://msdn.microsoft.com/en-us/library/ms174425(SQL.90).aspx.

Also, please be prepared to run the SQL Server MPSReports. This utility would be shared by Microsoft PSS once you are in contact with Microsoft PSS.

Finally, before I conclude on this post, I would like to request all my readers to ensure that you take regular backups of the SQL Server Database onto Tapes or other removable media. Since we are relying on hardware for storing data, and hardware is prone to failure, we must have a backup of all the data, so that we can revert back in case of hardware failures. Please take regular backups of all your databases, including the system databases. Backups might include Full Database backups, Differential Database Backups, File and FileGroup Backups and Transaction Log Backups. Microsoft has provided various types of backups in SQL Server, please make use of all of these. I would request all my readers to finalize and implement a backup strategy as soon as possible, if you have not already done so. This would help you to minimize data loss in case of corruption or failure. Moreover, after taking a backup, please ensure that you actually restore the backup on a test system to ensure the validity of the backup and that it can be used in case of database failure.

Additionally, there are various High Availability Solutions introduced by Microsoft, including, but not limited to, Log Shipping, Database Mirroring and Replication; please make use of these High Availability Solutions, to ensure that you have a duplicate database ready in case you encounter a database corruption issue or a database failure issue.

Hope this post is useful. Any comments will be highly appreciated.

Disclaimer: All information provided here is my personal opinion and is neither verified nor approved by Microsoft before it is published. All information, and code samples, if any, is provided "AS IS" with no warranties and confers no rights.

Comments

  • Anonymous
    April 08, 2009
    PingBack from http://asp-net-hosting.simplynetdev.com/introduction-to-database-corruption-in-sql-server/

  • Anonymous
    February 25, 2010
    Hello Suhas, Please excuse me for taking a few minutes of your valuable time. I am Learner and work in Tech Support for certain SQL Based Software (we use MS SQL to externally save / record the data collected from the devices in a Network), and My Question is like this : I noticed a Single Data Corruption issue in my MS SQL database. That means, the concerned device recorded the correct vales for the said time stamp in it's internal memory. However, the MS SQL Data Base Values of identical data for the same time interval was wrong. The recorded values were extremely larger than what the device recorded. We use MSDE2000 and can you provide me some insight to identify the problem ?

  • Anonymous
    February 26, 2010
    Hi Senaka, I need more information to help you. Suggest you to connect with me offline using the EMAIL link on top. Also, please send me a more detailed description of the issue you are facing; I will try to help you as much as I can. Suhas

  • Anonymous
    May 19, 2014
    SQL database repair tool is amazing software which has been developed to fix all the damaged database of SQL server. To offer you the best possible SQL recovery result, this tool is equipped with powerful QFSCI algorithms which helps in easy recovery of every bit of SQL database. The software repair tables, views, triggers, stored procedures, damaged unique keys, primary keys and foreign keys efficiently. Read More: pcrecoverytools(dot)com/sql-database-recovery(dot)html

  • Anonymous
    June 26, 2014
    Try 3rd party tool and recover your all database file and  fix the errors. For Free download click sqldatabserecovery.weebly.com sqldatabaserecoverytool.blogspot.com

  • Anonymous
    June 29, 2014
    Now available free SQL Server repair tool with trial version and give you effective tool service. Visit here and read more: http://www.sqlrepair.org

  • Anonymous
    July 07, 2014
    SQL database repair software can be used to recover data from corrupt SQL server database when backup is not available or too-old. Download free demo version from below link: download.cnet.com/.../3000-2065_4-10844740.html

  • Anonymous
    July 28, 2014
    I found third party recovery tool on Google www.solutions.sqlrepairtool.net/sql-database-recovery , this software can help retrieve SQL data.

  • Anonymous
    October 20, 2014
    can I contact you to get more infor in this?

  • Anonymous
    October 29, 2014
    You can also try another excellent tool, which is well known SQL recovery software that recovers and repairs your corrupted or inaccessible SQL database without hassle. For more details about this tool please see this site:-  www. pc recovery utility.com/sql- database-recovery

  • Anonymous
    December 23, 2015
    My database is more than 250GB and I have a great experience with http://sqlbackupandftp.com/ Scheduled backups work perfectly!

  • Anonymous
    May 09, 2016
    What about corrupt, unreadable DB backup files. Apparently, the copy from the DB server to the archive server corrupted the file. I have tried SysTools SQL Recovery and it works successfully in mu case. see here: https://www.systoolsgroup.com/sql-recovery.html

    • Anonymous
      June 27, 2016
      Well, we aren't talking about file corruption here. Database backup files, like any other file on the filesystem, can get corrupt due to many reasons including problems with the disk sub-system. This blog is about database corruption instead.