SQL Server Troubleshooting: DBCC CHECKDB Reports Corruption. What's Next?
As SQL database administrator, you would have scheduled DBCC CHECKDB to execute as part of your database maintenance routine. It performs consistency checks against databases to look for corruption issues. This article will discuss in detail what you should do when running DBCC CHECKDB reports corruption in a SQL Server database.
DBCC CHECKDB Reports Corruption – What Should Be Your Next Step?
Before discussing how you should react to DBCC CHECKDB reporting database corruption, FIRST, understand what you should avoid to make things worse.
What Not to Do When DBCC CHECKDB Reports Corruption?
Don’t Restart or Reboot the Server
Restarting your SQL Server or rebooting Windows Server cannot fix database corruption. Performing any of these actions will instead put the database in offline mode and a SUSPECT mode. This will delay crash recovery on the corrupt database.
Don’t Detach and Re-attach the Database
Database recovery process will become much more complex if you’ll try to detach or re-attach a corrupt database.
Don’t Upgrade SQL Server
Upgrading your SQL Server to a new version cannot resolve the database corruption issue, instead you may have to face new hurdles of working with an updated version.
Don’t Run ‘REPAIR_ALLOW_DATA_LOSS’ Command Instantly
When DBCC CHECKDB reports corruption, use the REPAIR_ALLOW_DATA_LOSS as the last resort, as it can result in data loss. The REPAIR command should only be used when all other methods to troubleshoot database corruption fails.
Now that you know what to avoid when you detect SQL database corruption, let’s discuss the things you need to do when corruption occurs.
What to Do When DBCC CHECKDB Reports Corruption?
The instant DBCC CHECKDB reports consistency errors in a production database, stop using it to avoid further damage or data loss. Next, perform these two actions:
- Root-Cause Analysis of Database Corruption in SQL Server
- SQL Database Recovery
Let’s discuss these actions in detail.
Root Cause Analysis of Database Corruption in SQL Server
Before trying to fix database corruption, you must first uncover the cause of corruption to prevent it from reoccurring in the future. Essentially, performing root cause analysis of database corruption can help reduce or eliminate chances of database corruption in SQL Server.
Causes of SQL Database Corruption
I/O Subsystem Failure
A SQL database can turn corrupt due to several reasons, but primarily, it happens due to a problem with Disk Input-Output (I/O) Subsystems like disk controllers, disk drivers, firmware, etc. Here’s a good read by Paul Randal on how to determine if an IO subsystem is causing corruption.
Bugs in SQL Server
Software (SQL Server) bugs is another common cause of corruption in a SQL database. When you encounter a corruption error message, type the message text on Google search to see if you can find any Microsoft Knowledge Base (KB) article. Reading the KB article will help you understand the root cause behind corruption and how it can be fixed.
Other factors that may cause SQL database corruption are as follows:
- Ransomware attack
- Server crash due to abrupt system shutdown or disk space problems
- Upgrading to SQL Server to earlier versions
- Accidentally updating or deleting data inaccurately
- Run diagnostics to check for a problem with the existing hardware. Look for diagnostic tools provided by the hardware manufacturers.
- Check SQL Server error log or Windows event log for information that explains the errors reported by DBCC CHECKDB.
- Contact your vendors to check for the latest driver or firmware updates
- Check for CUMULATIVE updates or Service Pack available from Microsoft to fix SQL Server bugs. For instance, Microsoft released updates to fix online index rebuild causing index corruption. Also, see KB4019701.
- Try using a utility such as SQLIOSim on the drive on which databases – that have reported consistency errors – are stored. It is an independent tool that helps check integrity of the disk I/O subsystem. Starting with SQL Server 2008, you do not need to separately download the SQLIOSim tool, as it comes installed with SQL Server 2008.
How to Perform Root Cause Analysis of SQL Database Corruption?
If database corruption is caused due to hardware or software bugs, correct the hardware or software issues and see if it resolves corruption. If not, you will need to restore the database or repair it using the solutions in the next section.
SQL Database Recovery
Restore from Backup
The first and most recommended solution to recover data from a corrupted SQL database is to restore it from the most recent backup. You can recover complete data by restoring the updated Full database backup, followed by an updated Differential backup.
If the backup is not up to date or is corrupted, try the next solution.
Run DBCC CHECKDB with Repair Option
Check the repair options that running the DBCC CHECKDB command recommends to fix the issue. For instance, below is a message that recommends using “repair_allow_data_loss as the minimum level of repair” to deal with corruption:
CHECKDB found 0 allocation errors and 15 consistency errors in database 'dbname'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (dbname).
To use DBCC CHECKDB with the ‘REPAIR_ALLOW_DATA_LOSS’ option, you need to put the database in single-user mode. To do so, execute the following query:
ALTER DATABASE DB_Name
SET single_user WITH ROLLBACK IMMEDIATE;
GO
DBCC CHECKDB (DB_Name, repair_allow_data_loss);
GO
Use a Specialized SQL Recovery Tool
If the above methods don’t work for you, try using a SQL repair tool for MS SQL that is purpose-built to perform database recovery in a few simple steps. The tool repairs corrupt MDF/NDF database files and recovers all the objects while preserving the data integrity. It's beyond the scope of this article to provide a detailed description of 3rd party tools. But it's fairly easy to find them on the internet.
Preventive Measures to Avoid SQL Database Corruption
Following are a few factors you should consider to prevent corruption from occurring again:
- Run DBCC CHECKDB regularly
- Take database backup frequently
- Turn on Page Verification
- Set up alerts for SQL errors like 823, 824, and 825 to identify hard and soft I/O issues
- Consider using High Availability (HA) or Disaster Recovery (DR) solutions to get back an inaccessible database to an operational state with minimal or no downtime
Conclusion
This article discussed how you should react to database corruption reported by DBCCC CHECKDB. First, you must avoid doing things like restarting the server, detaching/re-attaching the database, upgrading SQL Server, and running repair instantly to make the situation worse.
Next, understand the underlying cause of the problem. You can perform root cause analysis of SQL database corruption by contacting a hardware manufacturer to fix hardware problems, check the SQL Server error log or Windows event log, and follow other recommendations discussed in this article.
Once you've identified the possible reason behind corruption, implement the solution based on what caused the database corruption. Ideally, before trying to restore or repair the database, fix any hardware or software errors. If that doesn't work, restore the database from the most recent backup. If the backup is not available or damaged, run the DBCC CHECKDB recommended repair option. If it fails, consider using a professional SQL recovery tool.
Once you've repaired the database, make sure to follow the preventive measures discussed in this post to avoid corruption from happening again.