Example corrupt database to play with and some backup/restore things to try
I've been asked several times over the last few weeks for an example corrupt database to play with, and for testing logic built around DBCC CHECKDB.
The attached WinZip file contains a backup of a simple 2005 database called 'broken' (I can do a 2000 one too if there's enough demand). It has a simple table called 'brokentable' (c1 int, c2 varchar(7000)) with one row in it. The table has a single data page with page ID (1:143) that I've corrupted so the page checksum is corrupt.
This means you can try all sorts of cool things. Below I've listed a few things you can try out to see what would happen on your database if a checksum failed.
Restore
I didn't do anything special when backing up this database so restoring works just fine, even though it contains a corrupt page.
RESTORE DATABASE broken FROM DISK='c:\broken.bck'
WITH
MOVE 'broken' TO 'c:\broken.mdf',
MOVE
'broken_log' TO 'c:\broken_log.ldf';
GO
Processed 160 pages for database 'broken', file 'broken' on file 1.
Processed 2 pages for database 'broken', file 'broken_log' on file 1.
RESTORE DATABASE successfully processed 162 pages in 0.314 seconds (4.203 MB/sec).
DBCC CHECKDB
CHECKDB throws us some nice errors:
DBCC
CHECKDB ('broken') WITH NO_INFOMSGS, ALL_ERRORMSGS;
GO
Msg 8928, Level 16, State 1, Line 1
Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data): Page (1:143) could not be processed. See other errors for details.
Msg 8939, Level 16, State 98, Line 1
Table error: Object ID 2073058421, index ID 0, partition ID 72057594038321152, alloc unit ID 72057594042318848 (type In-row data), page (1:143). Test (IS_OFF (BUF_IOERR, pBUF->bstat)) failed. Values are 12716041 and -4.
CHECKDB found 0 allocation errors and 2 consistency errors in table 'brokentable' (object ID 2073058421).
CHECKDB found 0 allocation errors and 2 consistency errors in database 'broken'.
repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (broken).
Query errors
Any query that touches that page is going to fail with an 824 error:
SELECT
* FROM broken..brokentable;
GO
Msg 824, Level 24, State 2, Line 1
SQL Server detected a logical consistency-based I/O error: incorrect checksum (expected: 0x7232c940; actual: 0x720e4940). It occurred during a read of page (1:143) in database ID 8 at offset 0x0000000011e000 in file 'c:\broken.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Backup with CHECKSUM
If you have page checksums turned on, you should always use the WITH CHECKSUM option on backups to validate the page checksums as they are read:
BACKUP
DATABASE broken TO DISK='c:\broken2.bck'
WITH
CHECKSUM;
GO
Msg 3043, Level 16, State 1, Line 1
BACKUP 'broken' detected an error on page (1:143) in file 'c:\broken.mdf'.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.
But we can force it to backup. If this is the only copy of the database we have, and we're being forced to run repair, for instance, then we want to make sure we have a backup to restore from. Even a backup that contains a corrupt database is better than no database at all. In this case, we need to use the CONTINUE_AFTER_ERROR option.
BACKUP
DATABASE broken TO DISK='c:\broken2.bck'
WITH
CHECKSUM, CONTINUE_AFTER_ERROR;
GO
Processed 160 pages for database 'broken', file 'broken' on file 1.
Processed 1 pages for database 'broken', file 'broken_log' on file 1.
BACKUP WITH CONTINUE_AFTER_ERROR successfully generated a backup of the damaged database. Refer to the SQL Server error log for information about the errors that were encountered.
BACKUP DATABASE successfully processed 161 pages in 2.025 seconds (0.651 MB/sec).
RESTORE VERIFYONLY
Now that we have a backup that has checksum information in it, let's see how we can check whether the backup is valid:
RESTORE
VERIFYONLY FROM DISK='c:\broken2.bck';
GO
The backup set was written with damaged data by a BACKUP WITH CONTINUE_AFTER_ERROR.
Isn't that cool? It tells us that the backup was already corrupt when it was written. Ok - let's ask it to specifically check the checksums in the backup:
RESTORE
VERIFYONLY FROM DISK='c:\broken2.bck'
WITH
CHECKSUM;
GO
The backup set was written with damaged data by a BACKUP WITH CONTINUE_AFTER_ERROR.
Same thing. What about if we try to check the checksums on the initial backup?
RESTORE
VERIFYONLY FROM DISK='c:\broken.bck'
WITH
CHECKSUM;
GO
Msg 3187, Level 16, State 1, Line 1
RESTORE WITH CHECKSUM cannot be specified because the backup set does not contain checksum information.
Msg 3013, Level 16, State 1, Line 1
VERIFY DATABASE is terminating abnormally.
RESTORE
How about we try to overwrite the existing 'broken' database with the one from the second backup that has checksum information in it?
RESTORE
DATABASE broken FROM DISK='c:\broken2.bck'
WITH
REPLACE;
GO
Msg 3183, Level 16, State 1, Line 1
RESTORE detected an error on page (1:143) in database "broken" as read from the backup set.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
It won't let us because the backup contains corrupt data. But we can force it if we need to with CONTINUE_AFTER_ERROR again.
RESTORE
DATABASE broken FROM DISK='c:\broken2.bck'
WITH
REPLACE, CONTINUE_AFTER_ERROR;
GO
Processed 160 pages for database 'broken', file 'broken' on file 1.
Processed 1 pages for database 'broken', file 'broken_log' on file 1.
The backup set was written with damaged data by a BACKUP WITH CONTINUE_AFTER_ERROR.
RESTORE WITH CONTINUE_AFTER_ERROR was successful but some damage was encountered. Inconsistencies in the database are possible.
RESTORE DATABASE successfully processed 161 pages in 0.392 seconds (3.364 MB/sec).
Isn't that cool? It works BUT it tells us that the backup set contained corrupt data and that the database was restored but could have corrupt data in.
Summary
Have a play about with this database to familiarize yourself with the kind of responses you'll get from the various tools when a corruption exists, and how to work around it if need be. Let me know if you want to see any particular kinds of corruptions explored, or want a database with something specific corrupted in.
Ok - this turned into a longer post than I originally planned :-)
(PS - about 20 people have offered to fill in my VLDB maintenance survey so far - would be great to have some more. The DVD I'm giving as a reward has a bunch of great hands-on labs, including one that does cool backup/restore stuff to work around disasters. See the previous few posts for details. Thanks.)
Comments
Anonymous
April 17, 2007
Hoy en cosas interesantes: Como te sientes cuando te hacen MVP :-), otro ridiculo documento sobre linux,Anonymous
April 17, 2007
Great post, thanks. If you could post a SQL 2000 corrupt database that would be great.Anonymous
April 18, 2007
Yes, a SQL 2000 database would be very helpful [it could be used for either SQL 2000 or 2005].Anonymous
April 20, 2007
As promised in my earlier post of an example corrupt 2005 database, here's one I've just created forAnonymous
April 30, 2007
One evening last week I sat down with Kimberly for 5 minutes to come up with a top-of-our heads listAnonymous
April 30, 2007
Quick Links Security - Platforms - Internet - SQL Server - Exchange - Manageability Security News...Anonymous
May 05, 2007
Have you ever wondered you can recover a corrupted database when you have optimum backup & restoreAnonymous
June 02, 2007
First blog post of the year from TechEd! Well, Kimberly and I arrived a day early hoping to chill outAnonymous
August 09, 2007
转载、解释一下来自SQL Server Storage Team的VLDB维护建议清单。 虽然主题是维护,但相信理解这些对于从事数据库开发的人员来说也有一定价值。Anonymous
October 06, 2007
I was developing course materials for a workshop recently and I wanted to run a lab where you DBCC’d