CHECKDB (Part 2): What does CHECKDB really do? (1 of 4)
Hmmm - I sat for 5 minutes thinking of something amusing to say to start this one off and nothing came to mind, so I'm afraid this will be a humor-free post. Maybe I'm jet-lagged from being on the East coast all last week.
As with all things related to DBCC, this topic has its share of misinformation. In this post I'll set the record straight by running through all the stages of CHECKDB in SQL Server 2000 and 2005. I'll need to split this up into seperate posts otherwise I'll be writing a book. I also introduce a whole raft of new terms which will also be subjects for future posts (my list is already getting pretty long!)
So the very first thing it does is work out how to get the transactionally consistent view it requires (see CHECKDB Part 1) and then, if needed, either record the relevant LSN and switch to full-logging (for SQL Server 2000) or create a database snapshot (for SQL Server 2005).
Then it runs through the checks in the order shown below:
1. Primitive checks of critical system tables
This stage is in SQL Server 2000 and 2005. First of all, what are critical system tables? These are the system tables that hold Storage Engine metadata. Without these we'd have no idea where any data was stored in the database files or how to interpret records.
In SQL Server 2000, the critical system tables are:
- sysindexes
- sysobjects
- syscolumns
These tables have to be checked first because we use the metadata they contain to access all the other tables and indexes in the database. These tables are freely queryable so poke about and see what's stored in there.
In SQL Server 2005, the metadata layer has been rewritten and the critical system tables are:
- sysallocunits
- syshobts
- syshobtcolumns
- sysrowsets
- sysrowsetcolumns
More on allocation units, hobts, and rowsets later in the week - for now you can assume they serve the same function as the three critical system tables in SQL Server 2000. You can't see these system tables because they're 'hidden' - the parser won't allow them to be bound to in a query. Try running 'select * from sysallocunits' to see what I mean.
The primitive checks are designed to check that internal queries on the metadata tables won't throw errors. Each of the critical system tables has a clustered index. The primitive checks just check the leaf-level data pages of the clustered indexes. For every one of these pages, the following is done:
- Read and latch the page (a latch is a lightweight internal version of a lock). This makes sure that there aren't any IO problems with the page such as a torn-page or bad page checksum and ensures that we can put the page in the buffer pool correctly. This is the most common cause of failure of the primitive system table checks and results in error 8966, which in SQL Server 2000 could look something like:
Server: Msg 8966, Level 16, State 1, Line 1
Could not read and latch page (1:33245) with latch type SH. sysobjects failed.
- Audit the page. This is a series of checks of the page structures which I'll cover in a seperate post. If these pass, the page looks like a SQL Server page of type its supposed to be.
- Check the basic page linkage. Pages in each level of a clustered index are linked together in a doubly-linked list to allow range scans to work. At this stage we only check the left-to-right linkage to ensure the linked-to page actually exists.
- Check the page linkage for loops. This is simple to do - have two pointers into the page linked-list with one advancing at every step and one advancing at every second step. If they ever point to the same thing before the faster-advancing pointer reaches the right-hand side of the leaf level then there's a loop. Its important that there are no linkage loops otherwise a range scan may turn into an infinite loop. I've never seen this occur in the field.
Any error found at this stage cannot be repaired so you must restore from a backup. This is because the repair would have to deallocate the page, effectively deleting metadata for a whole lot of tables and indexes. As people's databases get larger and more complex (thousands of tables and indexes), the percentage of pages that comprise these critical system tables rises and so the chance of a hardware problem corrupting one of these pages also rises - I see several of these a month on the forums. Without a backup, the only alternative is to try to export as much data as you can - not good.
If all the pages are ok then we know we've got solid enough metadata on which to base the next set of checks.
2. Allocation checks
(Part 2...)
3. Logical checks of critical system tables
4. Logical checks of all tables
(Part 3...)
5. Service Broker checks
6. Metadata cross-checks
7. Indexed view and XML index checks
(Part 4...)
Comments
Anonymous
July 05, 2006
Why does CHECKDB not check the statistics?
Not long ago we had a table in a database with an index on it. If you read the table by opening a cursor (sp_cursoropen) you got an sqldump. CHECKDB said everything is OK. Microsoft Support explaint that there is an corrupted statistic on the accesed index. A DBCC DBReindex or Update Statistics will solve the problem. Wouldn't it be nice if CHECKDB or any other check syntax, like DBCC CheckStatistcs, would recognize the error?Anonymous
July 06, 2006
The comment has been removedAnonymous
October 30, 2006
PingBack from http://www.julian-kuiters.id.au/article.php/sqlserver2005-inside-dbccAnonymous
February 07, 2007
PingBack from https://blogs.msdn.com:443/sqlserverstorageengine/archive/2007/02/04/checkdb-part-8-did-repair-fix-everything.aspxAnonymous
August 13, 2007
Long ago, not long ago.... No doubt that many of you might have gone through the error above within yourAnonymous
February 18, 2009
PingBack from http://www.glorf.it/blog/2006/07/03/sql-talk/sql-server-storage-engineAnonymous
May 14, 2009
In this post, we would have a look at the differences between DBCC CHECKTABLE and DBCC CHECKDB. We wouldAnonymous
June 09, 2009
PingBack from http://toenailfungusite.info/story.php?id=4868Anonymous
June 13, 2009
PingBack from http://thestoragebench.info/story.php?id=10154Anonymous
June 17, 2009
PingBack from http://patiosetsite.info/story.php?id=990Anonymous
June 18, 2009
PingBack from http://adirondackchairshub.info/story.php?id=3783Anonymous
January 25, 2014
Hi Paul, Quick one, does checkdb use space? and if it does, does it get released once it is done? Thanks