Ghost "Rows" Buster in SQL Server
Introduction
This article simulates the behavior table with the occurrence of Ghost Records, as the problem appears and how you can stop this failure.
The process GhostRecordCleanUp (to cleaning Ghost record) runs regularly and purges the remaining records of the database files (MDF and NDF) that were marked for deletion in the Log file (LDF), but due to one or more failures were not deleted physically.
This process occurs transparently in SQL Server, but this article will demonstrate all the steps of this implementation.
It's important to clarify this situation where at least one record has been marked for deletion, but is still available in the database for all query operations and it may generate incorrect numbers in many critical applications and errors on data consistency.
This creates a serious problem in applications where accuracy of information is crucial for big business decisions. The isolation level can also allow the Ghosts Records appear to other users, but this is part of another article.
Let's "hunt" Ghosts Records!
Note
The article uses fn_dblog ,DBCC PAGE and DBCC FORCEGHOSTCLEANUP() command which are undocumented commands . Undocumented means Microsoft does not support these commands, changes can be made to these commands without giving any information to user so unexpected result might occur if run different times. So I advise readers not to use these commands on production database. You can play around with these commands on Test databases but on your own risk.
How "Ghost" appears ** **
The Ghost record can appear in three situations:
- When running DELETE commands, or
- When running INSERT commands, or
- When running DELETE and INSERT commands in different queries, but related with the same indexed data
The count of Ghost records displays the number of rows into an index or *heap *(table without Clustered Index) where one or more records can be marked for deletion in the database, but has not been deleted by the Database Engine. Actually, records aren't deleted from the Database immediately, they are just marked as deleted in your respective Log file, this is called logical deletion and improves the performance of executing commands with high disk I/O.
Over this time, the row can be viewed and updated on various queries and some Ghost Records can be manipulated with Transact-SQL using UPDATE or DELETE commands, without any user knowing.
In INSERT commands, it's possible to have more than one "version" row for the same index. This can occur if a new row has your indexed fields with identical information to another row that was marked as deleted (except when it runs the TRUNCATE TABLE command), but not yet completed this process to effectively delete row from files physically in the Database.
Building the Environment for Testing ** **
So we can simulate the problem using a table with two types of indexes: a Clustered (PRIMARY KEY) and other Non-Clustered for the same field, which in this sample will be the "ID_EXAMPLE" field.
--CREATING A TABLE FOR DEMO
CREATE TABLE dbo.TB_EXAMPLE(
ID_EXAMPLE int NOT NULL IDENTITY(1,1),
NM_EXAMPLE varchar(25) NOT NULL,
DT_CREATE datetime NULL DEFAULT(GETDATE())
);
GO
--CREATING 2 INDEXES USING THE SAME FIELD, TO IDENTIFY THE BEHAVIOUR IN EACH SAMPLE
CREATE CLUSTERED INDEX PK_EXAMPLE ON dbo.TB_EXAMPLE (ID_EXAMPLE ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
CREATE NONCLUSTERED INDEX IX_EXAMPLE ON dbo.TB_EXAMPLE (ID_EXAMPLE ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
See this output SQL script in the image below.
Let's insert some rows so everyone can understand and simulate this process in your SQL Server.
--INSERTING 1000 DIFFERENT ROWS IN TABLE
INSERT INTO TB_EXAMPLE (NM_EXAMPLE) VALUES ('GHOST ITEM ' + CONVERT(VARCHAR,ISNULL(@@IDENTITY, 0)))
GO 1000
--VIEWING THE 1000 ROWS
SELECT * FROM TB_EXAMPLE
GO
**See this output SQL script in the image below.
**
So we can see how the process occurs, we have to change behavior settings of SQL SERVER, in TRACE 3604 (redirects the output to the Log into SSMS) and TRACE 661 (controls the execution of the removal process of Ghost Record).
WARNING: Don't make this change in the "Production Servers". These procedures are undocumented in the BOL.
--4. Enable TRACE "3604" to view the DBCC commands in SSMS (SQL Server Management Studio)
DBCC TRACEON (3604, -1)
GO
--5. Enable TRACE "661" ("GHOST RECORD CLEANUP" Task)
--to view the records marked for deletion on each page ("rows per index")
DBCC TRACEON (661, -1)
GO
See this output SQL script in the image below.
Identifying the Ghost Record ** **
We will run the deletion of some records for this demonstration without any specific criteria. I will delete out first 700 records.
Soon to follow, we will get what the last transaction performed and specify the read of data that have been marked for alteration or, in this sample for exclusion.
This prevents dirty reads of other queries. When this happens the data can be viewed and changed to other transactions, which can modify the content we want to analyze.
So we can begin to identify the Ghosts Records, we using db_fnlog function that performs the sequential reading Log.
We can see in the image below, after execution of the DELETE command, all the deleted rows are marked in "Operation" field with the value "LOP_DELETE_ROWS" and, remain available for any other Transact-SQL command (SELECT, INSERT, UPDATE or DELETE) with the marking of "Context" field with "LCX_MARK_AS_GHOST" value.
The records will always be marked as Ghosts to be permanently excluded in the Database.
--6. Deletes 700 records in a table and the Ghosts Records appear
DELETE TOP(700) FROM TB_EXAMPLE;
--The records are marked for deletion, but were not deleted
DECLARE @TRAN_ID CHAR(20)
SELECT @TRAN_ID = [Transaction ID] FROM fn_dblog(null, null)
SELECT * FROM fn_dblog (null, null) WHERE [Transaction ID] = @TRAN_ID;
GO
See this output SQL script in the image below.
The query below will facilitate our follow this delete operation, displaying the unchanged records number and records number that are marked to be deleted in the "TB_EXAMPLE" table, but we still waiting for their physical deletion in the Database.
We can track data change in a table using a VIEW system: sys.dm_db_index_physical_stats.
The fields that will help us in identifying these data are:
- The RECORD_COUNT field displays the count of records that were not affected in the table
- The GHOST_RECORD_COUNT field is the count of Ghost Records awaiting their deletion by cleanup task(GHOST CLEANUP RECORD) in each index in this table,and;
- The INDEX_TYPE_DESC field indicates where the records are stored, the index affected.
--7. Displays the current state of the rows into "TB_EXAMPLE" table of "WI_Infra" Database
SELECT RECORD_COUNT, GHOST_RECORD_COUNT, INDEX_TYPE_DESC
FROM sys.dm_db_index_physical_stats (DB_ID(N'WI_Infra'), object_id('TB_EXAMPLE'), NULL, NULL , 'DETAILED')
WHERE INDEX_LEVEL = 0
GO
See this output SQL script in the image below and the indexes: PK_EXAMPLE (clustered index) and IX_EXAMPLE(nonclustered index).
We have the same amount of records simply because we have adopted a common data deletion for both indexes, but will the Ghosts Records will maintain this data integrity? We'll see later in this article.
Below we will use a DBCC command (DataBase Consistency Checker) called PAGE, which is documented in a KB. So, we can obtain the data that is stored in each structure of SQL Server page.
Shortly after the execution, we can see content on the allocation of records per page. Let us now focus our statement in the range of the field called "Field" in the logical file #1, between 1321 and 1327 pages. In this range we have an indication that there Ghosts Records, marked with the text "Has Ghost" in the description of "VALUE" field.
--8. The "Ghost Cleanup" task scans all pages on "WI_Infra" Database, searching for Ghosts Records
--Effecting the dump of paging, can see that the Ghost Records are marked with "Has Ghost" text
DBCC PAGE('WI_Infra',1,1,3) WITH TABLERESULTS
GO
See this output SQL script in the image below.
Detailing the page 1327, in logical file # 1, we can see that we really have an allocation Ghost Record, where the "m_ghostRecCnt" parameter is true.
--9. Dumping the details of a particular record marked as Ghost into "WI_Infra" Database, we can see
--your reference in the page and "m_ghostRecCnt" flag indicating this field as a Ghost
DBCC PAGE('WI_Infra',1,1327,3) WITH TABLERESULTS
GO
See this output SQL script in the image below.
Ending the Ghost Record
To wipe out the Ghosts Records, we run the same command executed automatically by SQL Server and that we disabling via TRACE 661.
Let's run the "Ghost Cleanup Rows" task to execute all pending issues and to purge Ghosts Records failures. We highlight below in the logical file #1, the range pages 1321 until 1327, where SQL Server identifies the previous limits to page 1320 and next limit on page 1328.
--10. Runs the "Ghost Cleanup" task in the records marked as ghost
DBCC ForceGhostCleanup;
GO
See this output SQL script in the image below.
After manually perform routine of SQL Server task, we can see the result below:
-- 11. Checking if still exist the Ghosts Records into "TB_EXAMPLE" table of "WI_Infra" Database
SELECT RECORD_COUNT, GHOST_RECORD_COUNT, INDEX_TYPE_DESC
FROM sys.dm_db_index_physical_stats (DB_ID(N'WI_Infra'), object_id('TB_EXAMPLE'), NULL, NULL , 'DETAILED')
WHERE INDEX_LEVEL = 0
GO
See this output SQL script in the image below.
It may seem unexpected, but this is a common situation, one left Ghost Record that was not purged. In this case, the clustered index will become fragmented because the reference record is lost.
The solution is simple, just run a REBUILD index and this Ghost Record is deallocated of the Database.
--12. One left Ghost Record, running a forced correction with REBUILD index
ALTER INDEX PK_EXAMPLE ON dbo.TB_EXAMPLE REBUILD;
GO
-- 13. Checking if the Ghosts Records still exist after the REBUILD index into "TB_EXAMPLE" table of "WI_Infra" Database
SELECT RECORD_COUNT, GHOST_RECORD_COUNT, INDEX_TYPE_DESC
FROM sys.dm_db_index_physical_stats (DB_ID(N'WI_Infra'), object_id('TB_EXAMPLE'), NULL, NULL , 'DETAILED')
WHERE INDEX_LEVEL = 0
GO
See this output SQL script in the image below.
Alternatively, It's possible to solve these leftover Ghost Records (especially in cases of including more than one record with the same key - Event ID 2512) is recommended to run this DBCC CHECKDB command with a **REPAIR **clauses.
Another option is to run the system procedure sp_clean_db_free_space.
Conclusion
This article proved that exist "Ghosts Records" (and always will exist), because they are part of the internal process of data manipulation.
We can minimize and correct the Ghosts Records that SQL Server cannot eliminate, if we periodically performing the processes of maintaining indexes, as indicated in each step in this article.
This will keep the data paging in our Database organized and therefore, with a good performance in data manipulation.
See Also
- The Ghost in the Transactional Replication Topology
- Fragmentation Percentage of all tables in a SQL database
References
- Understanding Isolation Levels
- Locking and Row Versioning
- Concurrency Effects
- sys.dm_db_index_physical_stats (Transact-SQL)
- sp_clean_db_free_space (Transact-SQL)
- Blog MSDN: An in-depth look at Ghost Records in SQL Server
Tribute
This article was created in memory of an actor Harold Ramis (Dr. Egon) of the Movie trilogy Ghostbusters, that animated part of my childhood.
Other Languages
This article participated in the TechNet Guru February/2014 competition and won the Gold Medal.