SQL Server: What does fragmentation means for a Heap Table
Scope
The article is outcome of answer given on This StackExchange Thread. The OP was asking about heap fragmentation and has belief that heap fragmentation means the space present on pages allocated to the heap, this is not correct. The article will show what does fragmentation in Heap means
What does Fragmentation Means in a Heap
A heap is table with NO clustered index on it. Technically their is no meaning of fragmentation in heap as they are bunch or un-orderly pages. But when we run DMV sys.dm_db_index_physical_stats it also gives fragmentation for Heaps. The fragmentation value in Heap which you get from column avg_fragmentation_in_percent by querying sys.dm_db_index_physical_stats DMV for Heaps states that
*
Logical fragmentation for indexes, or extent fragmentation for heaps in the IN_ROW_DATA allocation unit.*
Further the same Books Online says that
*This is the percentage of out-of-order extents in the leaf pages of a heap. * An out-of-order extent is one for which the extent that contains the current page for a heap is not physically the next extent after the extent that contains the previous page.
So you can see it is not the free space present in pages allocated to Heap but the varying sequence of pages that creates the fragmentation.
This can be demonstrated by small test. Let us create a Heap Table and insert some records in it and then check the fragmentation.
NOTE: To demonstrate certain database internal features Undocumented commands are used in this article. Reader should not use this command on Production database.
create table dbo.HeapTest
(
Id INT not NULL Default (1),
Col1 char(5000) Not null Default ('Heaps Are Cool')
)
SET NOCOUNT ON
Insert into dbo.Heaptest default values
go 50
select index_type_desc,
avg_fragmentation_in_percent,
fragment_count,
avg_page_space_used_in_percent,
record_count
from sys.dm_db_index_physical_stats(db_id(),object_id('dbo.HeapTest','U'),0,default,'detailed')
So Heap table is created with 50 records in it. Below is what fragmentation looks like after query DMV sys.dm_db_index_physical stats
http://i.stack.imgur.com/cuk1V.png
You can see avg_fragmentation_in_percent column value is 33 %. Now let us see how are pages arranged. This can be done by using undocumented query %%lockres%%. The query would be
SELECT %%lockres%%, * FROM dbo.HeapTest;
And below is what output looks like. Attaching only relevant portion of it. The query produced 50 rows since we inserted 50 rows in our dbo.HeapTest table.
http://i.stack.imgur.com/lOqfA.png
What it says is the first page has ID 197 the next page has ID 242 subsequent pages has continuous ID till we reach page ID 264 because after that we get page ID 280. So this jump in page ID numbers is what actually causing fragmentation.
Now lest rebuild the heap and run The command again to see the fragmentation and how pages are arranged. We get fragmentation like
http://i.stack.imgur.com/UIoE3.png
You can see fragmentation is now 14%.
Let us see page numbers allocated
http://i.stack.imgur.com/mUm85.png
We only have one jump rest all pages are allocated page ID serially. Since just one jump fragmentation decreased considerably.
I rebuild the Heap again and now when I checked fragmentation it was completely gone. And page ID allocation is as per below screenshot
http://i.stack.imgur.com/1GiFL.png
Fragmentation May Increase After Heap Rebuild
In SE thread posted above OP asked what could have caused fragmentation to increase after rebuild, now regarding what could have caused fragmentation to rise we can corroborate it to fact that when pages were getting allocated to the heap they would not be continuous, as you saw above what caused fragmentation value to increase was jump in the PAGE ID's allocated to pages.
At the back of head you should also keep in mind that the word fragmentation for HEAP does not have any meaning, how would you define fragmentation for bunch of un-ordered pages.
Worried about Fragmentation In Heap
If you really face a scenario where heap table is fragmented and slowing queries it would be better creating a clustered index on table than rebuilding it. The reason is when you rebuild heap all underlying Non Clustered indexes are also rebuilt causing the rebuild process to take much longer time, utilizing lot of resources and bloating transaction log. On a production system one would always try to avoid this. Paul covered this in his Myth Section about heap.
If you are using SQL Server 2008 and above you can use Alter Table rebuild command to rebuild the heap. But note this will also rebuild all Non Clustered indexes(NCI) present on the Heap. So for a big Heap table with 3-4 NCI this could impact performance and bloat transaction log.
Suggested Readings
Heap Structures
How Does SQL Server Allocates Space In a Heap