SQL Server In depth: What can Cause Index to be Still Fragmented After Rebuild
Introduction
You would have always come across scenario where even after rebuilding the Index fragmentation would not have decreased or would have not changed much or even would have increased. A quite few of us know that when page count for an Index, which can be seen from sys.dm_db_index_physical_stats, is very less (perhaps less than 1000) one should not consider rebuilding that index because fragmentation on index with small page count would hardly cause any I/O issue and fact being that since index is small pages are most likely to be in memory so no physical I/O would be required to get page and hence fragmentation would not come in picture. But what happens internally which causes index to be still fragmented after rebuild ?
Scope
This article will try to show the reader how and why fragmentation still remains in index even after rebuild. Please also note that its not necessary that index rebuild for indexes which have page_count value >1000 will severely decrease fragmentation. As a fact there are lot of scenarios where index rebuild for large tables did not affected fragmentation much.
Reason
Microsoft Books Online (The SQL Server 2000 BOL is now removed by MS so the link is not present. Although its for SQL Server 2000 but gives very accurate definition )says that there is hardly any performance gain in rebuilding index with page_count <1000.
*Fragmentation affects disk I/O. Therefore, focus on the larger indexes because their pages are less likely to be cached by SQL Server. Use the page count reported by DBCC SHOWCONTIG to get an idea of the size of the indexes (each page is 8 KB in size). Generally, you should not be concerned with fragmentation levels of indexes with less than 1,000 pages. In the tests, indexes containing more than 10,000 pages realized performance gains, with the biggest gains on indexes with significantly more pages (greater than 50,000 pages).
*
Below is reply from Microsoft Team on a This Connect Item which was raised to understand why fragmentation did not decrease even after rebuild.
For small tables, usually performance impact on fragmentation is undetectable. The first 8 page allocation would be from mixed extents and mixed extents could be anywhere in database files. Rebuilding indexes would not change this nature.
*** If you have a small table, those mixed pages weight a lot during fragmentation calculation; therefore, rebuilding index may not reduce fragmentation. (As matter of fact, I could easily construct a case that fragmentation increases after rebuild.) Those fragmentation would not be a pain for your query performance; so basically you can ignore***
Reason as to why fragmentation remains for small index even after rebuild is that after rebuild the pages which are allocated to index is from Mixed extent. A mixed extent contains mixed pages and extent is collection of 8 pages, these are always the first 8 pages which would be allocated to database when it requires pages to write information. The first 8 pages will always be from mixed extent and after that it would allocate Uniform extents. Reason for allocating first 8 pages from mixed extent is database engine assumes that its quite possible table would be small(at the beginning) and there would not be much advantage in allocating uniform extent,so it internally decides to allocate first 8 pages from mixed extent. As the 8 pages limit is crossed it would start allocating Uniform extents. As the mixed extent is not allocated to any particular IAM chain, this means that it may hold pages allocated to possibly 8 separate IAM . This is very important fact. The first 8 pages allocated by mixed extent could be scattered anywhere and this is what accounts for fragmentation even after rebuild.
What is Fragmentation?
Now understand that logical fragmentation is not introduction of free space or some kind of space between rows present in index. This is other myth which needs attention. logical fragmentation is just mismatch between Logical ordering of Index keys and physical ordering of data at leaf level of index.Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file. All leaf pages of an index contain pointers to the next and the previous pages in the index. This forms a doubly linked list of all index/data pages. Ideally, the physical order of the pages in the data file should match the logical ordering. When mismatch happens fragmentation occurs. The plausible reason as to why index fragmentation can cause performance issue is because when the physical ordering does not match the logical ordering, disk throughput can become less efficient, because the disk head must move back and forth to gather the index pages instead of scanning forward in one direction. On busy system with lot of joins in a query which causes lot of reads index fragmentation can cause grave issues.
Note
This article uses FN_PHYSLOCCRACKERS and DBCC PAGE command which are undocumented commands. Use of such commands were required to prove certain facts related to SQL Server. Undocumented means Microsoft does not support these commands, changes can be made to these commands without giving any information to users so unexpected result might occur. So I advise readers not to use these commands on production database. You can play around with these commands on Test databases.
Performing Test To Show The Behavior
We would try to reproduce the behavior. Lets create a database and then create two tables in it and then finally delete one table and shrink database to introduce fragmentation
Query 1
--Change data and log file location as per configuration USE [master]
GO
--Create Database Command
CREATE DATABASE [IndexRebuild]
ON PRIMARY
( NAME = N'IndexRebuild_dat',
FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexRebuild_data.mdf' ,
SIZE = 51200KB ,
MAXSIZE = 51200KB ,
FILEGROWTH = 15 %)
LOG ON
( NAME = N'IndexRebuild_log',
FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexRebuild_log.ldf' ,
SIZE = 10240KB ,
MAXSIZE = 25600KB ,
FILEGROWTH = 5120KB );
GO
Now we would create two tables insert few records and then delete one of the table followed by shrink operation to introduce fragmentation.
Query 2
USE [INDEXREBUILD]
GO
--CREATE FILLER TABLE
CREATE TABLE [DBO].[FILLER]
(
C1 INT ,
C2 VARCHAR(500)
);
GO
--INSERT DEFAULT RECORDS INTO FILLER TABLE
INSERT INTO [DBO].[FILLER](C1,C2) VALUES (REPLICATE('1', 4), REPLICATE ('X', 500))
GO 10000
CREATE TABLE [DBO].[IXBUILD]
(
C1 INT IDENTITY (1,1) ,
C2 VARCHAR(500)
);
GO
--CREATE UNIQUE CLUSTER INDEX
CREATE UNIQUE CLUSTERED INDEX IX_IXBUILD ON [DBO].[IXBUILD] (C1)
GO
--INSERT DEFAULT RECORDS
INSERT INTO [DBO].[IXBUILD](C2) VALUES (REPLICATE ('X', 500))
GO 1000
Now delete Filler table and shrink file.
Query 3
DROP TABLE DBO.FILLER;
GO
--SHRINKING DATA FILE TO BRING FRAGMENTATION
DBCC SHRINKFILE (INDEXREBUILD_DAT, 3)
GO
SELECT
[AVG_FRAGMENTATION_IN_PERCENT]
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (
DB_ID (N'INDEXREBUILD'), OBJECT_ID (N'IXBUILD'), 1, NULL, 'LIMITED');
You would see some fragmentation in my case it was about 88 %, fragmentation may vary as per environment . So now we have created a index which is fragmented lets see page count for Index
Query 4
SELECT O. NAME,
IPS.INDEX_TYPE_DESC,
IPS.PAGE_COUNT,
IPS.RECORD_COUNT
FROM
SYS.DM_DB_INDEX_PHYSICAL_STATS
( DB_ID (N'INDEXREBUILD'), OBJECT_ID (N'IXBUILD'), 1, NULL, 'DETAILED') IPS
JOIN SYS.OBJECTS O ON O.OBJECT_ID = IPS.OBJECT_ID
ORDER BY RECORD_COUNT DESC;
Our Index has page count of 67 which is much less than 1000 value mark.
Lets analyze page number
** Query 5**
select count(*) as NumOfRows,page_id from [dbo].[IXBUILD]
cross apply
sys.fn_PhysLocress(%%physloc%%)
group by page_id
order by page_id
We can see wide difference in page number specially from 156 to 162 an from 239 to 248, all marked with rectangle. Further going down we can see
We can again see difference in page number in some case, perhaps they are widely separated. This is what is FRAGMENTATION in SQL Server
You can analyze the same behavior with below query. This would also dump information the page is holding
Query 6
SELECT SYS.FN_PHYSLOCFORMATTER (%%PHYSLOC%%) AS PAGE_DETAILS, * FROM [DBO].[IxBuild]
Below output is just part of the complete output
Lets concentrate on data page 178. Lets analyze it using DBCC PAGE command.
** Query 7**
--QUERY TO GET DETAILS ABOUT A PAGE
DBCC TRACEON (3604)
GO
DBCC PAGE (INDEXREBUILD,1,178,1)
We would only consider page header as in our case only this part has relevant information. Output including just page header is shown below
As per page header since m_type=1 its a data page. PFS pages says that this page is allocated from mixed extent. There are two important information here
m_prevPage= This value shows the page id of page just before it
m_nextPage= This value shows the page id of page following it
Now as you can see previous page that was allocated was 166 then came our page 178 an then 347. This means pages are not in order and if SQL Server would have to read this page it would have to move first to 166 then may be move back/or to different location on disk read 178 and then to different location for 347 this means a lot of work for SQL Server thus increasing I/O and this is actually why fragmentation is bad. Why this happened ? reason is simple because pages were less so they were allocated from mixed extent and hence were randomly scattered
Now its little difficult to produce a scenario where index rebuild would not reduce fragmentation as fact if you follow above database and rebuild index which is fragmented it would reduce fragmentation reason being DB engine can find almost contiguous pages event though they would be allocated from mixed extent. So we should use database which has lots of tables and so we would use Adventureworks2012 database. It can be downloaded from Here
Its easy to find out a fragmented primary key with low page count. And in my case it would be PK_ProductListPriceHistory_ProductID_StartDate which is created on table ProductListPriceHistory. Its easy to analyze fragmentation using below query
We can see page count as 3 and fragmentation as 66 %. Now lest see page distribution
As per index stats result index has three pages and these are 18348,50 and 76. Again not in complete order
Lets rebuild index using below query
** Query 8**
ALTER INDEX PK_ProductListPriceHistory_ProductID_StartDate
ON production.ProductListPriceHistory
REBUILD
Now if you check fragmentation it would remain same as 66%. Lets see how page distribution changed
We can see index is rebuilt and pages allocated now are completely different. Now pages are 18377,79 and 81 so index was rebuilt but again pages were given from mixed extent and were not contiguous hence fragmentation remains. A look at page 18379 using DBCC page shows below
We can see page is allocated from mixed extent with previous page as 77 and next page as 81
So question again comes why pages are scattered even if rebuild command was given. Reason is when index was rebuilt it is dropped and recreated and when pages were allocated to index during recreation it done from mixed extent (reason being index was very small) and now as already mentioned pages in mixed extent are randomly scattered and are not in any order as a fact a mixed page can possibly be allocated to 8 separate IAM pages. This is the reason why we had page 126 and then 142. This is also the reason why fragmentation still remains there.
User can reproduce same issue in UAT if he is facing the same scenario defined in this article. One would note that pages allocated would be from mixed extent and would not be in order.
Conclusion
When Index is rebuilt and index is having fewer amount of pages, pages to index are allocated from mixed extent which could be lying anywhere and this seems to user that fragmentation is still there. But is actually how data pages are allocated to new index with small page count, by virtue of fact that mixed pages can be lying anywhere and when these pages are allocated to index it seems to end user that fragmentation is still there but this is actually default behavior of Database Engine. pages are kind of Scattered. Since they are scattered hence they create what seems like fragmentation but actually its how Database engine allocates pages. It would also be correct to say that such indexes are not fragmented considering the real world meaning of fragmentation, which means something which causes performance issue.
Note
User might not be able to actually reproduce same as author,page number allocated would ofcourse be different. Aim of article is to show user what actually fragmentation means and why fragmentation still remains after index rebuild.
See Also