Share via


SQL Server In depth: What can Cause Index to be Still Fragmented After Rebuild

   


Introduction

You would have always come across scenario where even after rebuilding the Index fragmentation would not have decreased or would have not changed much or even would have increased. A quite few of us know that when page count for an Index, which can be seen from sys.dm_db_index_physical_stats, is very less (perhaps less than 1000) one should not consider rebuilding that index because fragmentation on index with small page count would hardly cause any I/O issue  and fact being that since index is small pages are most likely to be in memory so no physical I/O would be required to get page and hence  fragmentation would not come in picture. But what happens internally which causes index to be still fragmented after rebuild ?  


Scope

This article will try to show the reader how and why fragmentation still remains in index even after rebuild. Please also note that its not necessary that index rebuild for indexes which have page_count value >1000 will severely decrease fragmentation. As a fact there are lot of scenarios where index rebuild for large tables did not affected fragmentation much.  


Reason

Microsoft Books Online (The SQL Server 2000 BOL is now removed by MS so the link is not present. Although its for SQL Server 2000 but gives very accurate definition )says that there is hardly any performance gain in rebuilding index with page_count <1000.

*Fragmentation affects disk I/O. Therefore, focus on the larger indexes because their pages are less likely to be cached by SQL Server. Use the page count reported by DBCC SHOWCONTIG to get an idea of the size of the indexes (each page is 8 KB in size). Generally, you should not be concerned with fragmentation levels of indexes with less than 1,000 pages. In the tests, indexes containing more than 10,000 pages realized performance gains, with the biggest gains on indexes with significantly more pages (greater than 50,000 pages).
*
Below is reply from Microsoft  Team on a This Connect Item which was raised to understand why fragmentation did not decrease even after rebuild.

 For small tables, usually performance impact on fragmentation is undetectable. The first 8 page allocation would be from mixed extents and mixed extents could be anywhere in database files. Rebuilding indexes would not change this nature.

*** If you have a small table, those mixed pages weight a lot during fragmentation calculation; therefore, rebuilding index may not reduce fragmentation. (As matter of fact, I could easily construct a case that fragmentation increases after rebuild.) Those fragmentation would not be a pain for your query performance; so basically you can ignore***

Reason as to why fragmentation remains for small index even after rebuild is that after rebuild the pages which are allocated to index is from Mixed extent. A mixed extent contains mixed pages and extent is collection of 8 pages, these are always  the first 8 pages which would be allocated to database when it requires pages to write information. The first 8 pages will always be from mixed extent and after that it would allocate  Uniform extents. Reason for allocating first 8 pages from mixed extent is database engine assumes that its quite possible table would be small(at the beginning) and there would not be much advantage in allocating uniform extent,so it internally decides to allocate first  8 pages from mixed extent. As the 8 pages limit is crossed it would start allocating Uniform extents.  As the mixed extent is not allocated to any particular IAM chain, this means that it may hold pages allocated to possibly 8 separate IAM . This is very important fact. The first 8 pages allocated by mixed extent could be scattered anywhere and this is what accounts for fragmentation even after rebuild.  


What is Fragmentation?

Now understand that logical fragmentation is not introduction of free space or some kind of space between rows present in index. This is other myth which needs attention. logical fragmentation is just mismatch between Logical ordering of Index keys and physical ordering of data at leaf level of index.Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file. All leaf pages of an index contain pointers to the next and the previous pages in the index. This forms a doubly linked list of all index/data pages. Ideally, the physical order of the pages in the data file should match the logical ordering. When mismatch happens fragmentation occurs. The plausible reason as to why index fragmentation can cause performance issue is because when the physical ordering does not match the logical ordering, disk throughput can become less efficient, because the disk head must move back and forth to gather the index pages instead of scanning forward in one direction. On busy system with lot of joins in a query which causes lot of reads index fragmentation can cause grave issues. 

Note

This article uses FN_PHYSLOCCRACKERS and DBCC PAGE command which are undocumented commands. Use of such commands were required  to prove certain facts related to SQL Server. Undocumented means Microsoft does not support these commands, changes can be made to these commands without giving any information to users so unexpected result might occur. So I advise readers not to use these commands on production database. You can play around with these commands on Test databases.

 


Performing Test To Show The Behavior

We would try to reproduce the behavior. Lets create a database and then create two tables in it and then finally delete one table and shrink database to introduce fragmentation

  Query 1  

 --Change data and log file location as per configuration  USE [master]
GO
   
--Create Database Command
CREATE DATABASE  [IndexRebuild]
 ON  PRIMARY
( NAME = N'IndexRebuild_dat',
 FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexRebuild_data.mdf' ,
 SIZE = 51200KB ,
 MAXSIZE = 51200KB ,
  FILEGROWTH = 15 %)
 LOG ON
( NAME = N'IndexRebuild_log',
FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexRebuild_log.ldf' ,
SIZE = 10240KB ,
MAXSIZE = 25600KB ,
FILEGROWTH = 5120KB );
GO

Now we would create two tables insert few records and then delete one of the table followed by shrink operation to introduce fragmentation.

  Query 2 

USE [INDEXREBUILD]
   
GO
--CREATE FILLER TABLE
CREATE TABLE  [DBO].[FILLER]
(
C1  INT  ,
C2 VARCHAR(500)  
);
GO
   
--INSERT DEFAULT RECORDS INTO FILLER TABLE
   
INSERT INTO  [DBO].[FILLER](C1,C2) VALUES (REPLICATE('1', 4), REPLICATE ('X', 500))
  
GO 10000
   
 
CREATE TABLE  [DBO].[IXBUILD]
(
C1  INT  IDENTITY (1,1) ,
C2 VARCHAR(500)  
);
GO
   
--CREATE UNIQUE CLUSTER INDEX
   
CREATE UNIQUE  CLUSTERED INDEX  IX_IXBUILD ON  [DBO].[IXBUILD] (C1)
GO
--INSERT DEFAULT RECORDS
 
INSERT INTO  [DBO].[IXBUILD](C2) VALUES (REPLICATE ('X', 500))
  
GO 1000

  Now delete Filler table and shrink file.

  Query 3 

DROP TABLE  DBO.FILLER;
GO
--SHRINKING DATA FILE TO BRING FRAGMENTATION
DBCC SHRINKFILE (INDEXREBUILD_DAT, 3)
GO
SELECT
[AVG_FRAGMENTATION_IN_PERCENT]
FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (
 DB_ID (N'INDEXREBUILD'), OBJECT_ID (N'IXBUILD'), 1, NULL,  'LIMITED');

You would see some  fragmentation in my case it was about 88 %, fragmentation may vary as per environment . So now we have created a index which is fragmented lets see page count for Index

  Query 4 

SELECT  O. NAME,
    IPS.INDEX_TYPE_DESC,
     IPS.PAGE_COUNT,
     IPS.RECORD_COUNT
FROM
     SYS.DM_DB_INDEX_PHYSICAL_STATS
  ( DB_ID (N'INDEXREBUILD'), OBJECT_ID (N'IXBUILD'), 1, NULL,  'DETAILED') IPS
JOIN SYS.OBJECTS O ON O.OBJECT_ID = IPS.OBJECT_ID
ORDER BY  RECORD_COUNT DESC;

 

  Our Index has page count of 67 which is much less than 1000 value mark.

  Lets analyze page number 

**  Query 5** 

select count(*) as NumOfRows,page_id from [dbo].[IXBUILD]
cross apply
sys.fn_PhysLocress(%%physloc%%)
group by  page_id
order by  page_id    

 

  We can see wide difference in page number specially from 156 to 162 an from 239 to 248, all marked with rectangle. Further going down we can see

 

  We can again see difference in page number in some case, perhaps they are widely separated. This is what is FRAGMENTATION in SQL Server

  You can analyze the same behavior with below query. This would also dump information the page is holding

  Query 6 

SELECT SYS.FN_PHYSLOCFORMATTER (%%PHYSLOC%%) AS PAGE_DETAILS, * FROM [DBO].[IxBuild]

  Below output is just part of the complete output

 

  Lets concentrate on data page 178. Lets analyze it using DBCC PAGE command. 

**  Query 7** 


--QUERY TO GET DETAILS ABOUT A PAGE
DBCC TRACEON (3604)
GO
DBCC PAGE (INDEXREBUILD,1,178,1)

  We would only consider page header as in our case only this part has relevant information. Output including just page header is shown below

 

  As per page header since m_type=1 its a data page. PFS pages says that this page is allocated from mixed extent. There are two important information here

  m_prevPage=  This value shows the page id of page just before it

  m_nextPage= This value shows the page id of page following it

Now as you can see previous page that was allocated was 166 then came our page 178 an then 347. This means pages are not in order and if SQL Server would have to read this page it would have to move first to 166 then may be move back/or to different location on disk read 178 and then to different location for 347 this means a lot of work for SQL Server thus increasing I/O and this is actually why fragmentation is bad. Why this happened ? reason is simple because pages were less so they were allocated from mixed extent and hence were randomly scattered

Now its little difficult to produce a scenario where index rebuild would not reduce fragmentation as  fact if you follow above database and rebuild index which is fragmented it would reduce fragmentation reason being DB engine can find almost contiguous pages event though they would be allocated from mixed extent. So we should use database which has lots of tables and so we would use Adventureworks2012 database. It can be downloaded from Here

Its easy to find out a fragmented primary key with low page count. And in my case it would be PK_ProductListPriceHistory_ProductID_StartDate which is created on table ProductListPriceHistory. Its easy to analyze fragmentation using below query

 

  We can see page count as 3 and fragmentation as 66 %. Now lest see page distribution

 

As per index stats result index has three pages and these are 18348,50 and 76. Again not in complete order

Lets rebuild index using below query

  **  Query 8** 

ALTER INDEX  PK_ProductListPriceHistory_ProductID_StartDate
 ON production.ProductListPriceHistory
REBUILD

  Now if you check fragmentation it would remain same as 66%. Lets see how page distribution changed

 

  We can see index is rebuilt and pages allocated now are completely different. Now pages are 18377,79 and 81 so index was rebuilt but again pages were given from mixed extent and were not contiguous hence fragmentation remains. A look at page 18379 using DBCC page shows below

 

We can see page is allocated from mixed extent with previous page as 77 and next page as 81

So question again comes why pages are scattered even if rebuild command was given. Reason is when index was rebuilt it is dropped and recreated and when pages were allocated to index during recreation it done from mixed extent (reason being index was very small) and now as already mentioned pages in mixed extent are randomly scattered and are not in any order as a fact a mixed page can possibly be allocated to 8 separate IAM pages. This is the reason why we had page 126 and then 142. This is also the reason why fragmentation still remains there.  

 User can reproduce same issue in UAT if he is facing the same scenario defined in this article. One would note that pages allocated would be from mixed extent and would not be in order.

 


Conclusion

When Index is rebuilt and index is having fewer amount of pages, pages to index are allocated from mixed extent which could be lying anywhere and this seems to user that fragmentation is still there. But is actually how data pages are allocated to new index with small page count, by virtue of fact that mixed pages can be lying anywhere and when these pages are allocated to index it seems to end user that fragmentation is still there but this is actually default behavior of Database Engine. pages are kind of Scattered. Since they are scattered hence they create what seems like fragmentation but actually its how Database engine allocates pages. It would also be correct to say that such indexes are not fragmented considering the real world meaning of fragmentation, which means something which causes performance issue.

 


Note

User might not be able to actually reproduce same as author,page number allocated would ofcourse be different. Aim of article is to show user what actually fragmentation means and why fragmentation still remains after index rebuild.

 


See Also