SQL Server: What does Column Compressed Page Count Value Signify in DMV Sys.dm_db_index_physical_stats ?
Scope
This Article talks about what does the column compressed_page_count value signifies in Sys. dm_db_index_physical_stats when this DMV is run for table having Clustered Index with Page Compression enabled
Background
Compression was introduced from SQL Server 2008 and is Enterprise Only feature, but can be used in Developer and Evaluation editions. Compression can be applied both at row and page level. We can use compression for Heap table , Clustered index, Non Clustered index, Index Views as well as for partitioned tables and indexes. You can read more about compression from below Microsoft SQL Server online resource
Creating Compressed Tables and Indexes
According to Books Online resource for Sys.dm_db_index_physical_stats the definition of Column compressed_page_count is
The number of compressed pages.
For heaps, newly allocated pages are not PAGE compressed. A heap is PAGE compressed under two special conditions: when data is bulk imported or when a heap is rebuilt. Typical DML operations that cause page allocations will not be PAGE compressed. Rebuild a heap when the compressed_page_count value grows larger than the threshold you want.
For tables that have a clustered index, the compressed_page_count value indicates the effectiveness of PAGE compression.
Does it not sounds confusing for clustered index?. We would try to demystify above line to best possible extent and understand what it actually means.
Note
Before we start I would like to inform the reader that this article uses some of Undocumented Commands like DBCC PAGE, Sys.dm_db_database_page_allocation and Sys.fn_physlockformatter these commands are not supported by Microsoft and were used to just show internal behavior of Database Engine its not advisable to use these commands on production database.
Performing The Test
To perform the test and to understand what does the column value compressed_page_count signify we will use SQL Server 2012 Evaluation edition. One can use Enterprise as well as Developer editions as both these editions support row and page compression. First we would create a table in any user database( I have used database named Indexlogging. This database name would be used in query further so just to not create confusion, database name used was mentioned) and then insert few records in it. Then we will create clustered index with page compression.
Query 1
USE [IndexLogging]--Please replace it with any database name where you want to create this table
GO
DROP TABLE [dbo].[INDEXCOMPRESSION]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[INDEXCOMPRESSION](
[C1] [int] IDENTITY(1,1) NOT NULL,
[C2] [char](50) NULL
) ON [PRIMARY]
GO
--Insert default records into filler table
INSERT INTO [DBO].[INDEXCOMPRESSION] DEFAULT VALUES;
GO 1000
--Script to create clustered index
CREATE UNIQUE CLUSTERED INDEX CI_INDEXCOMPRESSION ON [DBO].[INDEXCOMPRESSION](C1)
WITH (DATA_COMPRESSION = PAGE)
GO
Now we will use DMV Sys. dm_db_index_physical_stats to analyze the compressed index.
Query 2
SELECT O. NAME,
IPS.INDEX_TYPE_DESC,
IPS.RECORD_COUNT,
IPS.PAGE_COUNT,
IPS.COMPRESSED_PAGE_COUNT
FROM
SYS.DM_DB_INDEX_PHYSICAL_STATS
( DB_ID (N'INDEXLOGGING'), OBJECT_ID (N'INDEXCOMPRESSION'), 1, NULL, 'DETAILED') IPS
JOIN SYS.OBJECTS O ON O.OBJECT_ID = IPS.OBJECT_ID
ORDER BY RECORD_COUNT DESC;
Below is output from the query
Figure 1
We can see from the output that compressed_page_count has value 1. You can also see leaf level of index has two pages(Page_count=2). And 1000 records are distributed among these two pages . But curious thing to note here is that compressed_page_count shows value 1. Does this means only one page is compressed and other is not ?
Well lets see page details associated with this Index. This can be done in 2012 by using Undocumented DMV Sys. dm_db_database_page_allocations
Query 3
--QUERY TO GET PAGE INFORMATION
USE INDEXLOGGING
GO
SELECT ALLOCATED_PAGE_FILE_ID,
ALLOCATED_PAGE_PAGE_ID,
PAGE_TYPE_DESC
FROM SYS.DM_DB_DATABASE_PAGE_ALLOCATIONS
(DB_ID('INDEXLOGGING'), OBJECT_ID('INDEXCOMPRESSION'), NULL, NULL, 'DETAILED')
ORDER BY
ALLOCATED_PAGE_PAGE_ID,
PAGE_TYPE_DESC
Below is output of the query
Figure 2
We have two data pages one index pages and one IAM page. Leaf level of index has two pages 169 and 177 while root level has page 178. Now we will see how many records each page has and this can be seen using Undocumented command sys.fn_PhysLocFormatter below is the query which will help us look into records present on each page
Query 4
SELECT SYS.FN_PHYSLOCFORMATTER (%%PHYSLOC%%) AS PAGE_DETAILS, * FROM [DBO].[INDEXCOMPRESSION]
Below is output
Figure 3
You can see output has total 1000 rows returned. In above figure what is pasted is part of the total output . If you see page details column it has like 1:169:XXX and 1:177:XXX here 1 is file id and 169,177 are page Numbers and XXX is the row number. So from 1 till 733 all records are stored on page number 169 while from 734 all records are stored on page 177( Please see Figure 2 ) for page details.
Important thing to note here is page 169 has 733 rows on it while page 177 has just 267 rows
Now we will dig deeper and have look at both page 169 and page 177 using Undocumented Command DBCC PAGE
Query 5
--QUERY TO GET DETAILS ABOUT A PAGE
DBCC TRACEON (3604)
GO
DBCC PAGE (INDEXLOGGING,1,169,1)
Below is the output. Output includes a segment of a total output please drill down to have complete look.
This confirms that page 169 has data 'CompressedData' and the page is compressed. CompressedData is the value we inserted when we were inserting records in table
Now lets have a look at details of page 177 using same query
Query 6
--QUERY TO GET DETAILS ABOUT A PAGE
DBCC TRACEON (3604)
GO
DBCC PAGE (INDEXLOGGING,1,177,1)
From above figure we can say that page is compressed.
So now question arises why Sys.dm_db_index_physical_stats gave value compressed_page_count as 1 when both the pages are compressed
To dig more deeper lets create same table with 5000 records and run **Query 2 ** to see the value of compressed_page_count
As you can see now we have index with 7 pages as leaf level and 500 record is distributed on these pages and now compressed_page_count is showing value 7.
For 100 records Query 2 produced below output
For 100 records the value is zero, interesting
How page compression occurs
According to Microsoft Books Online Resource when a new table is created that has page compression, no compression occurs. However, the metadata for the table indicates that page compression should be used. As data is added to the first data page, data is row-compressed. Because the page is not full, no benefit is gained from page compression. When the page is full, the next row to be added initiates the page compression operation. The whole page is reviewed; each column is evaluated for prefix compression, and then all columns are evaluated for dictionary compression. If page compression has created enough room on the page for an additional row, the row is added, and the data is both row- and page-compressed. If the space gained by page compression minus the space that is required for the CI structure is not significant, page compression is not used for that page. Future rows either fit onto the new page or, if they do not fit, a new page is added to the table. Similar to the first page, the new page is not at first page-compressed.
The one marked in italics is interesting and we would take our research further using this line. What it says( in a simple language) that unless page is fully utilized and Microsoft SQL Server engine finds it *Fit* for compression it wont be compressed physically. This also means that a page which is not full to some predefined value wont be counted as Compressed page even though its meta data will show it is compressed. The predefined value lies with Database Engine. Below test will prove it.
We saw that each page is accommodating a maximum of 733 rows ( see figure 3) also have a look at output of Sys. fn_physlockformatter. For 1000 page test we saw that first 733 rows went to page 169(Figure 3) and rest 267 went to page 177. Now value 277 is even less that half of 733 so since page 177 was less than half filled it was not found qualified by Database Engine as candidate for being compressed so value compressed_page_count was 1. Now we would add more rows so as to prove that if more rows goes into page 177 it would be qualified to be marked as Compressed. We have 267 rows in page 177 we will add 100 more so that it becomes half of 733( total number of rows present on the page). So now we would drop the table IndexCompression and create it again but this time with 1100 rows every thing would remain same just insert query would be like
Query 7
INSERT INTO [DBO].[INDEXCOMPRESSION] DEFAULT VALUES;
GO 1100
After this we would again run Query 2 to find out value of compressed_page_count
Wow we can see now for 1100 records the value of column compressed_page_count is 2. Which means now with page 177 half full it was qualified as being Compressed.
Conclusion
We saw that the value for column compressed_page_count does not actually means number of pages compressed ALWAYS. This value is actually updated when SQL server Database engine finds out that page has enough rows to take benefit from compression. I am sure this wont be just the criteria it would also be looking at various other parameters to find out whether the page is fit to be considered as compressed.
Suggested Readings
Page Compression in SQL Server
Row Compression in SQL Server
DMV sys. dm_db_index_physical_stats
See Also
SQL Server General & Database Engine Resources on the TechNet Wiki
List Of Articles By Shanky