Changes to sp_estimate_data_compression_savings stored procedure in SQL2008R2
When you compress an object (index or table or partition), there are two components of space savings that you get. First component is fragmentaton (i.e. the original object might have been fragmented). The object gets degragmented as part of doing compression so you get some space savings. Second component is actual data compression savings. Many customers have asked the question 'Is there a way to know how much savings do I get from each of the two components?".
Interestingly, starting with SQL2008R2, you indeed have this capability. Here is one example where I create a table t1 and then simulate fragmentation by updating a column. Once the table is fragemented, I can use the sp_estimate_data_compression stored procedure to determine how much space savings I can get by defragmenting it.
drop table t1
go
-- Create the table
create table t1 (c1 int, c2 char(100), c3 varchar (300))
go
-- create a clustered index
create clustered index t1_ci on t1(c1)
-- load 10000 rows. Note, that for column C3, I am storing full-values that
-- take all 300 characters
declare @i int
select @i = 0
while (@i < 10000)
begin
insert into t1 values (@i, 'hello', replicate ('1', 300))
select @i = @i + 1
end
-- Now let us measure the average length of the row. It comes to 417 bytes.
-- This means
that there are 19 rows per page
select avg_record_size_in_bytes
from sys.dm_db_index_physical_stats (db_id('TEMPDB'),
object_id('T1'),
null, null, 'SAMPLED')
where object_name (object_id) like 't1' AND index_id =1
-- check the space usage. This shows the space usage of 4296 KB
sp_spaceused t1
-- This update will cause fragmentation because I am changing the column C3 from 300
-- chars to 50 chars. With the result, the average row length
-- will now become 167 but the SQL Server will still keep 19 rows/page which
-- leaves more that 50% unused space on each page
update t1 set c3 = REPLICATE ('2', 50)
-- check the space usage. It shows that the allocated or reserved space
-- has not changed
sp_spaceused t1
-- estimate how much space you can save by defragmentation. Note
-- here I am executing the stored procedure on an 'uncompressed' table
-- with target compression setting is 'NONE',(i.e. no compression).
exec sp_estimate_data_compression_savings 'dbo',
'T1',1, NULL, 'none'
-- This stored proc returns
object_name schema_name index_id partition_number
------------ ----------- -------- ---------------
T1 dbo 1 1
size_with_current_compression_setting(KB) size_with_requested_compression_setting(KB)
---------------------------------------- ------------------------------------------
4480 1720
-- To verify this, I rebuild the index and see the space consumption
alter index t1_ci on t1 rebuild
-- check the space usage. It shows that the table now takes 1744KB
sp_spaceused t1
In SQL2008RTM, the stored procedure did not compute the space savings if the source and target compression settings were same. In otherwords, the stored proc would have returned the following instead
size_with_current_compression_setting(KB) size_with_requested_compression_setting(KB)
---------------------------------------- ------------------------------------------
4480 4480
Interestingly, you can use it to get tell-tale signs to know if ROW compressed table already takes advantage of unicode compression in SQL2008R2. Let us say customer was using ROW compression in SQL2008 and now upgrades to SQL10.5, s/he can use to the sp_estimate_data_compression_savings stored proc by specifying target compression to be ROW as well. Note, in this case, the compression savings will include the potential savings from data defragmentation as well so one can’t easily deduce that the savings are from Unicode compressions or due to both. However, customer can get some clues by referring to the sys.dm_db_index_physical_stats DMV to see if the object in question was fragmented or not.
Thanks
Sunil