Built-in functions for compression/decompression in SQL Server 2016
SQL Server 2016 provides built-in functions for compression and decompression:
- COMPRESS - compresses data using the GZip algorithm and returns binary data.
- DECOMPRESS - decompresses binary data using the GZip algorithm and returns binary data. You will need to cast the binary result back to text if you originally compressed text.
These functions use the standard GZip algorithm, so you can compress data in the application layer/client side and send the compressed data to SQL Server, or you can compress data in a SELECT query and decompress it on the client side.
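As a minimal sketch of the round trip (the sample text here is just an illustration), you can compress a value and cast the decompressed result back to the original text type:

DECLARE @original NVARCHAR(MAX) = N'Some sample text worth compressing';
DECLARE @compressed VARBINARY(MAX) = COMPRESS(@original);

-- DECOMPRESS returns VARBINARY(MAX), so cast it back to NVARCHAR(MAX)
SELECT CAST(DECOMPRESS(@compressed) AS NVARCHAR(MAX)) AS roundtrip;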
In this example I will create a standard table with a binary column that will contain the compressed content:
CREATE TABLE People (
    _id int primary key identity,
    name nvarchar(max),
    surname nvarchar(max),
    info varbinary(max)
)
Now, we can directly load compressed information into this column.
INSERT INTO People (name, surname, info)
-- AdditionalContactInfo is an XML column in the AdventureWorks Person.Person table,
-- so it must be cast to a string type before it can be compressed
SELECT FirstName, LastName, COMPRESS(CAST(AdditionalContactInfo AS NVARCHAR(MAX)))
FROM Person.Person
We can return the compressed data directly to a client that can decompress it, or we can decompress the data in the query:
SELECT name, surname, DECOMPRESS(info) AS original
FROM People
As an alternative, we can add a non-persisted computed column that dynamically decompresses the data:
ALTER TABLE People
ADD info_text as CAST( DECOMPRESS(info) AS NVARCHAR(MAX))
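Once the computed column exists, queries can work with the decompressed text as if it were stored in plain form (the LIKE filter below is only a hypothetical example):

-- info_text is decompressed on the fly for each row that is read
SELECT name, surname, info_text
FROM People
WHERE info_text LIKE N'%manager%'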
What is the compression ratio?
You can try it yourself - just create some text, compress it, and check the ratio:
declare @text nvarchar(max) = (select top 100 * from sys.all_objects for json path)

select DATALENGTH(@text) AS original,
       DATALENGTH(COMPRESS(@text)) AS compressed,
       1.0 * DATALENGTH(@text) / DATALENGTH(COMPRESS(@text)) AS ratio
You can change the number of rows that are formatted as JSON to change the size of the text.
With these functions you can choose which data should be compressed in your tables. Compression may help if you have text data, but if you store already compressed binary content (JPEG, PDF) the compression ratio will be poor and you will just waste CPU cycles.
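One quick way to see this, reusing the JSON sample from above, is to compress data that is already compressed - the second pass typically saves nothing:

declare @text nvarchar(max) = (select top 100 * from sys.all_objects for json path)
declare @once varbinary(max) = COMPRESS(@text)

-- GZip output has high entropy, so expect compressed_twice to be about the same size or slightly larger
select DATALENGTH(@once) AS compressed_once,
       DATALENGTH(COMPRESS(@once)) AS compressed_twice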
Comments
Anonymous
December 08, 2015
Thanks for the tips. I noticed that COMPRESS on text below about 100 characters returns binary data that is longer.
Anonymous
December 08, 2015
@jwk1 There must exist inputs that are larger after "compression" than raw. This is an information theory law that holds for all compression algorithms. It is quite common that small data is larger because of the base "overhead" of your compression scheme.
Anonymous
December 10, 2015
Wow, 13.80 ratio in my test environment.
Anonymous
December 13, 2015
@Grzegorz Did you write an article about that when you did it? It would have been handy.
Anonymous
December 13, 2015
Is it compatible with Full-Text Search?
Anonymous
December 13, 2015
@Jeff Bernet maybe he did not, but I did (BTW this has been out there since 2005): search SQL Magazine for dont-be-afraid-blobs-and-clobs or SQL Server Central for "Using CLR integration to compress BLOBs/CLOBs in SQL Server 2005".
Anonymous
December 14, 2015
@Noel Good news! Thanks for the article!
Anonymous
June 15, 2016
Now that it's out, does anyone know if this works in the STANDARD edition of SQL Server 2016?
Anonymous
June 17, 2016
Yes, it is available in any edition and Azure SQL Database.