Built-in functions for compression/decompression in SQL Server 2016

SQL Server 2016 provides built-in functions for compression and decompression:

  • COMPRESS - compresses data using the GZip algorithm and returns binary data.
  • DECOMPRESS - decompresses binary data using the GZip algorithm and returns binary data. You will need to cast the binary data to text if you originally compressed text (see the round-trip sketch below).

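A minimal round-trip sketch (the literal string is just an illustration): COMPRESS a string, DECOMPRESS it, and cast the binary result back to text:

 SELECT CAST(DECOMPRESS(COMPRESS(N'Hello, SQL Server 2016!')) AS NVARCHAR(MAX)) AS roundtrip
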
These functions use the standard GZip algorithm, so you can compress data in the application layer/client side and send the compressed data to SQL Server, or you can compress data in a SELECT query and decompress it on the client side.

In this example I will create a standard table with a binary column that will contain the compressed content:

 CREATE TABLE People (
   _id int primary key identity,
   name nvarchar(max),
   surname nvarchar(max),
   info varbinary(max)
 )

Now we can load the compressed information directly into this column.

 INSERT INTO People (name, surname, info)
 SELECT FirstName, LastName, COMPRESS(AdditionalInfo) FROM Person.Person
 
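To verify what was stored, you can inspect the compressed byte counts (a minimal check against the People table created above):

 SELECT TOP 10 _id, name, DATALENGTH(info) AS compressed_bytes
 FROM People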

We can return the compressed data directly to the client, which can decompress it, or we can decompress the data in the query:

 SELECT name, surname, DECOMPRESS(info) AS original
 FROM People

As an alternative, we can add a non-persisted computed column that decompresses the data dynamically:

 ALTER TABLE People
 ADD info_text AS CAST(DECOMPRESS(info) AS NVARCHAR(MAX))
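
Once the column is added, clients can read the decompressed text like any other column. A minimal usage sketch (the filter value is just an illustration); note that since the column is non-persisted, DECOMPRESS runs every time the column is read:

 SELECT name, surname, info_text
 FROM People
 WHERE name = N'John'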

What is the compression ratio?

You can try it yourself - just create some text, compress it, and check the ratio:

 DECLARE @text NVARCHAR(MAX) = (SELECT TOP 100 * FROM sys.all_objects FOR JSON PATH)
 SELECT DATALENGTH(@text) AS original, DATALENGTH(COMPRESS(@text)) AS compressed, 1.0 * DATALENGTH(@text) / DATALENGTH(COMPRESS(@text)) AS ratio

You can change the number of rows that are formatted as JSON to vary the size of the text.

With these functions you can choose which data should be compressed in your tables. Compression may help if you have text data, but if you store already compressed binary content (JPEG, PDF) you might not get a good compression ratio and will just spend CPU cycles.
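
To see this effect, you can compress data that is already compressed - GZip output has little redundancy left, so a second pass brings no real saving (a sketch reusing the JSON text from the ratio example above):

 -- compress once, then compress the already-compressed bytes again
 DECLARE @text NVARCHAR(MAX) = (SELECT TOP 100 * FROM sys.all_objects FOR JSON PATH)
 DECLARE @once VARBINARY(MAX) = COMPRESS(@text)
 -- compressed_twice will be roughly the same size as (or larger than) compressed_once
 SELECT DATALENGTH(@once) AS compressed_once, DATALENGTH(COMPRESS(@once)) AS compressed_twice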

Comments

  • Anonymous
    December 08, 2015
    Thanks for the tips. I noticed that COMPRESS on text below about 100 characters returns binary data that is longer than the original.

  • Anonymous
    December 08, 2015
    @jwk1 There must exist inputs that are larger after "compression" than raw. This is an information theory law that holds for all compression algorithms. It is quite common that small data is larger because of the base "overhead" of your compression scheme.

  • Anonymous
    December 10, 2015
    Wow, 13.80 Ratio in my test environment.

  • Anonymous
    December 13, 2015
    @Grzegorz  Did you write an article about that when you did it?  It would have been handy.

  • Anonymous
    December 13, 2015
    Is it compatible with Full-Text Search?

  • Anonymous
    December 13, 2015
    @Jeff Bernet maybe he did not, but I did (BTW this has been out there since 2005): search SQL Magazine for dont-be-afraid-blobs-and-clobs, or SQL Server Central for "Using CLR integration to compress BLOBs/CLOBs in SQL Server 2005".

  • Anonymous
    December 14, 2015
    @Noel Good news!  Thanks for the article!

  • Anonymous
    June 15, 2016
    Now that it's out, does anyone know if this works in the STANDARD edition of SQL Server 2016?

    • Anonymous
      June 17, 2016
      Yes, it is available in any edition and Azure SQL Database.