Partial results in a query of a clustered columnstore index in SQL Server 2014
As part of continuous improvement of the quality of SQL Server, we add new tests to stress the features in various ways. We recently discovered a corruption issue in clustered columnstore index in SQL Server 2014 as part of new stress tests. After analyzing the root cause, we have concluded that this corruption is unlikely to happen in customer workloads. However, we are releasing a hotfix to detect and prevent future corruptions. This issue does not impact non-clustered columnstore index either in SQL Server 2012 and SQL Server 2014.
How can I find out if I am impacted by corruption before applying the fix?
You can run the following query:
SELECT Object_name(i.object_id) AS table_name,
i.NAME AS index_name,
p.partition_number,
Count(DISTINCT s.segment_id) AS damaged_rowgroups
FROM sys.indexes i
JOIN sys.partitions p
ON p.object_id = i.object_id
AND p.index_id = i.index_id
JOIN sys.column_store_row_groups g
ON g.object_id = i.object_id
AND g.index_id = i.index_id
AND g.partition_number = p.partition_number
JOIN sys.column_store_segments s
ON s.partition_id = p.partition_id
AND s.segment_id = g.row_group_id
WHERE i.type IN ( 5, 6 )
AND s.secondary_dictionary_id <> -1
AND g.state_description = 'COMPRESSED'
AND s.secondary_dictionary_id NOT IN
(SELECT dictionary_id
FROM sys.column_store_dictionaries d
WHERE d.hobt_id = p.hobt_id
AND d.column_id = s.column_id)
GROUP BY Object_name(i.object_id),
i.NAME,
p.partition_number
How does corruption occur and what is the impact: Please refer to KB 3067257 for the detailed reason. You can see that it is a rare scenario. Note, if your clustered columnstore index is corrupt, the queries will continue to run successfully but may skip some rows.
What happens when I apply the hotfix: For most customers, there will be no impact as in all likelihood, your columnstore index is not corrupt. However, under rare situation when the columnstore index is indeed corrupt, the queries accessing the corrupt data will now start failing unlike before the fix when the queries were just skipping the corrupt data. You can also detect corruption by running DBCC CHECKTABLE command that has been enhanced to detect it.
What do I do if my columnstore index is corrupt: As a first step, you can use the TF-10207 to let the queries run successfully while you work to resolve the corruption. If in your scenario, you have a copy of the data available externally either in operational store(s) or external file(s), you can drop and recreate the clustered columnstore index and then load the data.
If there is no copy of the data available, then unfortunately, this is a data loss situation. You can't really use database/log backups to recover but in some cases, you will be able to minimize the data loss. For example, if you have determined that only the data loaded since yesterday is corrupt, you could do a point in time restore to yesterday and then load the data from today.
As a last resort, you can use the TF-10207 to export the data out of columnstore into an external file(s) or table and then drop/recreate the clustered columnstore index and then load the data. You will lose the data that was corrupted.