SQL Server: Add Data to Table with Columnstore Index Using Partition Switching

Article
1/17/2024

Even though creating a columnstore index on a table makes the table non-updateable, meaning that bulk load and update DML operations on the table will fail, you can still add data to a table with a columnstore index with partition management operations. This can make your ETL process run faster because the alternative is to drop the columnstore index, do your load or update, and then rebuild the index. Since older data is normally not updated, this can result in losing the work required to index older data, and then having to redo it. The following example shows how to add data using partition switching.

/*
 
This example shows how to  add data to a partitioned table with  a columnstore index
using partition switching operations, in a way that can be repeated every time  new 
data needs to  be added, assuming data is added once a day. 
*/ 
use tempdb;
go 
-- Clean up objects left behind from prior runs. 
if object_id('Fact', 'U') is not null
       drop table  Fact;
if object_id('Staging', 'U') is not null
       drop table  Staging;
if exists (select * from sys.partition_schemes where name  = 'ps')
       drop partition scheme ps;
if exists (select * from sys.partition_functions where name  = 'pf')
       drop partition function pf;
go 
-- Create a partition function with 3 partitions, and an associated partition scheme. 
create partition function pf (int) as  range left  for values  (20111012, 20111013);
create partition scheme ps as partition pf all to  ([PRIMARY]);
go 
-- Create a partitioned table. 
create table  Fact (date_key int  not null, measure int)
on ps(date_key);
go 
-- Add some data, leaving an empty partition on each end. 
insert Fact values(20111013, 10);
go 
-- Create a partition-aligned columnstore index on the table. 
create nonclustered columnstore index ncci on Fact(date_key, measure);
go 
-- Show the data and the partition number of each row. 
select $partition.pf(date_key) as partition_number, *
from Fact;
go 
-- Split the uppermost partition (which is now empty, so this is a fast, metadata-only operation). 
alter partition function pf() split range (20111014);
go 
-- Create a staging table with the same schema as the partitioned table, 
-- to hold data to be appended to it via a partition swtich. 
create table  Staging(date_key int  not null, measure int);
go 
-- Create a constraint so that the staging table has the same bounds as the newly created partition. 
alter table  Staging with  check add  constraint bounds check (date_key > 20111013 and date_key <= 20111014) ;
go 
-- Add data to the staging table. 
insert Staging values (20111014, 60);
go 
-- Create a columnstore on the staging table with the same columns in it as the 
-- columnstore index on the target partitioned table. 
create nonclustered columnstore index ncci on Staging(date_key, measure);
go 
-- Switch the staging table into the proper empty partition of the target table. 
-- Afterwards, an empty partition remains as the rightmost partition, 
-- so it can be split later as a metadata-only operation. 
declare @p int = $partition.pf(20111014);
alter table  Staging switch to  Fact partition @p;
go 
-- Retrieve the rows in the partitioned table, showing the partition of each row. 
select $partition.pf(date_key) as partition_number, *
from Fact;
go

Existing techniques and guidelines for using SQL Server table partitioning are documented in other places, and in general they still apply with columnstore indexes. Below are links to other partitioning resources:

Share via

SQL Server: Add Data to Table with Columnstore Index Using Partition Switching

Additional resources