SQL Server: Add Data to Table with Columnstore Index Using Partition Switching
Even though creating a columnstore index on a table makes the table non-updateable, meaning that bulk load and update DML operations on the table will fail, you can still add data to a table with a columnstore index with partition management operations. This can make your ETL process run faster because the alternative is to drop the columnstore index, do your load or update, and then rebuild the index. Since older data is normally not updated, this can result in losing the work required to index older data, and then having to redo it. The following example shows how to add data using partition switching.
/*
This example shows how to add data to a partitioned table with a columnstore index
using partition switching operations, in a way that can be repeated every time new
data needs to be added, assuming data is added once a day.
*/
use tempdb;
go
-- Clean up objects left behind from prior runs.
if object_id('Fact', 'U') is not null
drop table Fact;
if object_id('Staging', 'U') is not null
drop table Staging;
if exists (select * from sys.partition_schemes where name = 'ps')
drop partition scheme ps;
if exists (select * from sys.partition_functions where name = 'pf')
drop partition function pf;
go
-- Create a partition function with 3 partitions, and an associated partition scheme.
create partition function pf (int) as range left for values (20111012, 20111013);
create partition scheme ps as partition pf all to ([PRIMARY]);
go
-- Create a partitioned table.
create table Fact (date_key int not null, measure int)
on ps(date_key);
go
-- Add some data, leaving an empty partition on each end.
insert Fact values(20111013, 10);
go
-- Create a partition-aligned columnstore index on the table.
create nonclustered columnstore index ncci on Fact(date_key, measure);
go
-- Show the data and the partition number of each row.
select $partition.pf(date_key) as partition_number, *
from Fact;
go
-- Split the uppermost partition (which is now empty, so this is a fast, metadata-only operation).
alter partition function pf() split range (20111014);
go
-- Create a staging table with the same schema as the partitioned table,
-- to hold data to be appended to it via a partition swtich.
create table Staging(date_key int not null, measure int);
go
-- Create a constraint so that the staging table has the same bounds as the newly created partition.
alter table Staging with check add constraint bounds check (date_key > 20111013 and date_key <= 20111014) ;
go
-- Add data to the staging table.
insert Staging values (20111014, 60);
go
-- Create a columnstore on the staging table with the same columns in it as the
-- columnstore index on the target partitioned table.
create nonclustered columnstore index ncci on Staging(date_key, measure);
go
-- Switch the staging table into the proper empty partition of the target table.
-- Afterwards, an empty partition remains as the rightmost partition,
-- so it can be split later as a metadata-only operation.
declare @p int = $partition.pf(20111014);
alter table Staging switch to Fact partition @p;
go
-- Retrieve the rows in the partitioned table, showing the partition of each row.
select $partition.pf(date_key) as partition_number, *
from Fact;
go
Existing techniques and guidelines for using SQL Server table partitioning are documented in other places, and in general they still apply with columnstore indexes. Below are links to other partitioning resources:
- Partitioned Tables and Indexes in SQL Server 2005
- Partitioned Table and Index Strategies Using SQL Server 2008