Condividi tramite


Partitioning… some tricks and hints to create the switch in and out table

Partitioning of tables in SQL Server is commonly used since the transparent partitioning was introduced in SQL 2005. No longer need for union all views and stuff like that, just insert, update, delete on a table without the need to remember how the table was partitioned over hundreds of single tables.

I did a lot of partitioning, in DWH scenarios (to load data in and out, to get rid of old data, effective queries with partition elimination) and in OLTP (old data out without blocking of a delete query, partition elimination).

With column store indexes in SQL 2012 you need to do partitioning to get data into the table if the columnstore index is already there. No insert, update, delete possible with activated column store index.

In my last project, an OLTP like DWH, I partitioned 25 tables, just for the purpose to get rid of old data, and to use partition elimination when querying the data and processing the OLAP cubes on top of the data.

When you want to snap a Partition out (and also into a table) you need to have a table with the exactly same structure as the partitioned table, and it must be in the same filegroup as the partition you want to work with. Most people use the ‘select top 0 * from dbo.foo into dbo.foo_copy’. This works, but try this with data compression or when your table just has calculated columns. It will fail.

So, OK, other idea (and what I used mostly in the past): I create a table dbo.PartitionConfig, which has a row for each partitioned table, and columns like ‘tablename’, ‘create_table_statement’, ‘PK_statement’, 'Index_statement’, ‘drop_statement’ and so on. With this I can set the commands configurable and in my partitioning code I just get the data from the table, nothing hardcoded. Dynamic SQL with the commands, and go. Works fine.

But with this the next problem came up: imagine you have different systems (Dev, Test1, Test2, ProdTest and Production). There will be the time, when you need to do an ‘alter table’ to a partitioned table with data. No problem, this works, but alter table adds the column at the right end of the table. If you support ‘full builds’ and ‘update builds’ in your building environment there will be a time, when the create table statement for the ‘full build’ will have a different column order than the ‘update build’ based on alter table commands. Believe me, in a project with 5 or more SQL Dev guys this will happen :-).

What is the problem with this? When switching out or in data in partitioning SQL Server does not care about the column names, so being in wrong order will be a problem. My config table was filled with a SQL script, and I was not willing to do a ‘new build’ and an ‘update build’ script just for the column order.

So this was the point to do something new. So I wrote some SP code to read the metadata from the partitioned table, do a ‘create table’ statement with this, find out the primary key, the compression state of the partition (yes, true, you can compress a partition each for its own, with difference in page or row compression). And, find out about calculated columns!

For the ease of this example I just do not use a partitioned table here, my code also works without partitions. Why this works? Well, also a not partitioned table is a table which consists of one partition, the whole table :-).

Attached you will find the Procs and Scripts to setup the Scenario, they are commented inside the code. Here are only the calls and what happens.

This is my table example:

CREATE
TABLE PartTest

(      ID INT NOT NULL,

       DayDate DATE NOT NULL,

       DayTime TIME(0) NOT NULL,

       AnyValue decimal(10,2) NOT NULL,

       AnyString VARCHAR(255) NOT NULL,

       CalcColumn AS 5 * ID,

       CONSTRAINT PK_PartTest PRIMARY KEY (ID, DayDate)

) WITH (DATA_COMPRESSION = ROW)

GO

First of all, the copy of the table already exists, so I have a procedure to drop it.

EXEC [dbo].[usp_DropSwitchOutTable] 'dbo.PartTest_Out'

Then, just create the copy using the stored procedure. The first parameter is the source table, the second one is the name of the table I want to create as a copy, the third is the filegroup in which the copy table should be created (it must be the same as the source table when switch in and out of partitions should work, as already mentioned).

EXEC [dbo].[usp_CreateSwitchOutTable] 'dbo.PartTest', 'dbo.PartTest_Out', 'PRIMARY'

Check out your database, it now has a second table, it should have the same structure as the source table and is in the filegroup you specified. How do I find out in which filegroup my
partition is, if I have a table with partitions spread over multiple groups? You find out in the metadata, just check my attached function.

dbo.usf_GetFilegroupForPartition

Parameters are Name of the partition schema, and the Partition number. How to get this number, see later in this article. How to get the partition schema? Oh, just use my attached function

dbo.usf_GetPartitionSchemaForTable

All you need is the table name in the form of schema.tablename.

Next step is to add the primary key to the copy table.

EXEC [dbo].[usp_CreatePrimaryKeySwitchOutTable] 'dbo.PartTest', 'dbo.PartTest_Out'

Check it out, there will be a PK now on the copy.

Finally, find out the compression and apply it to the copy table.

EXEC [dbo].[usp_SetSwitchOutTableCompression] 'dbo.PartTest', 'dbo.PartTest_Out', 1

Check out your copy table, it will now have row compression. The third parameter is the partition number, without partitions in the table this is 1, if you want to use a partition inside your table you must find out the number from the metadata. The easiest way to do so is the $partition metadata function, you just need to know the name of the partition function of your table, and by which criteria it is partitioned, so if it is by day, just throw in the date, and you get the Partition ID or number in which the data for that day is in. Check out https://msdn.microsoft.com/de-de/library/ms188071.aspx for the details of the function.

If you do not know the name of the partition function of your table, use my attached function.

dbo.usf_GetPartitionFunctionForTable

All you need is the table name in the form of schema.tablename.

So, this is all the magic.

Do a test on your own, just add a column to the table with alter table, and run the code again. The copy table should have the row added also. Or try to create the table in a filegroup other than PRIMARY. Or, much better, setup a partitioned table and use my code.

What to do with the created copy of the table? Well, just identity the partition you want to switch out of your partitioned table and use the alter table switch Partition statement to move the data out, drop the copy table, and the data is gone. Don’t forget to so a merge on your partitioned table afterwards to get rid of the ‘hole’ in your table.

Partitioning.zip