Partition Elimination in SQL Server 2005
Lubor Kollar
Partition elimination is very important when SQL Server executes queries against partitioned tables or partitioned views. In general, SQL Server is doing an excellent job of not scanning the partitions that are excluded by some predicates. Recently we have discovered one scenario where the partition elimination does not work against partitioned tables in SQL Server 2005 and this blog describes the conditions leading to the problem as well as easy workarounds. Additionally you will learn how to discover if partition elimination works for your query or not. You will also learn what is static and dynamic partition elimination.
The most reliable way to find out if partition elimination happens in your query is to use the SET STATISTICS PROFILE ON command, run the query and investigate the output. But let me start with building our example table:
create partition function PF1 (int) as range for values (100,200,300,400);
create partition scheme PS1 as partition PF1 all to ([PRIMARY]);
go
create table t1 (a int, b int) on PS1 (a);
go
declare @i int;
set @i=1;
set nocount on;
while (@i<22)
begin;
insert into t1 values (20*@i, @i);
set @i=@i+1;
end;
go
The following query shows distribution of all rows in table t1 across the five partitions:
select $partition.PF1(a) [Partition Number], a, b from t1
Partition Number a b
1 20 1
1 40 2
1 60 3
1 80 4
1 100 5
2 120 6
2 140 7
2 160 8
2 180 9
2 200 10
3 220 11
3 240 12
3 260 13
3 280 14
3 300 15
4 320 16
4 340 17
4 360 18
4 380 19
4 400 20
5 420 21
First, I will show 5 examples how partition elimination works correctly in SQL Server 2005 and I will explain the difference between the static and dynamic partition elimination. Here is a small batch and we will investigate the output later below
set statistics profile on;
declare @i1 int;
declare @i2 int;
set @i1=50;
set @i2=250 ;
select * from t1 where a<50 or a>450; -- (Q1) only two partitions are scanned
select * from t1 where a in (50,450); -- (Q2) only two partitions are scanned
select * from t1 where a<@i2 and a>100; -- (Q3) only two partitions are scanned
select * from t1 where a=100;-- (Q4) only one partition is scanned - static partition elimination
select * from t1 where a=@i2; -- (Q5) only one partition is scanned - dynamic partition elimination
set statistics profile off;
You will see the result set followed by the showplan with the columns “Rows” and “Executes” in front of it for each of the four queries above. For the query Q1
select * from t1 where a<50 or a>450
the showplan output is
Rows Executes StmtText
2 1 select * from t1 where a<50 or a>450; -- (Q1) only two partitions
2 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PAR
2 1 |--Constant Scan(VALUES:(((1)),((5))))
2 2 |--Table Scan(OBJECT:([test].[dbo].[t1]), WHERE:([test].[d
The scan of the partitioned table becomes a Nested Loops looping over the partitions. And we see already in the Constant Scan that we will be scanning only partitions 1 and 5. The “Executes” value 2 below confirms we did 2 scans of an individual partition.
The IN predicate “a in (50,450)” in the Q2 is turned into “ a = 50 OR a = 450”, and SQL Server will access only the two partitions, 1, and 5, containing all qualifying rows
Rows Executes StmtText
0 1 select * from t1 where a in (50,450); -- (Q2) only two partitions are sc
0 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PAR
2 1 |--Constant Scan(VALUES:(((1)),((5))))
0 2 |--Table Scan(OBJECT:([test].[dbo].[t1]), WHERE:([test].[dbo].[t
The second query, Q3,
select * from t1 where a<@i2 and a>100
yields
Rows Executes StmtText
7 1 select * from t1 where a<@i2 and a>100; -- (Q3) only two partitions
7 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PAR
2 1 |--Filter(WHERE:([PtnIds1004]<=RangePartitionNew([@i2],(0)
4 1 | |--Constant Scan(VALUES:(((2)),((3)),((4)),((5))))
7 2 |--Table Scan(OBJECT:([test].[dbo].[t1]), WHERE:([test].[d
In the above plan we see that we have statically eliminated the partition 1 from the Constant Scan because of the a>100 predicate and we are using the predicate a<@i2 to potentially eliminate more partitions using the Filter above the Constant Scan. The later is dynamic elimination because it depends on the run t
Comments
Anonymous
March 15, 2006
The comment has been removedAnonymous
March 23, 2006
great article, but what puzzeles me is
i have manged to create the following repro that demonstrates a partition elimination problem with 'OR'
and there is NO paramtrization in the queries ,this repro is from our production environment
and the partitioned table actually contains 100 million rows ,but the query plans are the same even with empty table,the repro works both on RTM and SP1 MARCH CTP
/****** Object: PartitionFunction [YearRange] Script Date: 03/23/2006 16:36:25 /
CREATE PARTITION FUNCTION YearRange
AS RANGE RIGHT FOR VALUES (1995, 1996, 1997, 1998, 1999, 2000,
2001, 2002, 2003, 2004, 2005, 2006, 2007)
GO
/ Object: PartitionScheme [YEARS_PS_SHR] Script Date: 03/23/2006 16:36:01 /
CREATE PARTITION SCHEME [YEARS_PS_SHR]
AS PARTITION [YearRange]
ALL TO ([PRIMARY])
GO
/ Object: Table [dbo].[bth_fact] Script Date: 03/23/2006 16:39:36 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[bth_fact](
[rowid] [bigint] NOT NULL,
[teudat_zehut_src] varchar COLLATE Hebrew_CI_AS NOT NULL,
[teudat_zehut] varchar(10) NOT NULL,
[kod_beit_ch] [int] NOT NULL,
[kod_kupa] [tinyint] NOT NULL,
[kod_peula_bdika] varchar COLLATE Hebrew_CI_AS NOT NULL,
[kod_peula_old] varchar COLLATE Hebrew_CI_AS NULL,
[kod_sug_sherut] [tinyint] NOT NULL,
[kod_machl_mirp] [int] NOT NULL,
[shnat_chiuv]smallint NOT NULL,
[chodesh_chiuv] tinyint NOT NULL,
[rivon_chiuv] tinyint NOT NULL,
[date_knisa] [smalldatetime] NOT NULL,
[date_shichrur] [smalldatetime] NULL,
[kmt_yamim_tipulim] [smallint] NOT NULL,
[alut] [decimal](11, 2) NOT NULL,
[tofes17] varchar COLLATE Hebrew_CI_AS NULL,
[mispar_ishp_bikur] [bigint] NOT NULL,
[kod_machl_mirp_src] varchar COLLATE Hebrew_CI_AS NULL,
[zihui_ishpuz_bikur] [bigint] NOT NULL,
[kod_sug_kabala] [tinyint] NULL,
[kod_sug_knisa] [tinyint] NULL,
[kod_sug_shichrur] [tinyint] NOT NULL,
[kod_sug_chole] char COLLATE Hebrew_CI_AS NOT NULL,
[kod_sug_cheshbon] [tinyint] NOT NULL,
[kod_kart_taktziv] [tinyint] NOT NULL,
[kod_mosad_memamen] [smallint] NOT NULL,
[gil] [smallint] NOT NULL,
[kod_machoz_beit_ch] tinyint NOT NULL,
[kod_seif_tkz_new] [int] NOT NULL,
[kod_seif_taktzivi] [int] NOT NULL,
[alut_hanacha] [decimal](11, 2) NOT NULL,
[mispar_peul_diff] [smallint] NOT NULL,
[hist_gk] [int] NOT NULL,
[shem_prati] varchar COLLATE Hebrew_CI_AS NULL,
[shem_mishpacha] varchar COLLATE Hebrew_CI_AS NULL,
[kod_kart_tkz_new] [tinyint] NOT NULL,
[kod_machoz_mevu] tinyint NOT NULL,
[kod_peula_bdk_src] varchar COLLATE Hebrew_CI_AS NULL,
[kod_sug_tz] [tinyint] NOT NULL,
[sw_global] char COLLATE Hebrew_CI_AS NOT NULL,
[minun] [int] NULL,
[peula_mb_gk] [int] NOT NULL,
[kod_bth_simul] [int] NOT NULL,
[kod_source] [tinyint] NOT NULL,
[shnat_knisa] [smallint] NOT NULL,
[kod_machoz_memamen] [tinyint] NOT NULL,
[kod_makor] [tinyint] NOT NULL,
CONSTRAINT [PK_bth_fact] PRIMARY KEY NONCLUSTERED
(
[shnat_chiuv] ASC,
[chodesh_chiuv] ASC,
[rowid] ASC
)WITH (IGNORE_DUP_KEY = OFF) ON YEARS_PS_SHR
) ON YEARS_PS_SHR
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[bth_fact] WITH CHECK ADD CONSTRAINT [CK_bth_fact_KODS]
CHECK (([kod_source]=(1)))
-------------------
---QUERY:
---1.good plan
SET STATISTICS PROFILE ON
select teudat_zehut,alut_hanacha
from
dbo.bth_fact bth
where
bth.shnat_chiuv in (2003,2005)
---showplan outpt
0 1 select teudat_zehut,alut_hanacha from dbo.bth_fact bth where bth.shnat_chiuv in (2003,2005) 1 1 0 NULL NULL NULL NULL 1 NULL NULL NULL 0.006573417 NULL NULL SELECT 0 NULL
0 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1003]) PARTITION ID:([PtnIds1003])) 1 2 1 Nested Loops Inner Join OUTER REFERENCES:([PtnIds1003]) PARTITION ID:([PtnIds1003]) NULL 1 0 4.18E-06 25 0.006573417 [bth].[teudat_zehut], [bth].[alut_hanacha] NULL PLAN_ROW 0 1
2 1 |--Constant Scan(VALUES:(((10)),((12)))) 1 3 2 Constant Scan Constant Scan VALUES:(((10)),((12))) NULL 2 0 2.157E-06 11 2.157E-06 [PtnIds1003] NULL PLAN_ROW 0 1
0 2 |--Table Scan(OBJECT:([AdventureWorks].[dbo].[bth_fact] AS [bth]), WHERE:([AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2003) OR [AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2005)) PARTITION ID:([PtnIds1003])) 1 6 2 Table Scan Table Scan OBJECT:([AdventureWorks].[dbo].[bth_fact] AS [bth]), WHERE:([AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2003) OR [AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2005)) PARTITION ID:([PtnIds1003]) [bth].[teudat_zehut], [bth].[shnat_chiuv], [bth].[alut_hanacha] 1 0.003125 0.0001581 27 0.0034412 [bth].[teudat_zehut], [bth].[shnat_chiuv], [bth].[alut_hanacha] NULL PLAN_ROW 0 2
---2. bad plan
select teudat_zehut,alut_hanacha
from
dbo.bth_fact bth
where
(bth.shnat_chiuv =2003 and bth.chodesh_chiuv in (9,10,11))
or (bth.shnat_chiuv = 2005 )
---showplan outpt
0 1 select teudat_zehut,alut_hanacha from dbo.bth_fact bth where (bth.shnat_chiuv =2003 and bth.chodesh_chiuv in (9,10,11)) or (bth.shnat_chiuv = 2005 ) 1 1 0 NULL NULL NULL NULL 1 NULL NULL NULL 0.04802274 NULL NULL SELECT 0 NULL
0 1 |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1003]) PARTITION ID:([PtnIds1003])) 1 2 1 Nested Loops Inner Join OUTER REFERENCES:([PtnIds1003]) PARTITION ID:([PtnIds1003]) NULL 1 0 4.18E-06 28 0.04802274 [bth].[teudat_zehut], [bth].[alut_hanacha] NULL PLAN_ROW 0 1
14 1 |--Constant Scan(VALUES:(((1)),((2)),((3)),((4)),((5)),((6)),((7)),((8)),((9)),((10)),((11)),((12)),((13)),((14)))) 1 3 2 Constant Scan Constant Scan VALUES:(((1)),((2)),((3)),((4)),((5)),((6)),((7)),((8)),((9)),((10)),((11)),((12)),((13)),((14))) NULL 14 0 1.4157E-05 11 1.4157E-05 [PtnIds1003] NULL PLAN_ROW 0 1
0 14 |--Table Scan(OBJECT:([AdventureWorks].[dbo].[bth_fact] AS [bth]), WHERE:([AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2003) AND ([AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(9) OR [AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(10) OR [AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(11)) OR [AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2005)) PARTITION ID:([PtnIds1003])) 1 18 2 Table Scan Table Scan OBJECT:([AdventureWorks].[dbo].[bth_fact] AS [bth]), WHERE:([AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2003) AND ([AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(9) OR [AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(10) OR [AdventureWorks].[dbo].[bth_fact].[chodesh_chiuv] as [bth].[chodesh_chiuv]=(11)) OR [AdventureWorks].[dbo].[bth_fact].[shnat_chiuv] as [bth].[shnat_chiuv]=(2005)) PARTITION ID:([PtnIds1003]) [bth].[teudat_zehut], [bth].[alut_hanacha] 1 0.003125 0.0003038857 28 0.0073794 [bth].[teudat_zehut], [bth].[alut_hanacha] NULL PLAN_ROW 0 14
Any Idea why tis happens ?
Danny Ravid
Ness Technologies
danny.ravid@ness.comAnonymous
March 26, 2006
The comment has been removedAnonymous
March 26, 2006
Answer to Danny:
Danny, thanks a lot for all the details. You are correct - this is another case where SQL Server 2005 does not perform the partition elimination. Luckily the same UNION ALL trick described in my blog works again. So please rewrite your quey as follows:
select teudat_zehut,alut_hanacha
from
dbo.bth_fact bth
where
(bth.shnat_chiuv =2003 and bth.chodesh_chiuv in (9,10,11))
UNION ALL
select teudat_zehut,alut_hanacha
from
dbo.bth_fact bth
where
(bth.shnat_chiuv = 2005 )Anonymous
November 27, 2006
In this post, I’m going to take a look at how query plans involving partitioned tables work. Note thatAnonymous
February 07, 2007
Hi, Here in Brazil I´m working on MS Gold Certified Partner, and I act in two distinct teams - Data...Anonymous
April 18, 2007
Hi, Great aritical, though do you happen to know why the optimizer it NOT grabbing the correct partition when a @varible is used verses a direct value. funny thing is the plan doesn't change between estimated verses actual( you would think it would pick up the correct partition after the exection) in addtional if this actually by design out can you verify that the optimizer really is pulling the correct partition.Anonymous
April 18, 2007
The comment has been removedAnonymous
October 03, 2007
Lubor: declare @i1 int; set statistics profile on; set @i1=120; select * from t1 where a < @i1; Any reason why the the “Executes” value is 2 here, and not 1? 120 is the first value in Partition#2, thus <120 should only be Partition#1... Thanks for the post, it has provided some much needed direction with Partitioned Tables. ....Anonymous
October 04, 2007
I'm not sure what partition function you have in mind... In the one I have in the blog the first boundary value is 100 thus precicate <120 must scan the first two partitions. Or post your Partition Function, Schema, and table to see what you exacly mean.Anonymous
October 04, 2007
Ok, I see it now. Its the boundary, now the rows which satisfy. Partition Number a b 1 100 5 2 120 6 I was thinking (errantly) thinking that because there were only rows which satisfied the <120 condition, it would only scan Part1. But now I see it has to scan Part2 for 101-119 as well (My conditition overlapping with the boundary definition). Thanks for the followup.Anonymous
October 04, 2007
I think it would be interested to see the break down of these types of situations: set statistics profile on; declare @holder table ( hValue int ) insert into @holder (hValue) values ( 20 ) insert into @holder (hValue) values ( 40 ) insert into @holder (hValue) values ( 60 ) --Derived Table Example select a, b from t1 join ( select hValue as HV from @holder h ) as derived1 on t1.a = derived1.HV --EXISTS Example select a, b from t1 tOne where EXISTS ( select null from @holder h where h.hValue = tOne.a) I'm still using your examples to learn how to fine-tune my eye for partition elimination.Anonymous
October 04, 2007
Man, what would be nice is something like: set partitions profile on; --<this does not exist And then it would just show you: Table PartitionTouches t1 1 Because it gets cumbersome/confusing sometimes. I guess I can dream. But thank you again for the post...I'd be lost/frustrated without it.Anonymous
January 15, 2008
Folks, I have VLDB on SQL2000. Our primary table contains slightly less than 1B records, and I am anticipating 450M records per year. We have implemented distributed view on 2000, and it works reasonably well. As we are migrating to SQL2005, I am not certain of the maturity of partitioning to migrate to that architecture, as well as having everything on a single server. My first reaction is to architect a hybrid solution, where I would use partitioning on the local servers, and then have a distributed view on top of all the partitions. Any thoughts would be appreciated.Anonymous
January 25, 2008
Hi, I'm Jungsun Kim, SQL Server MVP in Korea. This is good article, I think. (of course, too old. ^^) I would like to introduce this article to my buddies. so I just translated to Korean and posted in my blog, "http://blog.naver.com/visualdb/50027176087" Thank you.Anonymous
January 25, 2008
Answer to clientserverarch: Your approach combining the partitioned tables and partitioned view will work. Before you implement the solution make sure you know WHY you want to partition. Table partitioning is usually introduced for manageability reasons, ability to load data w/out affecting the table access and ability to delete whole partition very fast. Sometimes users parition to enable index rebuild on a single partition or break down the backup/restore process. You should start with clear idea what you want to use it for and go from there. I would also first test it w/out the partitioned view to make sure that you are getting what you hoped for.Anonymous
January 25, 2008
Jungsun Kim, thanks a lot for translating to Korean. It is not late at all and we have many customers in Korea - here you can see some: http://www.microsoft.com/casestudies/search.aspx?Keywords=korea&ProTaxID=1273Anonymous
January 28, 2008
LuborK, Thanks for your reply. I am planning to distribute the data using both methods, separating the data out by server (perhaps 1-3 servers a year), and further separate out by month into different partitions. I am concerned that MS is going to remove the distributed view functionality in future MSSQL releases, and then I will be back to consolidating data. Thoughts?Anonymous
January 28, 2008
There is currently no plan to remove the support for distributed partitioned views in the future releases. That said I cannot ensure it will be supported forever, but similarly I cannot say it about SELECT statement :-).Anonymous
June 20, 2008
Is there a plan to release a patch for this issue for SQL Server 2005 or it will be fixed in SQL Server 2008 only? I hope Microsoft is planning to address this issue for SQL Server 2005 because partition elimination not working is the only reason we are not using partitioning in our environment. Unfortunately we cannot use the workarounds suggested in this article because we do not have control over the structure of ad hoc queries generated in our environment.Anonymous
June 23, 2008
Kangana, it would be great if you could let us know what kind of queries you are looking for to eliminate partitions. Also please let us know what are the partitioning columns, clustered indexes. Then I can answer your question.Anonymous
June 25, 2008
Thanks for the response Lubor. We have a data warehouse environment. The query in its simplest form is: SELECT ColA from FactTableA WHERE ColA in (select ColB from DimensionTableB where ColC = somevalue ) FactTableA is partitioned on ColA but partition elimination does not work. ColA and ColB are data type int and the subquery returns list of integers. ColC is data type date. FactTableA has a identity int column as its primary and clustered index key. There is a nonclustered index on ColA. This query is generated through a third-party reporting tool Microstrategy. User inputs value of 'somevalue' in a report, SQL query is generated by the reporting tool and then sent to the SQL Server thus resulting in no control over structure of query.Anonymous
June 25, 2008
Sorry, query would be more like: SELECT ColA, AmountA, AmountB, AmountC from FactTableA WHERE ColA in (select ColB from DimensionTableB where ColC = somevalue )Anonymous
June 29, 2008
I'm not sure I understand how is your table decleared (with partitioning, uniquness, indexes, etc.). If ColA is the partitioned column and some other column is unique key, then there must be some other index on the unique column that is not partitioned on ColA. If I ignore the above uncertainity and I only assume that the table is partitioned on ColA then we can use several different plans for this query - some of them visiting all partitions some of them not. The following things would encourage using index lookup into the fact table thus avoiding searching all partitions:
- create index on ColA and name the additional columns AmountA, AmountB, AmountC as INCLUDE columns in the index OR
- create clustered index on ColA But in both cases optimizer may still not choose the "seeks" if it assumes that there exist many rows qualified by the subquery (select ColB from DimensionTableB where ColC = somevalue )in the fact table.
- Anonymous
June 30, 2008
The comment has been removed