Performance of a Query on Computed Column can degrade in Sql 2005 with the increase in complexity of function in computed column
With Sql server 2000 and 2005 computed columns can be use to improve the performance of queries since without the computed columns, indexes cannot be used as explained in the following blog by one of our Escalation Engineer Jack Li.
However in some cases, you may observe that a query on computed column running on SQL 2005 instance is performing much slower as compared to the same query running on SQL 2000 instance. In this blog I will be explaining one of the possible cause of the slowness of query on computed column in SQL 2005 instance with the help of following example.
The following script is used to create a table with a computed column which has an index created on it.
use tempdb
GO
set ansi_nulls
, ansi_padding
, ansi_warnings
, concat_null_yields_null
, quoted_identifier on
set numeric_roundabort off
GO
if object_id ( 'dbo.test') is not null drop table dbo.test
GO
create table dbo.test
(
id int identity ( 1, 1) not null
, col char (7000) not null
, computed_col as (checksum(isnull(col, '')))
, constraint test_pk primary key clustered( id)
)
GO
declare @i int
set @i=1000
while (@i<20000)
begin
insert into test(col) values(@i);
set @i=@i+100
end
create index ix_table
on dbo.test (computed_col)
use tempdb
go
set ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, quoted_identifier, ARITHABORT ON
set NUMERIC_ROUNDABORT OFF
go
set statistics io on
set statistics profile on
set statistics time on
go
select id,col,computed_col
from dbo.test
where computed_col = 580804
set statistics io off
set statistics profile off
set statistics time off
When we run the above query on the Sql 2000 instance and SQL 2005 instance we observe that query performs slow on SQL 2005 instance as compared to SQL 2000 instance. This is because of the difference in the execution plan in SQL 2000 and SQL 2005
SQL 2000
========
Table 'test'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0.
StmtText
-------------------------------------------------------------------------------------------------------------------------
SELECT [id]=[id],[col]=[col],[computed_col]=[computed_col] FROM [dbo].[test] WHERE [computed_col]=@1
|--Compute Scalar(DEFINE:([test].[computed_col]=checksum(isnull([test].[col], Convert('')))))
|--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([tempdb].[dbo].[test]))
|--Index Seek(OBJECT:([tempdb].[dbo].[test].[ix_table]), SEEK:([test].[computed_col]=580804) ORDERED FORWARD)
SQL 2005
========
Table 'test'. Scan count 1, logical reads 192, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
StmtText
---------------------------------------------------------------------------------------------------------------------------------
SELECT [id],[col],[computed_col] FROM [dbo].[test] WHERE [computed_col]=@1
|--Compute Scalar(DEFINE:([tempdb].[dbo].[test].[computed_col]=checksum([tempdb].[dbo].[test].[col])))
|--Clustered Index Scan(OBJECT:([tempdb].[dbo].[test].[test_pk]), WHERE:(checksum([tempdb].[dbo].[test].[col])=(580804)))
As seen in the above execution plans, in sql 2000 instance the query performs an Index seek on the index on computed column and thereby causes only 3 logical reads whereas SQL 2005 performs a Clustered Index Scan neglecting the index on computed column.
The difference in the execution plan is attributed to the difference in the behavior of the Algebrizer and Optimizer in SQL 2000 and SQL 2005 instance when dealing with computed column. As seen in the above execution plan, in sql 2005 the Algebrizer expands the computed column to checksum(isnull([tempdb].[dbo].[test].[col]) which is further simplified to checksum([tempdb].[dbo].[test].[col])=(580804) in the optimization phase.
In the above example, in SQL 2005 the Algebrizer substitutes the computed_col with checksum(isnull(col, '')) and it is further simplified to checksum([tempdb].[dbo].[test].[col])=(580804) during the optimization phase. When the optimizer generates the execution plan it checks whether there is any index on checksum([tempdb].[dbo].[test].[col]) however the index exists on the checksum(isnull([tempdb].[dbo].[test].[col]))and hence the optimizer is unable to detect the index on computed columns and chooses to perform the clustered index scan.
However when we remove the isnull function from the computed column we observe that sql 2005 instance chooses the same plan as that of sql 2000 instance which is desirable.
use tempdb
GO
set ansi_nulls
, ansi_padding
, ansi_warnings
, concat_null_yields_null
, quoted_identifier on
set numeric_roundabort off
GO
if object_id ( 'dbo.test') is not null drop table dbo.test
GO
create table dbo.test
(
id int identity ( 1, 1) not null
, col char (7000) not null
, computed_col as (checksum(col))
, constraint test_pk primary key clustered( id)
)
GO
SQL 2005
=========
Table 'test1'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
StmtText
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
SELECT [id],[col],[computed_col] FROM [dbo].[test1] WHERE [computed_col]=@1
|--Compute Scalar(DEFINE:([tempdb].[dbo].[test1].[computed_col]=[tempdb].[dbo].[test1].[computed_col]))
|--Nested Loops(Inner Join, OUTER REFERENCES:([tempdb].[dbo].[test1].[id]))
|--Index Seek(OBJECT:([tempdb].[dbo].[test1].[ix_table1]), SEEK:([tempdb].[dbo].[test1].[computed_col]=(580804)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([tempdb].[dbo].[test1].[test_pk1]), SEEK:([tempdb].[dbo].[test1].[id]=[tempdb].[dbo].[test1].[id]) LOOKUP ORDERED FORWARD)
Alternatively if we remove the "not null" for the column
So the workaround for such cases is to simplify the functions used for computed columns
Parikshit Savjani
SE, Microsoft Sql Server
&
Levi Justus
Technical Lead, Microsoft Sql Server
Reviewed By
Amit Banerjee
Technical Lead, Microsoft Sql Server
Comments
Anonymous
April 14, 2009
PingBack from http://microsoft-sharepoint.simplynetdev.com/performance-of-a-query-on-computed-column-can-degrade-in-sql-2005-with-the-increase-in-complexity-of-function-in-computed-column/Anonymous
April 15, 2009
Thank you for submitting this cool story - Trackback from DotNetShoutout