How to use computed columns to improve query performance
In general, you should avoid applying a scalar function on a column when comparing against another column, variable or constant.
Let's use an example. Frequently, we got cases from customers who are not aware of performance implications and do just that. If you have query like below. It poses challanges to SQL Server optimizer. It can't use index on c1 to do any seeks.
select * from t where lower(c1) = 'az'.
One natural solution is to avoid applying function lower. But what if the column is case senstive? Then the result won't be correct if you drop the 'lower' function.
there is another solution. The solution is to create another computed column on t (c2 as lower(c1)). Then create an index on c2.
now the above query will use index defined on c2 to do seeks. Below is a complete example to demonstrate the technique. There are serveral requirements on creating indexes on computed columns. See books online "Creating Indexes on Computed Columns" section for details
use tempdb
go
drop table t
go
create table t (c1 varchar(50) collate Latin1_General_CS_AS)
go
set nocount on
declare @i int
set @i = 0
while @i < 10000
begin
declare @ch varchar(10)
set @ch = cast (@i as varchar(10))
insert into t (c1) values (@ch)
set @i = @i + 1
end
go
create index indx_c1 on t(c1)
go
update statistics t with fullscan
go
set statistics profile on
go
--note that this query does table scan because of the lower function applied
select * from t where lower(c1) = '00'
go
set statistics profile off
go
--let's add a computed column and create index on this computed column
alter table t add c2 as lower(c1)
go
create index indx_c2 on t(c2)
go
set statistics profile on
go
--note that this query does index seek
select * from t where lower(c1) = '00'
go
set statistics profile off
go
Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support
Comments
Anonymous
March 08, 2009
PingBack from http://www.clickandsolve.com/?p=19692Anonymous
March 12, 2009
Hi Jack, I use that technique, but hit a problem with that in a database that was migrated from sql2000 to sql2005 (SP2 CU9). Database have few tables with similar structure:
- Identity field as clustered PK
- Several fields (names, parameters, etc), some are nullable, some are not.
- a computed field (a checksum over other fields in the table except PK). That field has only unique values. There is an index on that computed field. The table is queried with following criteria: where computedField=checksum(...) and ((fieldA is null and @inputParameterA is null) or (fieldA=@inputParameterA)) and ((fieldB is null and @inputParameterB is null) or (fieldB=@inputParameterB)) ... {all the fields required} In sql2000 the index on the computed field is used as it is supposed to be. When the database was attached to sql2005 the index is ignored in the same query, PK clustered index is scanned, performance is down of course. Statistics was updated, procedures recompiled afterwards - didn't help. No changes were made to the database during migration. Tried compatibility level 80 and 90 - didn't help. What could be a solution? Thanks, Alexander
Anonymous
March 12, 2009
UPD: The only thing that helped was to make clustered that index on the computed column. But still - I'm not comfortable to make it clustered for some reason.Anonymous
March 16, 2009
There could have been a better example to use than Lower(), as Lower() can be handled easier by using a case insensitive collation sequence.Anonymous
April 14, 2009
With Sql server 2000 and 2005 computed columns can be use to improve the performance of queries sinceAnonymous
May 12, 2009
Sample does not work with SQL 2000?