Partilhar via


Predicate ordering is not guaranteed

A typical programmer may expect that the predicates are always evaluated in the order that they are specified, but this is not true in database systems.

For example, for the following clause,

 where col11 = 5 and convert(int, col2) = 100

programmers may think col11 = 5 is always evaluated first, but this is not guaranteed. Query Optimizer may choose to first evaluate the second predicate, convert(int, col2) = 100. In fact, predicates may be pushed down the query tree, and not evaluated at the same time.

Such reordering is generally benign, as the row has to satisfy both conditions to qualify. But it may make a difference if the second predicate would cause an error if the first condition is not met. The following example demonstrates this.

Here we have a table with category and value both as string columns. From business logic, we know that when category is 'ID', value always contains a string with an integer ID. Therefore, we create a view, ID_View, that only shows the rows with category 'ID', and in the view, we will convert the value column to integer, and project as ID column.

-- Create table
create table dbo.test(
id int not null primary key,
category varchar(30) not null,
value varchar(50) not null,
name varchar(30))
go

-- Populate this table with data
-- The logic is that when category is 'ID', value is a string of integer ID
insert into dbo.test
values (1, 'Text', 'Hello world', 'foo')
insert into dbo.test
values (2, 'ID', '123', 'bar')
go
 
-- Create a view that only shows the rows that has category 'ID',
-- and in that case, convert the value to an integer ID
create view dbo.ID_view as
select convert(int, value) ID, name from dbo.test
where category = 'ID'
go

-- this will show the rows where category is 'ID'
select * from dbo.ID_view
go

-- this will cause an error
select * from dbo.ID_view
where ID = 123
go

Selecting all rows from the view works perfectly fine. However, when we try to select from the view where ID is a certain value, 123, we get an runtime error:

Server: Msg 245, Level 16, State 1, Line 1
Syntax error converting the varchar value 'Hello world' to a column of data type int.

The reason is that ID = 123 is expanded into convert(int, value) = 123, and it gets evaluated before the predicate category = 'ID'. Therefore, for the first row, with category as 'Text', and value as 'Hellow world', the conversion from value to int caused the runtime error.

In this particular example, there is no advantage of evaluating which predicate first, so Query Optimizer could choose either order. In other cases, it may be more efficient to evaluate the second predicate first, and Query Optimizer has no idea if the predicate evaluation would cause a runtime error, since it doesn't know the correlation between the two columns.

Note that the error happens on SQL 2000, but the query works fine on SQL 2005 because Query Optimizer happens to choose the other order. However, there is no guarantee that it won't change again in the future.

The correct way to implement such correlation is to use a CASE expression.

create view dbo.ID_view_new as
select case when category = 'ID' then convert(int, value) else NULL end ID, name from dbo.test
where category = 'ID'
go
-- this will works fine
select * from dbo.ID_view_new
where ID = 123
go

In the view definition, even though the filter indicates all rows will have category 'ID', in the select list, we still use the CASE expression to check again and return NULL otherwise. This can help prevent the error even with predicate reordering.

In a word, the order of evaluation for predicates is never guaranteed, so application logic should not depend on such order. If a predicate depends on other predicates and may cause runtime error if other conditions are not met, application should use CASE expression to make it work in any usage.

Comments