Share via


SQL Server: Perform NOT IN and Still Get the Benefit of Batch Processing

When using columnstore indexes with SQL Server 2012, using NOT IN (<subquery>) can prevent batch mode query execution, and the query may run more slowly than if batch mode had been used for the bulk of the work of the query.

If you can modify a NOT IN query to produce the same result but do most of the work with batch processing, you can speed it up significantly. Here's an example:

-- This query runs in row mode and takes about 11 seconds on a 4-core processor 
-- when dbo.Purchase contains about 101 million rows,
-- and has a columnstore index on it that includes all columns of the table.
 
-- Find distinct MediaId values present in dbo.Purchase but not in dbo.Media.
select distinct  p.MediaId
from dbo.Purchase p
where p.MediaId not in (select MediaId from dbo.Media)
order by  p.MediaId; 
-- This equivalent query runs mostly in batch mode 
-- and takes about a third of a second. 
 with ids (MediaId) as
( 
       select distinct  p.MediaId
       from dbo.Purchase p
) 
select distinct  MediaId
from ids
where MediaId not in (select MediaId from dbo.Media)
order by  MediaId;

The common table expression (CTE) named "ids" runs in batch mode (and is thus fast). The main SELECT in the query runs in row mode. However, the main SELECT only operates on a few thousand rows, so it runs fast too.

 Return to main SQL Server columnstore index tuning page.