SCAN COUNT meaning in SET STATISTICS IO output
What does the Scan Count output in the SET STATISTICS IO output actually mean? I have seen multiple posts on the web regarding confusions and theories around this.
As per Books Online, Scan Count is: Number of index or table scans performed.
During the course of this post, I shall point out examples and some thumb rules which will help you understand the Scan Count value better. I looked into the code and found that the Scan Count is calculated based on the number of scans started for fetching the resultant data set.
Thumb rules
1. Scan count is 0 if the index that you are using is a unique index or clustered index on a primary key and you are seeking for only one value. Eg. WHERE Primary_Key_Column = <value>
2. Scant count is 1 when you are searching for one value using a non-unique clustered index which is defined on a non-primary key column. This is done to check for duplicate values for the key value that you are searching for. Eg. WHERE Clustered_Index_Key_Column = <value>
3. Scan count is N when N is the number of different seek/scan started towards the left or right side at the leaf level after locating a key value using the index key.
Script used to populate the data in the tables used in the examples below:
create table tbl2 (a int, b int)
declare @var int
set @var = 1
while @var <= 1000
begin
insert into tbl2 values (@var,@var+1)
set @var = @var + 1
end
create clustered index cidx_a on tbl2(a)
create index ncidx_b on tbl2(b)
create table tbl3 (a int primary key, b int)
declare @var int
set @var = 1
while @var <= 1000
begin
insert into tbl3 values (@var,@var+1)
set @var = @var + 1
end
create index ncidx_b on tbl3(b)
create table tbl4 (a int, b int)
declare @var int
set @var = 1
while @var <= 1000
begin
insert into tbl4 values (1,@var+1)
set @var = @var + 1
end
create clustered index cidx_a on tbl4(a)
create index ncidx_b on tbl4(b)
create table tbl5 (a int, b int)
declare @var int
set @var = 1
while @var <= 1000
begin
insert into tbl4 values (1,@var+1)
set @var = @var + 1
end
create unique index cidx_a on tbl5(a)
create index ncidx_b on tbl5(b)
Tbl2 has two columns a, b with no duplicate values with a clustered index on a and a non-clustered index on b.
Tbl3 has two columns a, b with a clustered index on primary key column a and a non-clustered index on b.
Tbl4 has two columns a, b with duplicate values for column a with a clustered index on a and a non-clustered index on b.
Tbl5 has two columns a, b with no duplicate values for column a with a unique non-clustered index on a and a non-clustered index on b.
Scan count is 0 if the index that you are using is a unique index or clustered index on a primary key and you are seeking for only one value
The following queries show a scan count as 0:
1. select * from tbl5 where a = 1
2. select * from tbl3 where a = 1
Output: Table 'tbl3'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
The reason for this is that you have a unique index defined on column a on tbl5 which tells the engine that there is only one value in the table that satisfies the criteria a=1. Similar logic applies, when you are using the primary key column to search for a value.
Scant count is 1 when you are searching for one value using a non-unique clustered index which is defined on a non-primary key column. This is done to check for duplicate values for the key value that you are searching for
The following queries show a scan count as 1:
1. select * from tbl2 where a between 1 and 10
2. select * from tbl3 where b between 1 and 10
3. select * from tbl4 where a between 1 and 10
4. select * from tbl3 with (index (ncidx_b)) where a between 1 and 10
Output: Table 'tbl4'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Since we are searching for values using the clustered index column and performing a scan/seek in one direction only after location the first value i.e. a = 1. This will be obvious from the query plan output of the above queries.
Scan count is N when N is the number of different seek/scan started towards the left or right side at the leaf level after locating a key value using the index key
If you look at the query below, you will find that the Scan Count is 2.
select * from tbl2 where a = 1 or a = 2
Output : Table 'tbl2'. Scan count 2, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
For all three tables, the same query above shows a Scan Count = 2. The reason can be found in the plan as it is an Ordered Forward Seek with two different key values. The more the number of OR clause entries, you will see the Scan Count increasing proportionately.
|--Clustered Index Seek(OBJECT:([AdventureWorks].[dbo].[tbl2].[cidx_a]), SEEK:([AdventureWorks].[dbo].[tbl2].[a]=(1) OR [AdventureWorks].[dbo].[tbl2].[a]=(2)) ORDERED FORWARD)
I hope that the above examples help clarify some misconceptions around Scan Count. The Scan Count is not necessarily the deterministic factor on deciding whether a query plan being used is bad or good from an IO standpoint. You should be looking at the Physical/Logical read values first when comparing the STATISTICS IO outputs.
NOTE: I have not yet covered all the different permutations and combinations for non-clustered indexes in this post. In case you have a specific question, please feel free to comment and I shall respond accordingly.
Regards,
Amit Banerjee
SEE, Microsoft SQL support
Comments
Anonymous
September 14, 2010
Thanks for the explanation! A useful sequel would be if you consider joins, and how scan count are used (for instance the inner table for a nested loop join). Possibly pretty inuitive, but I'm sure there are lurking some surprises somewhere...Anonymous
September 14, 2010
Nice explanation. Nice to see some details about it, it´s common to see index seeks that actually do some large range scan behind the cover. Just one little correction in your script, when inserting rows in tbl5, you are actually populating tbl4. :-) ThanksAnonymous
September 23, 2010
Very good post!!!Anonymous
September 23, 2010
One question though: In the Second Thumd Rule you mentioned "using a non-unique clustered index which is defined on a non-primary key column". I know that clustered index is always unique as it adds a uniquifier even if duplicate rows exist.Anonymous
September 27, 2010
I think I need to clarify "using a non-unique clustered index which is defined on a non-primary key column" a bit more. When you use clustered index on a non-primary key column, SQL will add uniquifiers but that is used for retrieving the row data. When you ask for data to be returned using "column value" = <value>, then SQL will not scan/seek further if it's using a unique index or a clustered index on a primary key to fetch the data. If the former is not true, then it will perform a scan/seek further after locating the first value satisfying the search criteria because it needs to determine if there are additional rows in the table which satisfy the filter condition.Anonymous
February 21, 2013
Great explanation,Anonymous
May 07, 2013
should we have large number of scan counts or small scan counts for performance in sql server 2012?Anonymous
August 05, 2013
nICEAnonymous
May 01, 2014
The comment has been removedAnonymous
July 17, 2014
Useful information. Thank YouAnonymous
September 01, 2014
Dear Amit, This is really an excellent post. Since long time I was searching for the right answer of the Scan Count. After reading this post, I can understand when it is 0 and when >0. Just to clarify, I was struggling to find out why the second query showing the Scan Count = 1, since both are similar. Has a filter one the Clustering Key (PK) and both are pointing to the same record and asking to return only one record. Just wanted to confirm, the second query is showing Scan Count 1 because this is satisfying your Rule number 3. Since it is returning 1 record but, predicate pattern is a Range. And that’s why it is satisfying the rule “started towards the left or right side at the leaf level after locating a key value using the index key”. Am I right. Please confirm. -- Query: 1 SELECT BusinessEntityID, FirstName, LastName, MiddleName FROM Person.Person WHERE BusinessEntityID = 1 -- Query: 2 SELECT BusinessEntityID, FirstName, LastName, MiddleName FROM Person.Person WHERE BusinessEntityID >= 1 AND BusinessEntityID <= 1 Thanks for the excellent post.Anonymous
September 01, 2014
Dear Amit, This is really an excellent post. Since long time I was searching for the right answer of the Scan Count. After reading this post, I can understand when it is 0 and when >0. Just to clarify, I was struggling to find out why the second query showing the Scan Count = 1, since both are similar. Has a filter one the Clustering Key (PK) and both are pointing to the same record and asking to return only one record. Just wanted to confirm, the second query is showing Scan Count 1 because this is satisfying your Rule number 3. Since it is returning 1 record but, predicate pattern is a Range. And that’s why it is satisfying the rule “started towards the left or right side at the leaf level after locating a key value using the index key”. Am I right. Please confirm. -- Query: 1 SELECT BusinessEntityID, FirstName, LastName, MiddleName FROM Person.Person WHERE BusinessEntityID = 1 -- Query: 2 SELECT BusinessEntityID, FirstName, LastName, MiddleName FROM Person.Person WHERE BusinessEntityID >= 1 AND BusinessEntityID <= 1 Thanks for the excellent post.Anonymous
December 24, 2014
Am I correct to notice that Table creation script for tbl5 is incorrect. Its actually inserting data in tbl4. Plus again Scant count 1. If you create table with just one column that too indentity and do select * on that table you would see scan count 1. So this is opposite to definition of scan count 1 what you mentioned. Nonetheless good postAnonymous
May 10, 2015
Good