Detecting Overlapping Indexes in SQL Server 2005
When SQL Server has an optimal index that satisfies the search predicates of a query the optimizer performs an index SEEK operation as opposed to an index (or table) scan to retrieve the required rows; this is desirable. Based on this, one may be led to believe that having an index for every possible query predicate set would result in all the queries executing optimally. While true, one has to keep in mind that the indexes need to be maintained when the underlying table data in the column included in the index changes, which amounts to overhead for the database engine. So as you may guess, there are advantages of having indexes, but having too many can result in excessive overhead. This implies that you need to carefully evaluate the pros and cons before creating indexes.
In your first evaluation scenario you clearly want to avoid the case of having overlapping indexes as there is no additional value that an overlapping index provides. For example, consider a table ‘TabA’ and its three associated indexes created with the following definitions.
CREATE TABLE TabA
( Col1 INT, Col2 INT, Col3 INT, Col4 INT );
GO
CREATE INDEX idx1 ON TabA ( Col1, Col2, Col3 );
CREATE INDEX idx2 ON TabA ( Col1, Col2 );
CREATE INDEX idx3 ON TabA ( Col1 DESC, Col2 DESC );
GO
In the table structure above, the index idx1 is a superset (overlap) of the index idx2 and therefore redundant. As can be expected any query that needs to perform a search on Col1 and Col2 could use index idx1 just as well as the index idx2 as seen in the graphical query plan below.
Such overlapping indexes are often a result of multiple developers working on the same product and not evaluating and understanding the existing schema before making additions. Once created, detecting such overlapping indexes in a database can often be a laborious task requiring detailed analysis. More importantly, most DBAs do not like to disable or drop indexes because they are not certain of the queries they were created to help with and fear the side effects the action may have.
The script below uses the new system catalog introduced in SQL Server 2005 to report all duplicate indexes in the current database context.
CREATE FUNCTION dbo.INDEX_COL_PROPERTIES (@TabName nvarchar(128), @IndexId INT, @ColId INT)
RETURNS INT
WITH EXECUTE AS CALLER
AS
BEGIN
DECLARE @IsDescending INT;
SELECT @IsDescending = is_descending_key
FROM sys.index_columns SYSIDXCOLS
WHERE OBJECT_ID(@TabName) = SYSIDXCOLS.object_id
AND @IndexId = SYSIDXCOLS.index_id
AND @ColId = SYSIDXCOLS.key_ordinal;
-- Return the value of @IsDescending as the property
RETURN(@IsDescending);
END;
GO
-- Find Duplicate Indexes in SQL Server Database
CREATE VIEW IndexList_VW AS
SELECT
SYSSCH.[name] AS SchemaName,
SYSOBJ.[name] AS TableName,
SYSIDX.[name] AS IndexName,
SYSIDX.[is_unique] AS IndexIsUnique,
SYSIDX.[type_desc] AS IndexType,
SYSIDX.[is_disabled] AS IsDisabled,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 1 ) AS Column1,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 2 ) AS Column2,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 3 ) AS Column3,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 4 ) AS Column4,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 5 ) AS Column5,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 6 ) AS Column6,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 7 ) AS Column7,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 8 ) AS Column8,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 9 ) AS Column9,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 10 ) AS Column10,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 11 ) AS Column11,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 12 ) AS Column12,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 13 ) AS Column13,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 14 ) AS Column14,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 15 ) AS Column15,
INDEX_COL( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 16 ) AS Column16,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 1 ) AS Column1_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 2 ) AS Column2_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 3 ) AS Column3_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 4 ) AS Column4_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 5 ) AS Column5_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 6 ) AS Column6_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 7 ) AS Column7_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 8 ) AS Column8_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 9 ) AS Column9_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 10 ) AS Column10_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 11 ) AS Column11_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 12 ) AS Column12_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 13 ) AS Column13_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 14 ) AS Column14_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 15 ) AS Column15_Prop,
dbo.INDEX_COL_PROPERTIES( SCHEMA_NAME(SYSOBJ.SCHEMA_ID)+'.'+SYSOBJ.[name], SYSIDX.index_id, 16 ) AS Column16_Prop
FROM sys.indexes SYSIDX INNER JOIN sys.objects SYSOBJ ON SYSIDX.[object_id] = SYSOBJ.[object_id]
INNER JOIN sys.schemas SYSSCH ON SYSOBJ.schema_id = SYSSCH.schema_id
WHERE SYSIDX.index_id > 0
AND INDEXPROPERTY(SYSOBJ.[object_id], SYSIDX.[name], 'IsStatistics') = 0
GO
SELECT
vwA.SchemaName,
vwA.TableName,
vwA.IndexIsUnique,
vwA.IndexType,
vwA.IndexName,
vwB.IndexName AS OverlappingIndex,
vwA.Column1, vwA.Column2, vwA.Column3, vwA.Column4, vwA.Column5, vwA.Column6, vwA.Column7, vwA.Column8,
vwA.Column9, vwA.Column10, vwA.Column11, vwA.Column12, vwA.Column13, vwA.Column14, vwA.Column15, vwA.Column16
FROM IndexList_VW vwA INNER JOIN IndexList_VW vwB ON vwA.TableName=vwB.TableName
AND vwA.IndexName <> vwB.IndexName
AND vwA.IsDisabled = 0
AND vwB.IsDisabled = 0
AND (vwA.Column1=vwB.Column1 AND vwA.Column1_Prop=vwB.Column1_Prop)
AND ((vwA.Column2=vwB.Column2 AND vwA.Column2_Prop=vwB.Column2_Prop) OR vwA.Column2 IS NULL OR vwB.Column2 IS NULL)
AND ((vwA.Column3=vwB.Column3 AND vwA.Column3_Prop=vwB.Column3_Prop) OR vwA.Column3 IS NULL OR vwB.Column3 IS NULL)
AND ((vwA.Column4=vwB.Column4 AND vwA.Column4_Prop=vwB.Column4_Prop) OR vwA.Column4 IS NULL OR vwB.Column4 IS NULL)
AND ((vwA.Column5=vwB.Column5 AND vwA.Column5_Prop=vwB.Column5_Prop) OR vwA.Column5 IS NULL OR vwB.Column5 IS NULL)
AND ((vwA.Column6=vwB.Column6 AND vwA.Column6_Prop=vwB.Column6_Prop) OR vwA.Column6 IS NULL OR vwB.Column6 IS NULL)
AND ((vwA.Column7=vwB.Column7 AND vwA.Column7_Prop=vwB.Column7_Prop) OR vwA.Column7 IS NULL OR vwB.Column7 IS NULL)
AND ((vwA.Column8=vwB.Column8 AND vwA.Column8_Prop=vwB.Column8_Prop) OR vwA.Column8 IS NULL OR vwB.Column8 IS NULL)
AND ((vwA.Column9=vwB.Column9 AND vwA.Column9_Prop=vwB.Column9_Prop) OR vwA.Column9 IS NULL OR vwB.Column9 IS NULL)
AND ((vwA.Column10=vwB.Column10 AND vwA.Column10_Prop=vwB.Column10_Prop) OR vwA.Column10 IS NULL OR vwB.Column10 IS NULL)
AND ((vwA.Column11=vwB.Column11 AND vwA.Column11_Prop=vwB.Column11_Prop) OR vwA.Column11 IS NULL OR vwB.Column11 IS NULL)
AND ((vwA.Column12=vwB.Column12 AND vwA.Column12_Prop=vwB.Column12_Prop) OR vwA.Column12 IS NULL OR vwB.Column12 IS NULL)
AND ((vwA.Column13=vwB.Column13 AND vwA.Column13_Prop=vwB.Column13_Prop) OR vwA.Column13 IS NULL OR vwB.Column13 IS NULL)
AND ((vwA.Column14=vwB.Column14 AND vwA.Column14_Prop=vwB.Column14_Prop) OR vwA.Column14 IS NULL OR vwB.Column14 IS NULL)
AND ((vwA.Column15=vwB.Column15 AND vwA.Column15_Prop=vwB.Column15_Prop) OR vwA.Column15 IS NULL OR vwB.Column15 IS NULL)
AND ((vwA.Column16=vwB.Column16 AND vwA.Column16_Prop=vwB.Column16_Prop) OR vwA.Column16 IS NULL OR vwB.Column16 IS NULL)
ORDER BY
vwA.TableName, vwA.IndexName
GO
-- Drop function and view created above.
DROP FUNCTION dbo.INDEX_COL_PROPERTIES
GO
DROP VIEW IndexList_VW
GO
Executing the script in a test database with the table (TabA) mentioned above produces output similar to the following.
Indexes idx1 and idx2 are reported to have an overlap, and in this case the subset index that has fewer columns (idx2) can be disabled without any negative side-effects (provided there are no Index hints pointing to this index. This is explained in more detail below). This will result in the same performance for select statements but eliminate the overhead of the database engine having to maintain 2 indexes that serve the same general purpose. This is particularly important if the underlying columns (Col1 and Col2) are highly volatile.
A couple of additional points to keep in mind when using this script:
- Overlapping indexes are only those that have the columns in the same order. For example an index created on Col1, Col2, Col3 (in that order) does not overlap with an index created on Col2, Col1, Col3, even though the columns included are the same
- Indexes created on the same set of columns with the same column order, but with different sort order specifications (ASC, DESC) are distinct indexes and are not overlapping
- Indexes with included columns introduced in SQL Server 2005 are correctly reported as overlapping by the above script. For example, an index created as Col1, Col2 and Col3 as an included column overlaps with the index created on Col1, Col2, Col3. For overlapping indexes that involve included columns it is imperative to disable or drop the index with the higher number of included columns (refer to: https://msdn2.microsoft.com/en-us/library/ms188783(SQL.90).aspx for more information about indexes with included columns)
- Clustered indexes and non-clustered indexes are clearly marked in the script output. If a clustered index is reported as overlapped and you plan to drop it, you may want to take some additional steps to make sure that the overlapping index, or some other index, is created as clustered, because it is a SQL Server best practice to have a clustered index on every table (refer to: https://www.microsoft.com/technet/prodtechnol/sql/bestpractice/clusivsh.mspx for more information)
- It is recommended to either disable a duplicate index or save its definition before dropping it, in order to be able to recreate it if necessary
- Before disabling or dropping an index you should make sure that there are no index hints in your application that explicitly use the index being disabled, for example: SELECT * FROM TabA WITH (INDEX (idx2)) WHERE col1 = 28. Disabling or dropping an index that is explicitly used via an index hint will result in the query execution failing with an error message 308 or 315.
Cross Posted from https://blogs.microsoft.com/mssqlisv
Comments
Anonymous
January 01, 2008
PingBack from http://msdn.blogsforu.com/msdn/?p=4002Anonymous
June 19, 2008
Come tutti sappiamo una corretta strategia di indicizzazione è uno dei pilastri più importanti nell'implementazioneAnonymous
January 27, 2010
What about included columns ? If index has col1,col2 and another has col1 and (col2,col3,col4) as included They shouldn't be the same hereAnonymous
January 27, 2010
oh , saw the part about included columns THX