How can you tell if an index is being used?
Whenever I’m discussing index maintenance, and specifically fragmentation, I always make a point of saying ‘Make sure the index is being used before doing anything about fragmentation’ .
If an index isn’t being used very much, but has very low page density (lots of free space in the index pages), then it will be occupying a lot more disk space than it could do and it may be worth compacting (with a rebuild or a defrag) to get that disk space back. However, usually there’s not much point spending resources to remove any kind of fragmentation when an index isn’t being used. This is especially true of those people who rebuild all indexes every night or every week.
You could even go so far as to say if a non-clustered index isn’t being used, why is it there at all? Extra non-clustered indexes drag down performance in a number of ways. Consider a non-clustered index called IX_MyNCIndex on the table MyTable:
- Any time a record is inserted into MyTable, a matching record is inserted into IX_MyNCIndex. This is a bunch of extra IOs, plus maybe even a page-split.
- Any time a record is deleted from MyTable, the matching record in IX_MyNCIndex must be deleted. Extra IOs again.
- Any time a record in MyTable is updated:
- If MyTable has a clustered index, and the clustered index key value changes, then the matching record in IX_MyNCIndex must be updated. Extra IOs again.
- If any of the non-clustered index key values changes, or any of the INCLUDEd column values changes, then the matching record in IX_MyNCIndex must be updated. Extra IOs again.
- If a clustered index is created on MyTable, then IX_MyNCIndex has to be rebuilt to include the logical RIDs rather than the physical heap RIDs. Lot of extra IOs.
That’s a significant amount of extra IOs to maintain each extraneous non-clustered index.
So, how can you tell if an index is being used? There are a few different ways in SQL Server 2005 – the one I want to discuss in this post is using the sys.dm_db_index_usage_stats DMV.
This DMV exposes the information that is tracked about index usage (as the name suggests). It does not generate any information itself; it just returns info from a cache inside SQL Server. This cache is empty when the server instance starts, and is not persisted across instance restarts. All cache entries for indexes in a database are removed when that database is closed. So, the cache tracks usage information about indexes since the database they are part of was last opened (either manually or as part of instance start-up).
The cache tracks the following info for each index (for user queries and system queries):
· The number of times it was used in a seek operation (either looking up a single row, or doing a range scan) along with the time of the last seek.
· The number of times it was used in a scan operation (e.g. a select * operation) along with the time of the last scan
· The number of times it was used in a lookup operation (this means a bookmark lookup – where a non-clustered index does not fully cover a query and additional columns must be retrieved from the base table row) along with the time of the last lookup.
· The number of times it was used in an update operation (this counts inserts, updates, and deletes) along with the time of the last update.
Let’s have a look at its use.
SELECT
* FROM sys.dm_db_index_usage_stats;
GO
Unless you've just re-started your instance, you'll see a bunch of output from this, representing all index activity since the instance/databases started. If you're interested in whether an index is being used, you can filter the output. Let's focus in on a particular table - AdventureWorks.Person.Address.
SELECT
* FROM sys.dm_db_index_usage_stats
WHERE
database_id = DB_ID('AdventureWorks')
and
object_id = OBJECT_ID('AdventureWorks.Person.Address');
GO
You'll probably see nothing in the output, unless you've been playing around with that table. Let's force the clustered index on that table to be used, and look at the DMV output again.
SELECT
* FROM AdventureWorks.Person.Address;
GO
SELECT
* FROM sys.dm_db_index_usage_stats
WHERE
database_id = DB_ID('AdventureWorks')
and
object_id = OBJECT_ID('AdventureWorks.Person.Address');
GO
Now there's a single row, showing a scan on the clustered index. Let's do something else.
SELECT
StateProvinceID FROM AdventureWorks.Person.Address
WHERE
StateProvinceID > 4 AND StateProvinceId < 15;
GO
SELECT
* FROM sys.dm_db_index_usage_stats
WHERE
database_id = DB_ID('AdventureWorks')
and
object_id = OBJECT_ID('AdventureWorks.Person.Address');
GO
And there's another row, showing a seek in one of the table's non-clustered indexes.
So, its easy to look at the index usage for particular tables and indexes. But how can you monitor this over time? This is easy too - let's see how.
First we need to create our own table to store snapshots of the DMV output.
IF
OBJECTPROPERTY(object_id(N'master.dbo.MyIndexUsageStats'), 'IsUserTable') = 1
DROP TABLE dbo.MyIndexUsageStats;
GO
SELECT
GETDATE () AS ExecutionTime, *
INTO
master.dbo.MyIndexUsageStats
FROM
sys.dm_db_index_usage_stats WHERE database_id=0;
GO
Next we need to take a baseline snapshot of the DMV output.
INSERT
master.dbo.MyIndexUsageStats
SELECT getdate (), * FROM sys.dm_db_index_usage_stats;
GO
And now simulate a few operations and take another snapshot of the DMV:
SELECT
* FROM AdventureWorks.Person.Address;
GO
SELECT
* FROM AdventureWorks.Person.Address;
GO
SELECT
StateProvinceID FROM AdventureWorks.Person.Address
WHERE
StateProvinceID > 4 AND StateProvinceId < 15;
GO
INSERT
master.dbo.MyIndexUsageStats
SELECT getdate (), * FROM sys.dm_db_index_usage_stats;
GO
And look at the filtered contents of our snapshot table:
SELECT
* FROM master.dbo.MyIndexUsageStats
WHERE
database_id = DB_ID('AdventureWorks')
and
object_id = OBJECT_ID('AdventureWorks.Person.Address');
GO
You should see four rows - two from the baseline snapshot and two from the final snapshot. If you ran just the statements above, you'll see that the user_scans count for the clustered index has increased by two, and the user_seeks count for the non-clustered index has increased by one.
So this is a pretty simple example of how you can track index usage. By putting something like this into a regularly run script you can tell which indexes aren't being used and could be candidates for less-regular index maintenance or removal altogether.
Let me know how you get on.
(Btw - still looking for more people to fill out my VLDB Maintenance survey...)
Comments
Anonymous
April 22, 2007
On day-to-day basis a DBA might come across with the issues on the fragmentation on the database, itAnonymous
April 24, 2007
Hi Paul, Great information and was useful for me. BTW, is there any equivalent code for SQL server 2000? I tried searching for the same,but wasn't successful.Anonymous
April 24, 2007
No - unfortunately there's no equivalent in SQL Server 2000. You'd need to profile a workload, run it through the Index Tuning Wizard and see which indexes ITW recommends dropping. Nowhere near as good.Anonymous
April 30, 2007
One evening last week I sat down with Kimberly for 5 minutes to come up with a top-of-our heads listAnonymous
August 09, 2007
转载、解释一下来自SQL Server Storage Team的VLDB维护建议清单。 虽然主题是维护,但相信理解这些对于从事数据库开发的人员来说也有一定价值。Anonymous
December 26, 2007
This entry is going to be a collected list of practices for index, fragmentation, statistics, and general...Anonymous
June 08, 2011
I believe I understand what are user_seeks, user_scans and user_lookups in sys.dm_db_index_usage_stats, but do not really understand what are system_seeks, system_scans and system_lookups in that same DMV. Can you give some details ?Anonymous
April 06, 2015
The above mentioned DMVs will give you Index usage stats, but there are some gotchas, like these stats could be wrong at times because of several reasons:
- If SQL Server (MSSQLSERVER) service is re-started these counters are initialized to 0.
- When a Database is detached or is shut down (for example, because AUTO_CLOSE is set to ON), all rows associated with the database are removed.
- Index Rebuild resets these counters to 0 (Bug logged in MS Connect, link: connect.microsoft.com/.../rebuilding-an-index-clears-stats-from-sys-dm-db-index-usage-stats). ~Manoj [documented in my blog post: sqlwithmanoj.com/.../index-usage-stats-indexes-used-unused-and-updated]
- Anonymous
April 06, 2015
@Nicolas, the System_xyz columns returns the stats from the background processes. Whenever something triggers within SQL Server, like Statistics update, this activity will be tracked in these columns rather than user_xyz columns. ~Manoj [http://sqlwithmanoj.com]