How can SQL Server 2005 help me evaluate and manage indexes?
SQLCAT Blog:
Question: How can SQL Server 2005 help me evaluate and manage indexes?
(1) How can I find out whether my indexes are useful? How are they used?
(2) Do I have any tables or indexes that are not used (or rarely)
(3) What is the cost of index maintenance vs. its benefit?
(4) Do I have hot spots & index contention?
(5) Could I benefit from more (or less) indexes?
Answer:
SQL Server 2005 Dynamic Management Views (DMVs) are important insofar as they expose changing server state information that typically spans many sessions, many transactions, and many requests. DMVs provide a level of transparency that was not available in SQL Server 2000 and can be used for diagnostics, memory and process tuning, and monitoring. The SQL Server engine tracks detailed resource history in DMVs that can be queriable with SELECT statements but are not persisted to disk. Thus DMVs reflect activity since the last SQL Server recycle.
Since indexes provide an alternative to a table scan, and because DMVs expose index usage counters, we can compare the cost vs. the benefit of indexes. This comparison will include the maintenance cost of keeping the index up to date, versus the benefit of reads e.g. when an index can be used in lieu of a table scan. Keep in mind that an update or delete operation involves both a read which is first required to determine whether a row qualifies for the update operation, and a write if a row is found to qualify. In an insert operation, only writes will be performed on all indexes. Consequently, in an insert-intensive workload, writes will exceed reads. In an update- intensive (updates and deletes) workload, read and write counts are generally close assuming there are not a lot of ‘records not found’. In read-intensive workloads, read counts will exceed write counts. Referential constraints such as foreign keys require other read activity (for inserts, updates, and deletes) to ensure referential integrity is maintained.
(1) How can I find out whether my indexes are useful? How are they used?
First, we will determine whether indexes are ‘useful’. DDL is used to create objects (such as indexes) and update the catalog. Creating the index does not constitute ‘use’ of the index, and thus the index will not be reflected in the index DMVs until the index is actually used. When an index is used by a Select, Insert, Update, or Delete, its use is captured by sys.dm_db_index_usage_stats. If you have run a representative workload, all useful indexes will have been recorded in sys.dm_db_index_usage_stats. Thus, any index not found in sys.dm_db_index_usage_stats is unused by the workload (since the last re-cycle of SQL Server). Unused indexes can be found as follows:
(2) Do I have any tables or indexes that are not used (or rarely used)?
------ unused tables & indexes. Tables have index_id’s of either 0 = Heap table or 1 = Clustered Index
Declare @dbid int
Select @dbid = db_id('Northwind')
Select objectname=object_name(i.object_id)
, indexname=i.name, i.index_id
from sys.indexes i, sys.objects o
where objectproperty(o.object_id,'IsUserTable') = 1
and i.index_id NOT IN (select s.index_id
from sys.dm_db_index_usage_stats s
where s.object_id=i.object_id and
i.index_id=s.index_id and
database_id = @dbid )
and o.object_id = i.object_id
order by objectname,i.index_id,indexname asc
Rarely used indexes will appear in sys.dm_db_index_usage_stats just like heavily used indexes. To find rarely used indexes, you look at columns such as user_seeks, user_scans, user_lookups, and user_updates.
--- rarely used indexes appear first
declare @dbid int
select @dbid = db_id()
select objectname=object_name(s.object_id), s.object_id, indexname=i.name, i.index_id
, user_seeks, user_scans, user_lookups, user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where database_id = @dbid and objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by (user_seeks + user_scans + user_lookups + user_updates) asc
(3) What is the cost of index maintenance vs. its benefit?
If a table is heavily updated and also has indexes that are rarely used, the cost of maintaining the indexes could exceed the benefits. To compare the cost and benefit, you can use the table valued function sys.dm_db_index_operational_stats as follows:
--- sys.dm_db_index_operational_stats
declare @dbid int
select @dbid = db_id()
select objectname=object_name(s.object_id), indexname=i.name, i.index_id
, reads=range_scan_count + singleton_lookup_count
, 'leaf_writes'=leaf_insert_count+leaf_update_count+ leaf_delete_count
, 'leaf_page_splits' = leaf_allocation_count
, 'nonleaf_writes'=nonleaf_insert_count + nonleaf_update_count + nonleaf_delete_count
, 'nonleaf_page_splits' = nonleaf_allocation_count
from sys.dm_db_index_operational_stats (@dbid,NULL,NULL,NULL) s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by reads desc, leaf_writes, nonleaf_writes
--- sys.dm_db_index_usage_stats
select objectname=object_name(s.object_id), indexname=i.name, i.index_id
,reads=user_seeks + user_scans + user_lookups
,writes = user_updates
from sys.dm_db_index_usage_stats s,
sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and s.object_id = i.object_id
and i.index_id = s.index_id
and s.database_id = @dbid
order by reads desc
go
The difference between sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats is as follows. Sys.dm_db_index_usage_stats counts each access as 1, whereas sys.dm_db_index_operational_stats counts depending on the operation, pages or rows.
(4) Do I have hot spots & index contention?
Index contention (e.g. waits for locks) can be seen in sys.dm_db_index_operational_stats. Columns such as row_lock_count, row_lock_wait_count, row_lock_wait_in_ms, page_lock_count, page_lock_wait_count, page_lock_wait_in_ms, page_latch_wait_count, page_latch_wait_in_ms, pageio_latch_wait_count, pageio_latch_wait_in_ms detail lock and latch contention in terms of waits. You can determine the average blocking and lock waits by comparing waits to counts as follows:
declare @dbid int
select @dbid = db_id()
Select dbid=database_id, objectname=object_name(s.object_id)
, indexname=i.name, i.index_id --, partition_number
, row_lock_count, row_lock_wait_count
, [block %]=cast (100.0 * row_lock_wait_count / (1 + row_lock_count) as numeric(15,2))
, row_lock_wait_in_ms
, [avg row lock waits in ms]=cast (1.0 * row_lock_wait_in_ms / (1 + row_lock_wait_count) as numeric(15,2))
from sys.dm_db_index_operational_stats (@dbid, NULL, NULL, NULL) s, sys.indexes i
where objectproperty(s.object_id,'IsUserTable') = 1
and i.object_id = s.object_id
and i.index_id = s.index_id
order by row_lock_wait_count desc
The following report shows blocks in the [Order Details] table, index OrdersOrder_Details. While blocks occur less than 2 percent of the time, when they do occur, the average block time is 15.7 seconds.
It would be important to track this down using the SQL Profiler Blocked Process Report. You can set the Blocked Process Threshold to 15 using sp_configure ‘Blocked Process Threshold’,15. Afterwards, you can run a trace to capture blocks over 15 seconds.
The Profiler trace will include the blocked and blocking process. The advantage of tracing for long blocks is the blocked and blocking details can be saved in the trace file and can be analyzed long after the block disappears. Historically, you can see the common causes of blocks. In this case the blocked process is the stored procedure NewCustOrder. The blocking process is the stored procedure UpdCustOrderShippedDate.
The caveat with Profiler Trace of Blocked Process Report is that in the case of stored procedures, you cannot see the actual statement within the stored procedure that is blocked. You can capture the actual blocked statement of a stored procedure in realtime (as it is occuring) using the following:
create proc sp_block_info
as
select t1.resource_type as [lock type]
,db_name(resource_database_id) as [database]
,t1.resource_associated_entity_id as [blk object]
,t1.request_mode as [lock req] --- lock requested
,t1.request_session_id as [waiter sid] --- spid of waiter
,t2.wait_duration_ms as [wait time]
,(select text from sys.dm_exec_requests as r --- get sql for waiter
cross apply sys.dm_exec_sql_text(r.sql_handle)
where r.session_id = t1.request_session_id) as waiter_batch
,(select substring(qt.text,r.statement_start_offset/2,
(case when r.statement_end_offset = -1
then len(convert(nvarchar(max), qt.text)) * 2
else r.statement_end_offset end - r.statement_start_offset)/2)
from sys.dm_exec_requests as r
cross apply sys.dm_exec_sql_text(r.sql_handle) as qt
where r.session_id = t1.request_session_id) as waiter_stmt --- statement blocked
,t2.blocking_session_id as [blocker sid] -- spid of blocker
,(select text from sys.sysprocesses as p --- get sql for blocker
cross apply sys.dm_exec_sql_text(p.sql_handle)
where p.spid = t2.blocking_session_id) as blocker_stmt
from
sys.dm_tran_locks as t1,
sys.dm_os_waiting_tasks as t2
where
t1.lock_owner_address = t2.resource_address
go
exec sp_block_info
(5) Could I benefit from more (or less) indexes?
Remembering that indexes involve both a maintenance cost and a read benefit, the overall index cost benefit can be determined by comparing reads and writes. Reading an index allows us to avoid table scans however they do require maintenance to be kept up-to-date. While it is easy to identify the fringe cases where indexes are not used, and the rarely used cases, in the final analysis, index cost benefit is somewhat subjective. The reason is the number of reads and writes are highly dependent on the workload and frequency. In addition, qualitative factors beyond the number of reads and writes can include a highly important monthly management report or quarterly VP report in which the maintenance cost is of secondary concern.
Writes of all indexes are performed for inserts, but there are no associated reads (unless there are referential constraints). Besides select statements, reads are performed for updates and deletes, writes are performed if rows qualify. OLTP workloads have lots of small transactions, frequently combining select, insert, update and delete operations. Data Warehouse activity is typically separated into batch windows having a high concentation of write activity, followed by an on-line window of read activity.
SQL Statement |
Read |
Write |
Select |
Yes |
No |
Insert |
No |
Yes, all indexes |
Update |
Yes |
Yes, if row qualifies |
Delete |
Yes |
Yes, if row qualifies |
In general, you want to keep indexes to a funtional minimum in a high transaction OLTP environment due to high transaction throughput combined with the cost of index maintenance and potential for blocking. In contrast, you pay for index maintenance once during the batch window when updates occur for a data warehouse. Thus, data warehouses tend to have more indexes to benefit its read-intensive on-line users.
In conclusion, an important new feature of SQL Server 2005 includes Dynamic Management Views (DMVs). DMVs provide a level of transparency that was not available in SQL Server 2000 and can be used for diagnostics, memory and process tuning, and monitoring. DMVs can be useful in answering practical questions such as index usage, cost benefit of indexes, and index hot spots. Finally, DMVs are queriable with SELECT statements but are not persisted to disk. Thus they reflect changing server state information since the last SQL Server recycle.
Comments
Anonymous
December 19, 2005
I think this is a very useful article. Thanks for posting it (although I agree with Paul that the font is a little odd).
The 3 report and other examples mentioned in item 4 after "the average block time is 15.7 seconds." are completely blank with my version of IE. Did anyone else notice this?Anonymous
December 28, 2005
Very useful article. I wish more developers knew how to evaluate index usefulness. I administer some databases created by software that include indexes on bit value columns ...
I cannot see the items that Terry Grigon reported missing either ... tried IE, Netscape, Firefox, and Opera. Perhaps they are not there?Anonymous
February 13, 2006
This is a great article! Keep posts like this coming ;-)
I too see indexes on completely useless fields but the problem is that sometimes indexes may seem usefull but in reality they might be unused. This is very hard to capture in SQL Server 2000. The new DMV's make this a very easy job.Anonymous
March 14, 2006
Good article .. same problem .. cannot see the graphics ..Anonymous
May 29, 2006
Very nice article; I wish article(s) like these are directly linked from SQL BOL.Anonymous
December 10, 2006
The comment has been removedAnonymous
December 27, 2006
SQL Minds, out of North Carolina, is claiming many things, but not admitting their relaltionship to a STOCK SCAM called Vision Technology Corp. Plamen Slavov (Product Manager) is a realtor from Florida, not an SQL guy, or even an IT guy. Ivan Ivanov, CEO of SQL Minds is also the CTO and board member of Vision Technology Corp (VSTC), and is also the CEO of Data Consulting Group (SQL Minds one and only testimonial and the "beta" customer announced by VSTC.. meaning they are beta'ing their own stuff) THIS IS A SCAM. Read all about it here http://gtxcthegangthatcantshootstraight.blogspot.com/ cites, facts, research, analysis.. over 9 months of inside information. INVESTOR AND SQL COMMUNITY BEWARE. SQL Minds is a scam.Anonymous
July 12, 2007
Actually what started as a small document has continued to grow. I'm working on performance tuning aAnonymous
August 26, 2007
This object provides very useful stats on how many times an index has been used, locked, waited on, etc.Anonymous
December 25, 2007
PingBack from http://joesack.com/WordPress/?p=96Anonymous
January 22, 2009
PingBack from http://www.hilpers.fr/918538-index-et-optimisation/2