Saturday 10 October 2009

Finding Index Usage Statistics for a Table having multiple Indexes

Indexing tables is the best way to improve your table query performance. Having millions of records in a table without indexes will not only make your table query perform very slow but also your client applications using those queries will have adverse effects.

However, creating indexes on tables will not necessarily improve query performance, but creating right indexes and correct number of indexes also matters. Recently, I was working in a project to store several data validation logs in the database and then reporting those outstanding error logs to QA and Analysts. Tables contained not thousands but millions of logs stored in XML format. There were several types of reports created so I was required to write different types of SQL queries to retrieve data from the Logs table. In the development environment I created indexes which were required for my queries to perform well. Soon I was ended up with 4 – 6 indexes on my Logs table.

Unfortunately, user did not wanted any latency to the reports so we could not use OLAP for reporting purpose but rather had to query directly from OLTP database. This means that at the same time when more validation logs were produced other user was querying database table to produce report. Soon, I started to see adverse effects of too many indexes on my Logs table because when any DML statements were used on Logs table, all indexes where also updated.

Although my queries for reporting purpose were performing well enough, user processes were not on the other hand. It would have been too time consuming for me to run all report queries and find out with indexes where utilized and which were not. So I decided to used index meta data information which SQL Server maintains to find out index usage states. SQL Server 2005 and above ships with sys.dm_db_index_usage_stats dynamic management view which you can query to retrieve index usage statistics for one or more tables.

USE [AdventureWorks]
GO

SELECT *
FROM Sales.[SalesOrderHeader]
WHERE [CustomerID] = 227
GO

SELECT a.index_id,
b.name,
a.user_seeks,
a.user_scans,
a.user_lookups,
a.user_updates,
a.last_user_seek,
a.last_user_scan,
a.last_user_lookup
FROM sys.dm_db_index_usage_stats a
INNER JOIN sys.indexes b ON a.object_id = b.object_id
AND a.index_id = b.index_id
WHERE a.object_id = OBJECT_ID('Sales.SalesOrderHeader')
GO


Running first query will get all columns from Sales.SalesOrderHeader table for CustomerId 227. Running next query will give information about indexes on Sales.SalesOrderHeader table. It gives you name of the indexes along with other information such as when those indexes where used and how (seek, scan or lookup).


I found this a great way to find out those indexes on my table which have been never used but still getting updated because of DML activities on the table. To improve our DML performance I can easily drop those indexes which are never used.


Other way to perform similar action would be to use Database Engine Advisor (DTA). But sometimes in UAT or Production server you may not have permissions to use DTA. Running above query would be great way to find out unused indexes then ask DBAs to drop those indexes.