Wednesday, February 19, 2020

When is an index redundant?

Recently I was called into a shop with a 4-terabyte database, whose staff claimed they didn't have anywhere near that much data. Upon investigation, they were correct: they had about 100 GB of data, and about 3.9 TB of indexes.

Is this too much? Well, in order to make that kind of judgement call there are several factors to consider:

  1. Are the indexes interfering with writes (i.e., is the number of indexes having a negative effect on insert / update performance)?
  2. Are the indexes actually being used?
  3. Are any indexes redundant?

The first one (write performance) can be identified by checking the length of the write queues; almost any monitoring tool can get you this information.

The second can be identified with a script you can find on the web which counts the number of times an index has been accessed since the last time the server was booted. Of course, you want the server to have been up long enough that the system tables reflect weekend / month-end processing. If you can't find one readily, you can use the one I have on my blog.

The final one is the tool I used to reduce that customer's database from 4 TB to 300 GB without even doing a "what's being used" analysis… just by eliminating redundant indexes.

Many people find a script on the web (I have one in the blog post mentioned above) that looks at the system tables and recommends missing indexes. The problem is, it will give you six similar indexes rather than one combined index; combining them is up to you. If you automatically add every index the server suggests… well, you can end up with 4 TB of indexes on 100 GB of data.

So how do we know what’s redundant?

Hopefully, you are old enough to remember a phone book. A phone book is an index on last name, first name, middle name. But what other lookups might it serve? A lookup on just last name, first name could use the same phone book (and often does), as could a lookup on last name alone.

So if you have three indexes:

  1. Last name, first name, middle name
  2. Last name, first name
  3. Last name

You can look at these and KNOW that the last two are redundant, because the first index can do the work of either of the other two.
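That phone-book rule is mechanical enough to code. A minimal sketch of the prefix test (my illustration, not from any of the scripts on this blog; the column names are made up):

```python
# Sketch (mine): an index is redundant with respect to another index when its
# key columns are a leading prefix of the other index's key columns.

def is_redundant(candidate, covering):
    """True if `candidate`'s key list is a leading prefix of `covering`'s."""
    return candidate == covering[:len(candidate)]

phone_book = ["LastName", "FirstName", "MiddleName"]

print(is_redundant(["LastName", "FirstName"], phone_book))  # True
print(is_redundant(["LastName"], phone_book))               # True
print(is_redundant(["FirstName"], phone_book))              # False: not a leading prefix
```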

As you start looking at dynamic management view (DMV) information, you often get index recommendations along these lines:

  1. Create index idx on TableA (Column1)
  2. Create index idx on TableA (Column1) include (ColumnA, ColumnB)
  3. Create index idx on TableA (Column1, Column2) include (ColumnC, ColumnD)

Sometimes you’ll get 8 or 10 of these.

So, index 1 can easily be eliminated because index 2 covers it.
Index 2 can be combined into index 3 like this:

  1. Create index idx on TableA (Column1, Column2) include (ColumnA, ColumnB , ColumnC, ColumnD)
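Combining recommendations follows the same prefix rule, plus a union of the INCLUDE columns. A sketch of that merge (my illustration, reusing the column names from the list above):

```python
# Sketch (mine): combine two DMV recommendations when one key list is a
# leading prefix of the other -- keep the longer key list and the union of
# the INCLUDE columns (a key column already covers its own include).

def combine(rec_a, rec_b):
    """Each rec is (key_columns, include_columns). Returns the merged rec,
    or None when neither key list is a prefix of the other."""
    short, long_ = sorted([rec_a, rec_b], key=lambda r: len(r[0]))
    if short[0] != long_[0][:len(short[0])]:
        return None  # keys diverge; the indexes serve different seeks
    seen, merged = set(), []
    for col in short[1] + long_[1]:
        if col not in seen and col not in long_[0]:  # de-dupe; drop key columns
            seen.add(col)
            merged.append(col)
    return (long_[0], merged)

idx2 = (["Column1"], ["ColumnA", "ColumnB"])
idx3 = (["Column1", "Column2"], ["ColumnC", "ColumnD"])
print(combine(idx2, idx3))
# (['Column1', 'Column2'], ['ColumnA', 'ColumnB', 'ColumnC', 'ColumnD'])
```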

So… in summary: the index recommendations you pull from the DMVs are good ones, but review them with an eye for detail before automatically applying them.

Wednesday, July 31, 2019

List indexes with keys by table

Note: Code borrowed from:

... and modified to suit my needs.

SELECT
  schema_name(schema_id) as SchemaName, OBJECT_NAME(si.object_id) as TableName, si.name as IndexName,
  (CASE is_primary_key WHEN 1 THEN 'PK' ELSE '' END) as PK,
  (CASE is_unique WHEN 1 THEN '1' ELSE '0' END)+' '+
  (CASE si.type WHEN 1 THEN 'C' WHEN 3 THEN 'X' ELSE 'B' END)+' '+  -- B=basic, C=Clustered, X=XML
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,1,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,2,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,3,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,4,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,5,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  (CASE INDEXKEY_PROPERTY(si.object_id,index_id,6,'IsDescending') WHEN 0 THEN 'A' WHEN 1 THEN 'D' ELSE '' END)+
  '' as 'Type',
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,1) as Key1,
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,2) as Key2,
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,3) as Key3,
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,4) as Key4,
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,5) as Key5,
  INDEX_COL(schema_name(schema_id)+'.'+OBJECT_NAME(si.object_id),index_id,6) as Key6
FROM sys.indexes as si
LEFT JOIN sys.objects as so on so.object_id=si.object_id
WHERE index_id>0 -- omit the default heap
  and OBJECTPROPERTY(si.object_id,'IsMsShipped')=0 -- omit system tables
  and not (schema_name(schema_id)='dbo' and OBJECT_NAME(si.object_id)='sysdiagrams') -- omit sysdiagrams
ORDER BY SchemaName ,TableName , Key1, Key2, Key3, Key4, Key5, Key6

I also have a version of this that excludes tables with only one index.


Tuesday, April 30, 2019

Excessive compile times

This code was brought to my attention by an associate, Valentin Iamandi, who mentioned it was from Jonathan Kehayias. We have a client who may be the victim of a SQL Server compilation bug (3 seconds of compile time for 75 ms of execution time); this query is to identify other potential victims.

-- Find high compile resource plans in the plan cache
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT
tab.CompileTime_ms, tab.CompileCPU_ms, tab.CompileMemory_KB,
qs.execution_count,
qs.total_elapsed_time/1000 AS duration_ms,
qs.total_worker_time/1000 AS cputime_ms,
(qs.total_elapsed_time/qs.execution_count)/1000 AS avg_duration_ms,
(qs.total_worker_time/qs.execution_count)/1000 AS avg_cputime_ms,
qs.max_elapsed_time/1000 AS max_duration_ms,
qs.max_worker_time/1000 AS max_cputime_ms,
SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
(CASE qs.statement_end_offset
WHEN -1 THEN DATALENGTH(st.text)
ELSE qs.statement_end_offset
END - qs.statement_start_offset) / 2 + 1) AS StmtText
FROM (
SELECT
c.value('xs:hexBinary(substring((@QueryHash)[1],3))', 'varbinary(max)') AS QueryHash,
c.value('xs:hexBinary(substring((@QueryPlanHash)[1],3))', 'varbinary(max)') AS QueryPlanHash,
c.value('(QueryPlan/@CompileTime)[1]', 'int') AS CompileTime_ms,
c.value('(QueryPlan/@CompileCPU)[1]', 'int') AS CompileCPU_ms,
c.value('(QueryPlan/@CompileMemory)[1]', 'int') AS CompileMemory_KB
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes('ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS n(c)
) AS tab
JOIN sys.dm_exec_query_stats AS qs
ON tab.QueryHash = qs.query_hash
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY tab.CompileTime_ms DESC
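A note on the SUBSTRING arithmetic in that query, sketched below (my illustration): the offset columns are byte offsets into Unicode (UTF-16) batch text, which is why everything is divided by 2, and an end offset of -1 means "through the end of the batch".

```python
# Sketch (mine): mimic the SUBSTRING / CASE logic that carves one statement
# out of the batch text. Offsets are byte positions (2 bytes per character);
# -1 stands in for "end of batch" (DATALENGTH(st.text)).

def extract_statement(batch_text, start_offset, end_offset):
    if end_offset == -1:
        end_offset = 2 * len(batch_text)  # total bytes in the batch
    return batch_text[start_offset // 2 : end_offset // 2 + 1]

batch = "SELECT 1; SELECT 2;"
print(extract_statement(batch, 0, -1))   # whole batch
print(extract_statement(batch, 20, -1))  # "SELECT 2;"
```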

Wednesday, April 13, 2016

Unused indexes -- updated script


DECLARE @TempIndexDrop TABLE (
       [Database_Name] [nvarchar](128) NULL,
       [Object_Name] [nvarchar](128) NULL,
       [Index_Name] [sysname] NULL,
       [Drop_Statement] [nvarchar](283) NULL,
       [user_seeks] [bigint]  NULL,
       [user_scans] [bigint]  NULL,
       [user_lookups] [bigint]  NULL,
       [SpaceSavedMeg] decimal (12,2)  null )

DECLARE @cmd nvarchar(max)

SET @cmd = 'USE [?] ;
select db_name(database_id) as ''Database_Name'', object_name(i.object_id) as ''Object_Name'', i.name as ''Index_Name''
, ''USE ['' + db_name(database_id) + '']; drop index ['' + i.name + ''] on ['' + sch.name + ''].['' + object_name(i.object_id) + '']'' as ''Drop_Statement'',
user_seeks, user_scans, user_lookups,
(select sum(s.used_page_count)*8./1024. from sys.dm_db_partition_stats s where i.object_id = s.object_id and i.index_id = s.index_id) as ''SpaceSavedMeg''
from sys.dm_db_index_usage_stats ius join sys.indexes i
on ius.index_id = i.index_id and ius.object_id = i.object_id join sys.objects obj
on ius.object_id = obj.object_id join sys.schemas sch
on obj.schema_id = sch.schema_id
where database_id = db_id(''?'')
and user_seeks + user_scans + user_lookups = 0
and i.name not like ''pk%''
and ''?'' not in (''master'', ''model'',''tempdb'',''msdb'')
order by object_name(i.object_id), i.name'

INSERT INTO @TempIndexDrop
Exec sp_msforeachdb @cmd

SELECT SUM(SpaceSavedMeg) as 'Potential Storage Savings' FROM @TempIndexDrop

SELECT * FROM @TempIndexDrop
ORDER BY [Database_Name], [Object_Name]
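For reference, the SpaceSavedMeg arithmetic in the script above is just page counts times the 8 KB page size; a sketch (my illustration):

```python
# Sketch (mine): the SpaceSavedMeg expression. SQL Server pages are 8 KB,
# so used_page_count * 8 / 1024 gives megabytes.

def pages_to_mb(used_page_count):
    return used_page_count * 8 / 1024

print(pages_to_mb(128))     # 1.0 MB
print(pages_to_mb(131072))  # 1024.0 MB -- a 1 GB index
```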

Missing indexes -- revised script

SELECT
      'create index IX_' +
            replace(replace(replace (equality_columns, '[', ''),']',''),', ','_') +
            ' on ' +
            sch.name collate SQL_Latin1_General_CP1_CI_AS + '.' + obj.name +
            ' (' +
            equality_columns +
            case when inequality_columns is null
                                    then ''
                                    else ',' +  inequality_columns end +
            ')' +
            case when included_columns is not null then
                  ' include (' +
                  isnull(included_columns,'') +
                  ') ' else '' end + ' with (online = on)' +
            ' -- ' + convert (varchar, avg_user_impact) + '% anticipated impact'
      from sys.dm_db_missing_index_details mid join
                  sys.dm_db_missing_index_groups mig
                        on mid.index_handle = mig.index_handle          join
                  sys.dm_db_missing_index_group_stats migs
                        on migs.group_handle = mig.index_group_handle join
                        sys.objects obj on obj.object_id = mid.object_id join
                        sys.schemas sch on obj.schema_id = sch.schema_id
      where database_id = db_id()
                  and avg_user_impact > 60
                  and equality_columns is not null
      order by obj.name, equality_columns, inequality_columns, included_columns
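The nested REPLACE calls that build the index name are easier to see in isolation; a sketch of the same string manipulation (my illustration):

```python
# Sketch (mine): equality_columns arrives as "[Col1], [Col2]"; the nested
# REPLACEs strip the brackets and turn ", " separators into "_".

def make_index_name(equality_columns):
    return ("IX_" + equality_columns
            .replace("[", "")
            .replace("]", "")
            .replace(", ", "_"))

print(make_index_name("[Column1], [Column2]"))  # IX_Column1_Column2
```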

What's going on right now?

Got the following code from the site I'm at today; I'll be using it for a while. The first chunk of code grabs current query plans; the second looks at how memory is being used. Enjoy.

EXEC dbo.sp_WhoIsActive
@filter = '', -- varchar(128)
@filter_type = 'session', -- varchar(10)
@not_filter = '', -- varchar(128)
@not_filter_type = 'session', -- varchar(10)
@show_own_spid = 1, -- bit
@show_sleeping_spids = 0, -- tinyint
@get_full_inner_text = 1, -- bit
@get_plans = 1, -- tinyint
@get_outer_command = 1, -- bit
@get_task_info = 0, -- tinyint
@get_locks = 0, -- bit
@get_avg_time = 0, -- bit
@find_block_leaders = 1 -- bit

SELECT OBJECT_NAME(p.[object_id]) AS [ObjectName],
p.index_id, COUNT(*)/128 AS [Buffer size(MB)],  COUNT(*) AS [BufferCount],
p.data_compression_desc AS [CompressionType], a.type_desc, p.[rows]
FROM sys.allocation_units AS a WITH (NOLOCK)
INNER JOIN sys.dm_os_buffer_descriptors AS b WITH (NOLOCK)
ON a.allocation_unit_id = b.allocation_unit_id
INNER JOIN sys.partitions AS p WITH (NOLOCK)
ON a.container_id = p.partition_id
WHERE b.database_id = CONVERT(int,DB_ID())
AND p.[object_id] > 100
GROUP BY p.[object_id], p.index_id, p.data_compression_desc, a.type_desc, p.[rows]

Monday, March 16, 2015

Fixing VLF issues for all databases on a server

Note: This script presumes that the databases are in SIMPLE recovery mode (as they were at this client site at the time). It shrinks each log file to its minimum, then grows it back to its pre-shrink size in one big ALTER, which rebuilds the log with far fewer VLFs. At the same time, it changes autogrowth to 10% across the board.



-- check current log file sizes (in MB) first:
select db_name(),(size+127)/128,* from sys.sysfiles where fileid = 2

-- example of a manual shrink for a single log file:
DBCC SHRINKFILE (N'IMChartData_log' , 2)
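
The `(size+127)/128` expression converts the page count `sys.sysfiles` reports into whole megabytes, rounded up; a sketch (my illustration):

```python
# Sketch (mine): sys.sysfiles reports size in 8 KB pages (128 pages per MB),
# so (size + 127) / 128 with integer division rounds up to the next whole MB.

def file_size_mb(size_in_pages):
    return (size_in_pages + 127) // 128

print(file_size_mb(128))  # 1 -- exactly 1 MB
print(file_size_mb(129))  # 2 -- rounds up
```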

DECLARE @cmd nvarchar(max)

SET @cmd = 'USE [?] ;
declare @filename varchar(1000), @dbcc varchar(2000), @alter varchar(1000), @size int

-- grab the logical name and current size (in MB) of the log file
select @filename = name, @size = (size+127)/128 from sys.sysfiles where filename like ''%ldf''

print db_name()
print @filename

-- shrink the log to its minimum...
set @dbcc = ''dbcc shrinkfile (xx, 2)''
set @dbcc = REPLACE(@dbcc, ''xx'' ,@filename)

-- ...then grow it back to its original size in one alter, fixing autogrowth as we go
set @alter = ''ALTER DATABASE ['' + db_name() +
''] MODIFY FILE ( NAME = xx, SIZE = '' + convert(varchar, @size ) + ''MB, FILEGROWTH = 10%)''
set @alter = REPLACE(@alter, ''xx'' ,@filename)

print @dbcc
print @alter

exec (@dbcc)
exec (@alter)'

Exec sp_msforeachdb @cmd