Friday, December 21, 2012

Performance?

A recent forum post pushed enough of my buttons that I chose to respond, and the topics themselves were interesting, so I'm copying the post here.


 

1) Science vs. Art

People frequently define tuning as an art rather than a science, but what is the scientific method? You postulate a theory, test it, and repostulate if necessary. That's exactly what we do when we tune.
 

2) What's a valid baseline?

I remember shocking a group of about 100 recent Ivy League grads at a talk I did a while back when I stated that business is not like a college exam: there is no "right" or "wrong"; there is "the solution works" or "the solution doesn't work," and there may be multiple correct (i.e., working) answers.

In this case, you need to be able to answer questions along the lines of "Are all of the potential bottlenecks on our system wide enough to accommodate the current expected usage and identified expected growth?" If the answer is yes, you're in good shape. If you're looking for harder numbers, my personal preference is not to exceed 70% of a resource at normal peak demand, which leaves room for unexpected peaks; if you're above 80%, you need to tune or shop. (This is a VERY general rule, of course, and needs to be applied with knowledge of your system.)

If you can't answer these questions, you need to reexamine your work so that you can do predictive analysis.

3) Where do we start tuning?

About 6-7 years ago, after doing little but tuning for 15 years, I changed my approach. Before interviewing users about their issues, or looking at perfmon (which gives us a snapshot view of a very tight time slice), I install a monitoring tool. The one I currently use is Confio Ignite -- subjectivity warning here, as we resell this tool, but I examined a dozen others before choosing Ignite. Once the tool starts collecting, I interview the users about what they think their problems are, and then start looking at what the tool has collected, starting at a macro scale and moving to the micro scale based upon findings at the higher level. This may lead me to anything from disk issues to specific queries run by specific users at specific times of day.

 

Thursday, November 1, 2012

Shrinking the transaction log

Every once in a while I look something up & want to keep referencing it... this article is one of those, I hope you find it as useful as I did:
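The short version, as a minimal sketch (the database and logical log file names below are placeholders, and this assumes the full recovery model): back up the log so the inactive portion can be reused, then shrink the physical file.

use YourDatabaseName

-- Back up the log so inactive virtual log files can be marked reusable
backup log YourDatabaseName to disk = 'D:\Backups\YourDatabaseName_log.trn'

-- Find the logical name and current size (in MB) of the log file
select name, size / 128 as size_mb
from sys.database_files
where type_desc = 'LOG'

-- Shrink the log file to a 1 GB target (pick a target that fits your workload)
dbcc shrinkfile (YourDatabaseName_log, 1024)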


Tuesday, October 16, 2012

Transaction Log Space Utilization


A client asked me to investigate log space utilization on a specific database. It was currently allocated at 53 gig, and the question was whether that was big enough or too big. I gave a pretty generic answer, which is that it needs to be big enough to contain all active transactions between dumps & during replication, plus a fudge factor. There’s a GREAT article here:
 
http://www.mssqltips.com/sqlservertip/1225/how-to-determine-sql-server-database-transaction-log-usage/

It also shows some loosely documented/undocumented commands you can use to take a look.

At the time, they were using .22% (Yep, less than 1%) of the 53 gig transaction log. It’s impossible for me to tell how big it needs to be without a deep understanding of the application, but a potential guideline is 25% bigger than the biggest transaction log backup you have; so, if those are in the 1-2 gig range, they had a lot of room to spare here.
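If you want to check this on your own system, one documented command that reports it directly is DBCC SQLPERF(LOGSPACE); DBCC OPENTRAN is also handy for spotting a long-running transaction that's holding the log hostage (the database name below is a placeholder):

-- Log size and percent of log space in use, for every database on the instance
dbcc sqlperf (logspace)

-- Oldest active transaction in a given database (a common reason the log can't truncate)
dbcc opentran ('YourDatabaseName')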

Note that with SQL 2005 (their version), you cannot shrink the log file below its initial size (this is corrected in SQL 2008 & later).

Monday, October 1, 2012

SQL Saturday

My most common response to Orlando's SQL Saturday "What could the speaker do differently to improve?" was "Give him more time!"

Thank you to all who attended; it was a pleasure meeting you in person. I'll be doing the SAME seminar in Nashville in 2 weeks... check out www.sqlsaturday.com and register!

Along these lines, we do webinars roughly monthly; send me an email if you'd like to be added to the list. In Orlando, a gentleman came up to me & claimed my webinar changed his life... your mileage WILL vary, but there's always lots of great information.

Thank you all again for coming out to see us!

Jeff

Thursday, September 27, 2012

SQL Saturday 151 in Orlando

I'm speaking this Saturday, September 29, 2012, please be sure to come over & say "Hi" in our booth... register at www.sqlsaturday.com


Saturday, September 22, 2012

Speaking at PASS DBA Virtual chapter on Wednesday!




 

Registration is not required to attend the meeting but if people want to be included in the drawing for a $50 Amazon Gift Card, they must register no later than 5:00 PM Eastern on Tuesday, September 25th.

Thursday, August 30, 2012

Dynamically create missing indexes from DMVs

Here's the code. Note: DO NOT simply run this and apply all the indexes; look for redundancy first, there's likely to be a lot. A couple of notes:

  1. Make sure to change the database name in 2 places
  2. Decide how many of these you want. I (somewhat arbitrarily) set an 80% impact threshold on what I'm bringing back; you may want to adjust this.


use YourDatabaseName  -- change the database name here...

select
      'create index IX_' +
            replace(replace(replace(equality_columns, '[', ''), ']', ''), ', ', '_') +
            ' on ' + sch.name + '.' + obj.name +
            ' (' + equality_columns +
            case when inequality_columns is null
                  then ''
                  else ',' + inequality_columns end +
            ')' +
            case when included_columns is not null
                  then ' include (' + isnull(included_columns, '') + ') '
                  else '' end +
            ' -- ' + convert(varchar, avg_user_impact) + '% anticipated impact'
from sys.dm_db_missing_index_details mid
      join sys.dm_db_missing_index_groups mig
            on mid.index_handle = mig.index_handle
      join sys.dm_db_missing_index_group_stats migs
            on migs.group_handle = mig.index_group_handle
      join sys.objects obj
            on obj.object_id = mid.object_id
      join sys.schemas sch
            on obj.schema_id = sch.schema_id
where db_name(database_id) = 'YourDatabaseName'  -- ...and here
      and avg_user_impact > 80                   -- impact threshold; adjust to taste
order by obj.name, equality_columns              -- or: avg_user_impact desc

Thursday, August 16, 2012

SQL Server World User's Group

I'm speaking again at their virtual conference... to sign up, go to:

 https://www.vconferenceonline.com/event/regeventp.aspx?id=661

If you do sign up, please use the tracking code VCJEFFREY to let them know you found out about it here.



 

Tuesday, August 14, 2012

Indexes not used since the last reboot


This query gives you a list of indexes that have not been used since the last reboot.

Notes:

1) I'm excluding primary key indexes. Amazing how often they're listed (i.e. primary access is not via the primary key).
2) I am running this for a specific database (see the "database_id" column in the where clause). You can identify a database's database id by running select db_id('insert database name here') -- or, to go the other way, select db_name('insert database id here').
3) You get three columns: table name, index name, and the drop command to get rid of the index. It's sorted by table name / index name. It might be erring on the side of constructive paranoia to generate a create script for each of the indexes you're about to drop before dropping them, in case circumstances require a quick recreate (there's a sketch of that after the query below).

Drop statements for unused indexes:


select
      object_name(i.object_id) as table_name,
      i.name as index_name,
      'drop index [' + i.name + '] on [' + sch.name + '].[' + obj.name + ']' as drop_command
from
      sys.dm_db_index_usage_stats ius join
      sys.indexes i on
            ius.index_id = i.index_id and
            ius.object_id = i.object_id join
      sys.objects obj on
            ius.object_id = obj.object_id join
      sys.schemas sch on
            obj.schema_id = sch.schema_id
where
      ius.database_id = 8 and                                 -- your database id (see db_id() above)
      ius.user_seeks + ius.user_scans + ius.user_lookups = 0 and
      i.name not like 'PK_%'                                  -- skip primary keys (note 1)
order by
      object_name(i.object_id),
      i.name
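If you want to take the "constructive paranoia" route from note 3 and script out the indexes before dropping them, here's a rough sketch of my own (not production-hardened): it rebuilds a basic CREATE INDEX statement for each nonclustered, non-primary-key index, but ignores included columns, filters, and index options, so treat it as a starting point.

select
      'create ' +
      case when i.is_unique = 1 then 'unique ' else '' end +
      'nonclustered index [' + i.name + '] on [' + sch.name + '].[' + obj.name + '] (' +
      stuff((
            select ', [' + c.name + ']' +
                   case when ic.is_descending_key = 1 then ' desc' else '' end
            from sys.index_columns ic
                  join sys.columns c
                        on c.object_id = ic.object_id and
                           c.column_id = ic.column_id
            where ic.object_id = i.object_id and
                  ic.index_id = i.index_id and
                  ic.is_included_column = 0
            order by ic.key_ordinal
            for xml path('')
      ), 1, 2, '') + ')' as create_command
from sys.indexes i
      join sys.objects obj on obj.object_id = i.object_id
      join sys.schemas sch on sch.schema_id = obj.schema_id
where obj.is_ms_shipped = 0 and
      i.type = 2 and              -- nonclustered only
      i.is_primary_key = 0
order by obj.name, i.name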


Wednesday, July 25, 2012

Synchronous Mirroring

There's plenty written about database mirroring already, a cool new feature of SQL which enables you to create a hot standby (read: automatic failover) for a database on your server (read: NOT the whole server)...


One tip: If you are mirroring across any significant distance and require synchronous mirroring, take lag time into account, as your transactions will not commit for the 30-90 ms of round-trip time it takes to push each change out to the target server.
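For reference, the safety setting is what controls whether commits wait for that round trip; a minimal sketch, run on the principal (the database name is a placeholder):

-- Synchronous (high-safety) mirroring: commits wait until the mirror hardens the log
alter database YourDatabaseName set partner safety full

-- Asynchronous (high-performance) mirroring: commits don't wait for the round trip
alter database YourDatabaseName set partner safety off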

Free Database Tuneup

Hi,
We're offering a free, one-hour tuneup of your target database server. In order to take advantage, go to:
www.confio.com/sec
to download monitoring software, then email me (jeff@soaringeagle.biz) for a trial key, and we'll set it up.
No cost, no obligation, but we do find that most of the folks who take advantage fall in love with the tool or the service...


Enjoy,
Jeff Garbus

Data type mismatches causing SQL Server performance issues

I've been talking about this for years, but every once in a while I see someone's nose rubbed in it, and the results can be dramatic.

The SQL Server optimizer, in some versions (this was SQL 2005 SP3), will not properly resolve statistics if the data types of the declared variables & the compared columns don't match.

The solution: Match up your data types.

In the code sample below, I was looking at a query that was accessing SQL Server from Ruby on Rails. The ONLY difference in the code is that in the first example the data type is being passed in as nvarchar(4000), and in the second example it's varchar(255), the actual data type of the column.

You can see that the relative plan costs are 99% to 1% (actual costs are more like a factor of 10,000).

Code (reproduced with the kind permission of my client):

declare @p0 nvarchar(4000)

SELECT this_.clinicalContactID as clinical1_138_0_,
this_.clinicalContactTypeID as clinical2_138_0_,
this_.socialSecurityNumber as socialSe3_138_0_,
this_.contactNumber as contactN4_138_0_,
this_.firstName as firstName138_0_,
this_.lastName as lastName138_0_,
this_.middleName as middleName138_0_,
this_.dateOfBirth as dateOfBi8_138_0_,
this_.age as age138_0_,
this_.gender as gender138_0_,
this_.raceTypeID as raceTypeID138_0_,
this_.maritalStatusTypeID as marital12_138_0_,
this_.createdOn as createdOn138_0_,
this_.createdBy as createdBy138_0_,
this_.lastUpdated as lastUpd15_138_0_,
this_.lastUpdatedBy as lastUpd16_138_0_
FROM ClinicalContacts this_
WHERE this_.socialSecurityNumber = @p0
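The second (fast) version, as described above, differs only in the declaration; the SELECT is otherwise identical:

declare @p0 varchar(255)  -- matches the actual data type of socialSecurityNumber

-- ... same column list as above ...
SELECT this_.clinicalContactID as clinical1_138_0_
FROM ClinicalContacts this_
WHERE this_.socialSecurityNumber = @p0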

Plans:




Thursday, June 14, 2012

5 SQL Tricks for making code disappear




Ask your CIO: the cost of maintaining a $1M project over its lifetime is about $7M. This means that the less code you write, the less expensive it is to maintain your application. Here are 5 quick tips for reducing the amount of code you have to write.



1) Try/Catch logic blocks

SQL Server 2005 introduced the "Try/Catch" code block. Instead of checking the value of @@error after every data modification statement, you can put all of your DML in a "try" block and handle any exceptions in a "catch" block. Note: This is NOT transactional; you'll still need to manage transactions yourself.
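A minimal sketch (the table and column names are made up):

begin try
      begin transaction

      update dbo.Accounts set balance = balance - 100 where accountID = 1
      update dbo.Accounts set balance = balance + 100 where accountID = 2

      commit transaction
end try
begin catch
      -- one handler instead of an @@error check after every statement
      if @@trancount > 0
            rollback transaction

      select error_number() as error_number, error_message() as error_message
end catch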

2) Common Table Expressions (CTE)

We've seen performance go both ways: CTEs can either eliminate performance issues or create them. That said, when properly implemented, we've also used CTEs to avoid some very extensive and complicated recursion logic.
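For example, a recursive CTE can replace loop or cursor code for walking an org chart (dbo.Employees with employeeID/managerID is a made-up table):

-- Walk the management chain from the top down, no loops or temp tables required
with OrgChart (employeeID, managerID, lvl) as
(
      select employeeID, managerID, 0
      from dbo.Employees
      where managerID is null          -- anchor: the top of the hierarchy

      union all

      select e.employeeID, e.managerID, oc.lvl + 1
      from dbo.Employees e
            join OrgChart oc on e.managerID = oc.employeeID
)
select employeeID, managerID, lvl
from OrgChart
order by lvl, employeeID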

3) File Stream

Those of you who are taking .doc, .pdf, .jpg, etc. files apart & putting them into varchar(max) or varbinary(max) columns are not only working too hard, you're also using up too much storage. Databases don't store that information efficiently. Instead, use the FILESTREAM attribute (new with SQL 2008) on a varbinary(max) column and store each file as itself.
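A minimal sketch (this assumes FILESTREAM has already been enabled on the instance and the database has a FILESTREAM filegroup; the table and column names are made up):

-- The document bytes live on the file system; the table just tracks them
create table dbo.Documents
(
      documentID  uniqueidentifier rowguidcol not null unique default newid(),
      fileName    nvarchar(260)    not null,
      content     varbinary(max)   filestream null
)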

4) Merge

Data warehousing (and keeping cubes, etc. up to date) has caused us to write a lot of "take the rows from this table that have been updated, update the corresponding rows over there... unless there isn't a row over there, in which case add a row" logic. Hey, we're programmers, that's what we do. The catch is, you don't have to work that hard anymore. The new "Merge" statement even gives you error processing.
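A minimal sketch of the pattern (the staging and dimension tables are made up):

-- Upsert in one statement: update rows that match, insert the ones that don't
merge dbo.DimCustomer as target
using dbo.StagingCustomer as source
      on target.customerID = source.customerID
when matched then
      update set target.customerName = source.customerName,
                 target.lastUpdated  = getdate()
when not matched by target then
      insert (customerID, customerName, lastUpdated)
      values (source.customerID, source.customerName, getdate());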

5) Hierarchy data type

We've personally written a lot of code to define and manage hierarchy structures. We have one client in particular who might be able to eliminate as much as 75% of their code (all they do is manage hierarchies) by using the new hierarchyid data type.
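A minimal sketch of what that looks like (the table is made up):

-- Store the org chart as hierarchyid values instead of hand-rolled parent/child plumbing
create table dbo.OrgChart
(
      node  hierarchyid primary key,
      name  varchar(100) not null
)

insert dbo.OrgChart values
      (hierarchyid::GetRoot(),      'CEO'),
      (hierarchyid::Parse('/1/'),   'VP Sales'),
      (hierarchyid::Parse('/1/1/'), 'Account Manager')

-- The VP of Sales and everyone under them: one method call, no recursive join logic
select name, node.ToString() as path
from dbo.OrgChart
where node.IsDescendantOf(hierarchyid::Parse('/1/')) = 1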