In my career, I’ve had the pleasure of doing a few dozen database performance studies. The conditions were a little different each time, and so were the outcomes. However, they all had the same underlying complaint: things were too slow.
I often get asked about performance tuning and how to make a database run faster. The topic is so broad that the answer usually starts with “it depends”, which is not very satisfying. However, there are some very standard tests and techniques that are a great place to start. Begin with whatever is most likely to be the cause, or simplest and quickest to test.
The following is widely considered the industry-standard process for troubleshooting performance problems on a data-driven app:
*** First, be aware that all of these tests will be more revealing if they are performed while the database is at its slowest. Naturally, this will make performance even worse. So please be sure that your users are aware of this and acknowledge the benefits of performing your tests at such an inopportune time.
- Check your DB indexes – Without good indexes on your DB, there is no hope of getting anything to run faster. Always start here, every time (and repeat as needed). For SQL Server: run SQL Server Profiler to gather information about what is happening inside, and capture the trace to a file (not a DB table). Then run the Database Engine Tuning Advisor (wizard) to analyze the trace gathered by the Profiler. Keep in mind that the tuning wizard may suggest a LOT of indexes, and each index will use more RAM on your server. If there is already a lack of RAM, then adding indexes will make things much worse. Also, some of the recommended indexes are poor choices. So, you will get better results if you review the list of recommended indexes (and statistics) with discernment and prudence.
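The tooling above is SQL Server-specific, but the effect of a good index is easy to demonstrate in any engine. Here is a minimal sketch using Python’s built-in sqlite3 (standing in for SQL Server; the table and column names are invented for the example). It shows the query plan flipping from a full table scan to an index seek once the index exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i % 100) for i in range(10_000)])

def plan(sql):
    # EXPLAIN QUERY PLAN reports how SQLite intends to execute the statement
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

query = "SELECT * FROM orders WHERE customer_id = 42"
print(plan(query))   # before: a full SCAN -- every row is examined

conn.execute("CREATE INDEX idx_customer ON orders (customer_id)")
print(plan(query))   # after: SEARCH ... USING INDEX -- only matching rows
```

The same before/after check works on SQL Server by comparing the estimated execution plan before and after applying a recommendation from the tuning advisor.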
- Check resource utilization – CPU (via TaskMan) – If your CPUs are maxed out, then you are very lucky: that is an obvious problem to fix. (In contrast, if your CPUs are never maxed out, you have a subtler problem that is much more difficult to solve.) So, if your CPUs average more than 40% over any 10-minute period, or are pegged during a period of obvious slowness, consider one of the following actions:
A) Add or upgrade your CPUs
B) Optimize code
C) Split your server into multiple servers (half the load stays, half moves to the other machine)
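The “40% average over any 10-minute period” rule can be checked mechanically once you export counter samples from TaskMan or PerfMon. This is a hypothetical sketch (the function name and the 15-second sampling interval are my own choices, not a standard tool) that flags a sustained-high CPU from a list of percentage samples:

```python
def cpu_sustained_high(samples, threshold=40.0, window=40):
    """Return True if any `window` consecutive samples average above
    `threshold` percent. With 15-second samples, window=40 is ten minutes."""
    if len(samples) < window:
        return False
    running = sum(samples[:window])        # rolling sum, one pass
    if running / window > threshold:
        return True
    for i in range(window, len(samples)):
        running += samples[i] - samples[i - window]
        if running / window > threshold:
            return True
    return False

calm = [20, 25, 30] * 40        # hovers around 25% -- fine
busy = [35, 90, 85, 95] * 30    # averages roughly 76% -- investigate
print(cpu_sustained_high(calm))   # False
print(cpu_sustained_high(busy))   # True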
- Check resource utilization – RAM (via PerfMon) – SQL Server will always try to consume all of the RAM on a server (unless you limit it), so total memory use alone tells you little. You need to check other counters, such as [Page File Usage] (TaskMan), [Available Memory] (TaskMan), [Memory - Page Faults/sec], [Memory - Pages/sec], and [Cache - Data Flushes/sec]. If any of these show a lot of activity (lots of spikes that go off the edge of the chart), then a lack of RAM may be your problem.
Solution – Consider adding RAM or splitting your database.
- Check HDD usage (PerfMon) – If your disk queue length goes above 1 very often (more than once per 10 seconds), or if your [% Disk Time] is high while your CPUs hover around 90% (but never reach 100%), then it is likely that you have a disk I/O problem. (By the way, I see this problem VERY often. If your CPUs aren’t getting to 100% during periods of slowness, then HDD throughput is usually the cause.)
Solution – Here are your choices (by cost):
A) Add more disk drives and split your database files (mdf, ldf) onto those other disks. SSDs are wicked fast, but die quickly under high utilization.
B) Change your disk config. RAID 0 is the fastest, and it gets even faster as you add more disks. However, it also becomes more susceptible to failure with each disk you add. So, you might want to go with RAID 10 instead (which doubles your cost, footprint, and energy usage).
C) Upgrade your disk bandwidth. Old SCSI3 was fast, but not as fast as SATA3, iSCSI, or FC (Fibre Channel). Whatever you have, there is always something faster (and more expensive).
* Also note: just because you are using a SAN doesn’t mean you are using it well. Ask your SAN rep about the fastest configuration for SQL Server (or your equivalent).
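The “queue above 1 more than once per 10 seconds” rule can also be automated once you log the PerfMon disk-queue counter. A hypothetical sketch (function name and one-second sampling interval are my own, not a standard tool) that rates a series of queue-length samples:

```python
def disk_queue_suspect(samples, interval_s=1.0, limit=1.0):
    """Given periodic disk queue length samples taken every `interval_s`
    seconds, return True if the queue exceeds `limit` more often than
    once per 10 seconds on average -- a likely disk I/O bottleneck."""
    exceed = sum(1 for q in samples if q > limit)
    duration = len(samples) * interval_s
    return duration > 0 and exceed / duration > 1 / 10

healthy = ([0.2] * 19 + [1.5]) * 5    # one brief spike every 20 seconds
saturated = [2.4, 3.1, 0.5, 2.0] * 25 # queue above 1 most of the time
print(disk_queue_suspect(healthy))    # False
print(disk_queue_suspect(saturated))  # True
```

An occasional spike is harmless; a queue that stays above 1 means requests are routinely waiting for the disk, which matches the “CPUs never reach 100%” symptom described above.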
- Check network bandwidth (TaskMan or PerfMon) – You should not be using more than 40% of your network bandwidth.
Solution – Oddly enough, if your DB server and your web server (or two) each have multiple network cards, they will balance the traffic across the cards. When I first heard of this, it sounded like a prank, but it does work. Also, if you use iSCSI, NEVER EVER run your iSCSI traffic over the same network as your regular TCP/IP traffic. That is bananas. If you can afford 10 Gb Ethernet for your iSCSI, then do it (or faster, if it is available/budgetable).
- Optimize code – This is a tough one, because you could spend your whole life on this task. So, cover the five checks above before you dive into this ocean. If you have eliminated all of them, then this is your last resort.
Solution – There are some simple and obvious things to look for in code. Always look for queries inside loops (so evil) or code that makes dozens of separate calls to the database. Here is the quick list of fixes:
A) Consolidate DB calls (make one trip to the DB instead of many)
B) Cache any data that reasonably can be cached (or stored in sessions)
C) Consolidate DB calls that look up values and then use them in subsequent queries
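Fixes A and C are both the classic “N+1 query” repair: replace many tiny round trips with one set-based query. A minimal sketch using Python’s built-in sqlite3 (the table and names are invented for the example); the trace callback simply counts how many statements actually hit the database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 101)])

counter = {"n": 0}
conn.set_trace_callback(lambda sql: counter.update(n=counter["n"] + 1))

ids = list(range(1, 51))

# Anti-pattern: a query inside a loop -- one round trip per id
counter["n"] = 0
slow = [conn.execute("SELECT name FROM users WHERE id = ?", (i,)).fetchone()[0]
        for i in ids]
n_slow = counter["n"]

# Consolidated: one trip with an IN list
counter["n"] = 0
marks = ",".join("?" * len(ids))
fast = [row[0] for row in conn.execute(
    f"SELECT name FROM users WHERE id IN ({marks}) ORDER BY id", ids)]
n_fast = counter["n"]

print(n_slow, n_fast)   # 50 1 -- same results, fifty times fewer trips
assert slow == fast
```

On a real network, each round trip adds latency on top of query time, which is why consolidation often pays off more than tuning the individual queries.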
If you are left with “optimize code”, let me (strongly) recommend a tool called Redgate ANTS Performance Profiler. It looks at the different layers of your code and measures execution time for each call and subroutine. It will spare you the nightmare of spending all of your time optimizing a block that takes 1 ms while ignoring a block that takes 100 ms.
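ANTS is a .NET tool, but the principle (measure before you optimize) is language-neutral. A toy sketch of the idea in Python, with invented function names and a deliberate hotspot, accumulating wall-clock time per call so the worst offender surfaces:

```python
import time
from collections import defaultdict
from functools import wraps

timings = defaultdict(float)

def timed(fn):
    # Accumulate wall-clock time per function -- a (very) small profiler
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            timings[fn.__name__] += time.perf_counter() - start
    return wrapper

@timed
def cheap_step():
    sum(range(100))            # ~1 microsecond of work

@timed
def expensive_step():
    time.sleep(0.05)           # stands in for the real hotspot

for _ in range(10):
    cheap_step()
    expensive_step()

worst = max(timings, key=timings.get)
print(worst)   # expensive_step -- optimize here first
```

Without measurements like these, intuition tends to send you to the 1 ms block; a profiler sends you straight to the 100 ms one.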
Finally, if you are trying to tune a web app, my strongest recommendation ever is to listen to the optimization advice from Richard Campbell. That video might be the most important 60 minutes you spend this year. I found it deeply enlightening, because I used to go straight for DB tuning, but on a web app, tuning the DB can earn you pennies (of performance) when there are dollars to be had elsewhere.
Beyond these recommendations, I can only tell you that things get more specific to your architecture, code, and platform. 99% of the time, these six steps helped me resolve a performance problem within a week and increased performance by at least 20%, and sometimes by as much as a factor of twenty. Yes, 2000%. That is not a typo!
Successful optimization projects feel great. Get an optimization project on your ToDo list (maybe even if you don’t think you need one). They are one of the sweetest joys of being elite.