SQLIO Tips and Tricks for LOB Disk Performance Testing
At the request of one of our blog readers, I wanted to share several tips, tricks, and other basic considerations useful to our technology community (for my earlier post on using SQLIO to baseline performance, see https://blogs.technet.com/b/lobapps/archive/2010/09/03/using-microsoft-sqlio-to-baseline-lob-application-disk-performance.aspx). I’ve grouped these into Use Cases and Runtime Practices.
SQLIO Use Cases
You might wonder if SQLIO is only supported on systems running Microsoft’s Operating Systems or SQL Server databases. In short:
- SQLIO does indeed require a Windows-based host from which to execute. An inexpensive client device is fine in this regard, though. That is, any appropriately configured desktop, laptop, or server connected to the set of disks you’re testing is all that’s required. The client device simply needs to “see” the various LUNs configured on the disk subsystem (e.g., as mapped drives).
- Thus, SQLIO can be used to baseline a disk subsystem that will ultimately be deployed in a Unix, Linux, or Windows environment.
- Of course, you can’t expose a drive in such a way unless it’s formatted with NTFS, which raises the question “why use SQLIO to test this disk subsystem?” when you’re faced with re-presenting and re-formatting LUNs just for testing. In that case, there are other tools you might consider. But once you’re comfortable using SQLIO – a fast, predictable, and very easy method of baselining disk performance – you might find the couple of extra steps not only worth your time but also a way to give you some extra peace of mind.
I also like using SQLIO not only to test disk subsystems but to compare them to one another. These disk subsystems could eventually host Windows-based Oracle or DB2 databases as well as SQL Server.
- SQLIO can be configured to “emulate” (I’m using this term really loosely) any RDBMS at a low level, in that SQLIO can easily be configured to read or write in various block sizes. SQL Server 6.5, anyone? Configure SQLIO to test 2KB blocks. Basic SQL Server and Oracle testing? Go with 8KB blocks. Want to simulate big-block performance (for backup/restore activities, for example)? Configure for 64KB, 256KB, 1MB… you get the picture.
- The idea is to quantify the maximum the disks are capable of pushing or supporting, knowing that your Oracle, DB2, or SQL Server databases won’t achieve this number (due to overhead and a whole slew of other factors).
- Knowing these maximum numbers gives you the ability to compare disk subsystem changes in an apples-to-apples kind of way. Got a disk controller firmware upgrade coming up? Validate the impact of the FW change (before upgrade vs. after upgrade) to get a sense of whether your SQL Server or Oracle database performance might be affected.
- Want to compare how disks hosted on Microsoft Azure or Amazon EC2 (EBS volumes, that is) stack up against your current SAN configuration? Run SQLIO against them (keeping in mind that you ALSO really need to do other cloud performance and throughput testing that’s out of the scope of this blog posting).
- Also – and this is hugely important – tools like SQLIO are naturally akin to “smoke testing” tools rather than tools useful for testing online/interactive user performance. Sure, you can change the queue depth and thread count (explained in my previous SQLIO post), but this may or may not emulate users well. So when you’re looking at your SQLIO test results, think about how those results might reflect hard-running SAP batch processes rather than online SAP users.
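To make the block-size point concrete, here’s how the same random-read test might be run at several block sizes. This is just a sketch: pfile.txt, the queue depth (-o), and the 60-second duration are placeholder values you’d adapt to your own layout.

```shell
REM A sketch (not a definitive script): one random-read test per
REM block size. pfile.txt points at your test data file(s).

REM 2KB blocks (think SQL Server 6.5-era pages)
sqlio -Fpfile.txt -b2 -o32 -kR -frandom -s60

REM 8KB blocks (basic SQL Server / Oracle testing)
sqlio -Fpfile.txt -b8 -o32 -kR -frandom -s60

REM 64KB sequential reads (big-block work such as backup/restore)
sqlio -Fpfile.txt -b64 -o32 -kR -fsequential -s60
```

Run each line on its own first so you can sanity-check the output before stringing them together into a batch file.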
I also find SQLIO useful for comparing different database designs against one another. Many of my older SAP-on-SQL customers have deployed R/3 or ECC across 3 LUNs or data files (the SAP/SQL installation default), for example, while others have chosen 4, 6, 8, 12, 16, and other numbers. While SQLIO can easily be set up to execute against a single data file (which makes for really easy testing), such an approach does not reflect real-world LUN configurations. Thus, SQLIO’s ability to execute tests against multiple data files makes for not only a better baseline test, but a great way to compare the maximum throughput of different configurations.
SQLIO Runtime Practices
Manage your RUNSQL.BAT files. A typical batch file used to execute SQLIO might look like this: sqlio -Fpfile.txt -b8 -o32 -kR -frandom -s60 (execute sqlio, use a parameter file named pfile.txt to point to your data file or files, use an 8KB block size, issue 32 outstanding I/O requests per thread [note that -o sets queue depth, not thread count – the thread count comes from pfile.txt – and you’ll want to reduce it to 4, 8, or 16 if you’re running 16 or more data files], execute read tests, make these random read tests, and run the test for 60 seconds). Oftentimes I’ll combine 4 of these lines into a single RUNSQL.BAT file (one to do random reads, another for sequential reads, one for random writes, and another for sequential writes), and create different versions based on other factors. For example, I often create a collection of RUNSQL.BAT files configured to execute 8KB reads and writes, and another collection for 64KB. And if I’m testing different SQL data file configurations (i.e. 3 data files vs. 6 vs. 16), I’ll create a BAT file for each use case.
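Pulling those four lines together, an 8KB RUNSQL.BAT might look something like this (a sketch built from the flags described above; the file name and 60-second duration are just the example values):

```shell
REM RUNSQL.BAT - one pass of 8KB tests: random and sequential
REM reads, then random and sequential writes.
sqlio -Fpfile.txt -b8 -o32 -kR -frandom -s60
sqlio -Fpfile.txt -b8 -o32 -kR -fsequential -s60
sqlio -Fpfile.txt -b8 -o32 -kW -frandom -s60
sqlio -Fpfile.txt -b8 -o32 -kW -fsequential -s60
```

A 64KB variant is the same four lines with -b64, which is why keeping one BAT file per use case stays manageable.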
Avoid busy times when testing. In the best of cases, you want to test your new disk subsystem (or configuration change) while no one and nothing else is accessing it. In the real world, this isn’t always possible. Running SQLIO is EXTREMELY disk intensive, though, so execute it with care when other users are on the system – they WILL notice. And keep in mind that if you’re running SQLIO while another load is concurrently using the disk subsystem, your results will be skewed (lower!), perhaps much lower than what you would have observed otherwise. So do what you can to test off-hours, preferably during a scheduled downtime window or on a non-production disk subsystem identically configured to production (smart customers deploy “staging” or “pre-production” systems to support this kind of promote-to-production change management and testing).
Standardize your data file size. I started using SQLIO back in 1997, when 100GB SAP databases were considered pretty sizable. During that time, I “standardized” my SQLIO testing around creating 5GB data files. Standardizing in this way helped me compare the test results from new customers with those I obtained from previous customers, making it easy for me to compare various disk configurations (hardware models, LUN/group layouts, RAID selections such as RAID 1 vs. 10 vs. 5, and so on). If you’re going to start using SQLIO and see yourself using it for a while, make sure you go with the same size (and number, if applicable!) of data files. Edit the pfile.txt configuration file to reflect the various data file locations and sizes – one line in the configuration file per data file to be used in the testing.
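For example, a pfile.txt describing three 5GB data files might look like the fragment below. The drive letters and per-file thread count are placeholders; the columns are file path, number of threads, thread affinity mask, and file size in MB (so 5GB = 5120).

```
# path              threads  mask  size(MB)
e:\sqlio\test1.dat  2        0x0   5120
f:\sqlio\test2.dat  2        0x0   5120
g:\sqlio\test3.dat  2        0x0   5120
```

Note that SQLIO creates the files at the stated size on first use, which is also why the very first run afterward shouldn’t be trusted (more on that below).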
Pipe the output. Instead of running RUNSQL.BAT from a command line and manually recording the output statistics from each run, pipe the results to an output file, which is not only saved to disk but can later be pulled into Excel for some real analysis. For example, from the command line you might enter “RUNSQL.BAT > output-run1.txt” to dump the results of the individual performance tests specified in your batch file into a file called output-run1.txt. Alternatively, you can collect all test results in the same output file (both possible and convenient) by redirecting with “>>” instead of “>”, which appends each run’s output to any content already present in the output file rather than overwriting it.
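Once you’ve accumulated a few runs in one output file, even a small script can pull the headline numbers into a CSV for Excel. Here’s a sketch in Python – this is my own helper, not anything SQLIO ships with – and the regular expression assumes the “IOs/sec:”, “MBs/sec:”, and “Avg_Latency(ms):” lines that SQLIO v1.5 prints, so check it against your own output before relying on it.

```python
import csv
import re

def parse_sqlio_output(text):
    """Extract the cumulative throughput metrics from piped SQLIO output.

    Appended output files contain one results section per test run,
    so we return a list with one dict per run found in the text.
    """
    # Each run prints IOs/sec, then MBs/sec, then the latency block;
    # non-greedy matching with DOTALL walks run by run through the file.
    pattern = re.compile(
        r"IOs/sec:\s*([\d.]+).*?"
        r"MBs/sec:\s*([\d.]+).*?"
        r"Avg_Latency\(ms\):\s*(\d+)",
        re.DOTALL,
    )
    return [
        {
            "ios_per_sec": float(ios),
            "mbs_per_sec": float(mbs),
            "avg_latency_ms": int(lat),
        }
        for ios, mbs, lat in pattern.findall(text)
    ]

def write_csv(runs, path):
    """Dump the parsed runs to a CSV file Excel can open directly."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(
            f, fieldnames=["ios_per_sec", "mbs_per_sec", "avg_latency_ms"])
        writer.writeheader()
        writer.writerows(runs)
```

Usage is simply parse_sqlio_output(open("output-run1.txt").read()) followed by write_csv() – one CSV row per SQLIO test in the file.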
Test each use case twice or more. Don’t just test once and be done – execute each SQLIO test at least twice to ensure the results are reasonably consistent (especially useful if you’re running on a disk subsystem currently being used by other users or processes). And don’t ever take the test results that immediately follow the data file creation process – the first run after the files are created is naturally (and greatly) skewed toward lower performance.
Don’t forget to test your Log file LUN. Oftentimes we get so focused on verifying our data files are optimally configured for performance that we forget about testing the LUN(s) used to host the SQL logs (or Oracle redo and archive logs). Test them all, using the appropriate block size (e.g., 4KB). And if you really want to do a thorough job, test the LUNs hosting your Windows pagefiles as well.
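Since log writes are overwhelmingly sequential, a log-LUN test might look something like the line below. This is a sketch: pfile-log.txt is a hypothetical second parameter file pointing at a test file on the log LUN, and the 4KB size just follows the suggestion above.

```shell
REM Sequential-write test against the LUN that will host the logs.
REM pfile-log.txt is a separate parameter file for that LUN.
sqlio -Fpfile-log.txt -b4 -o8 -kW -fsequential -s60
```

Keeping a separate parameter file per LUN role (data vs. log vs. pagefile) makes it easy to rerun each test without editing files between runs.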
Corroborate your results. As I’ve written in other posts, I’m a big believer in using multiple tools to help ensure my results are fairly consistent. In the past, if I was using a tool like SQLIO, I might also employ NTiogen, SQLIOstress, WinBench, Iometer, or another tool just to give me a bit more confidence in the overall numbers. The testing process then becomes something akin to executing a suite of tests. This seemed especially important when it came to testing completely different kinds of disk solutions (such as local drives vs. SANs, or network appliances vs. Amazon vs. Azure or other cloud hosting models). The more consistency in your output, the more sound your results and recommendations... which should ultimately translate into improved real-world application performance.
As tempting as it may be, don't jump to conclusions too quickly! Once you've started using SQLIO, you'll be tempted to quickly start comparing the results of one test with the results of others. If you've only changed ONE factor between tests, feel free to draw all the conclusions you like. But if you start testing different SANs or disk subsystems against one another, keep in mind that things like cache size, cache configuration, the number and speed of HBAs, underlying RAID heuristics, and subtle differences in disk controllers, switches, and disk drives may (and do!) have a very dramatic effect on your test results.
I’m sure there’s more… so stay tuned and check back on this posting every now and then. As I think of other SQLIO tips and tricks, I’ll continue to append them here.
(added content 10/3 and 10/7)
Have a great week,
george