The Brute Tester
A co-worker of mine was working on a tool that watches our release share for a test result file to appear and then publishes it to a TFS server. The "trx" results file in question is created on the Xbox console by our custom test harness, and historically it was around 10 megabytes (Unicode) for a typical daily regression run. Yet on a Friday afternoon (when it actually happened to be nice outside), I get the following email from the tool writer:
Any idea why daily\testResults.trx is 1.10GB? It’s too large to load and publish.
I'm thinking to myself: "One gig? Ay caramba! Did something go into a bad loop or something?"
I go out to the release server and start digging through the various log files, and sure enough, that thing is huge. Luckily we also produce a little rollup by component area, which is written out to a text file and contains only the totals. The previous day's totals came out to 6,422 tests for this suite. Today's number: an unorthodox 454,216. Yeah, I guess adding 447,794 test cases will definitely blow out the size of a TRX file.
Now to dig a little deeper. The four hundred and forty-seven thousand test cases were all in one feature area. Looking at the test code, it was definitely following a brute-force pattern over multiple axes; the first instance I noticed was O(n³).
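To make the growth concrete, here is a minimal Python sketch of that kind of nesting. The three axes and their values are made up for illustration (the real suite's parameters aren't shown here); the point is only that every value you add to one axis multiplies the total.

```python
from itertools import product

# Hypothetical axes, invented for this example; not the real suite's parameters.
widths = range(10)
heights = range(10)
formats = range(10)

# Brute force: one test case for every combination of every axis.
cases = list(product(widths, heights, formats))

# Three axes with 10 values each give 10 * 10 * 10 combinations.
print(len(cases))  # 1000
```

Bump each axis from 10 values to 77 and the same three loops produce over 450,000 cases.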
Cranking through half-a-million test cases for a single feature is probably not the best use of your time (or the computer's). And I would hate to be a developer asked to run this suite in order to isolate a failure down deep in a nest of loops ("sorry Joe Developer, but it starts to fail around the 300,000th test case, good luck with that breakpoint"). Instead you should be trying to pick the fewest (reasonable) cases that adequately test the component. Here are some ideas to get you started:
· Use Equivalence Classes/Partitioning. You can even use Code Coverage to help determine groups.
· Use All-Pairs or Pairwise pruning techniques (Microsoft's PICT tool can be downloaded for free).
· Use data generation (e.g. fuzzing) or data-driven techniques to increase your coverage over time. You don't need to run every single variation all at once; instead, you increase your surface area a little bit each day.
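As a rough illustration of the pairwise idea, here is a small greedy sketch in Python. This is not how PICT works internally, and the function name and sample axes are my own inventions for the example. It simply keeps picking the combination that covers the most not-yet-seen value pairs until every pair has appeared at least once.

```python
from itertools import combinations, product

def pairs_of(case, names):
    """Every (axis, value) pair combination that a single test case exercises."""
    return {((a, case[a]), (b, case[b])) for a, b in combinations(names, 2)}

def pairwise_cases(axes):
    """Greedily pick cases from the full product until all value pairs are covered.

    `axes` maps an axis name to its list of values. This needs at least two
    axes to be useful and builds the full product up front, so it is only
    meant for small illustrative inputs.
    """
    names = list(axes)
    candidates = [dict(zip(names, combo)) for combo in product(*axes.values())]

    # Every pair of values (across every two axes) that must show up somewhere.
    uncovered = set()
    for case in candidates:
        uncovered |= pairs_of(case, names)

    chosen = []
    while uncovered:
        # Take the candidate that knocks out the most remaining pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c, names) & uncovered))
        chosen.append(best)
        uncovered -= pairs_of(best, names)
    return chosen

# Hypothetical axes, invented for this example.
axes = {
    "resolution": ["480p", "720p", "1080p"],
    "format": ["rgb", "yuv", "bgra"],
    "filter": ["point", "linear", "aniso"],
}
full = list(product(*axes.values()))
reduced = pairwise_cases(axes)
print(len(full), "brute-force cases vs", len(reduced), "pairwise cases")
```

Every pair of values still gets exercised together at least once, but with far fewer cases than the full product, and the gap widens quickly as you add axes or values.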