Selecting test data
A while ago, I tried to get some opinions on this subject in the (now-defunct) folder about Black-box testing. There were no answers, so I'm giving it another try.
When selecting test data (values) for black-box tests (during both unit tests and system tests), we have used equivalence partitioning for a long time. It is a good technique, and more recently we have also tried to apply boundary value analysis and domain test selection criteria.
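To make the two techniques concrete, here is a minimal sketch of equivalence partitioning combined with boundary value selection, using a hypothetical "age" field that must accept values from 18 to 65 inclusive (the field and its limits are made up for illustration):

```python
# Equivalence partitioning splits the input space into classes
# (invalid-below, valid, invalid-above); boundary value analysis
# adds the values that sit exactly on and just outside each edge.

def boundary_test_values(low, high):
    """Return test inputs for a closed integer range [low, high]:
    one representative per equivalence class, plus the values on
    either side of both boundaries."""
    invalid_below = low - 1          # just outside the lower bound
    invalid_above = high + 1         # just outside the upper bound
    typical = (low + high) // 2      # representative of the valid class
    return [invalid_below, low, typical, high, invalid_above]

def accepts_age(age):
    """The (hypothetical) system under test."""
    return 18 <= age <= 65

for value in boundary_test_values(18, 65):
    print(value, accepts_age(value))
```

The point is that five well-chosen values cover all three equivalence classes plus both boundaries, instead of sampling the range at random.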
It would be interesting to hear some opinions from other people on how you get along with test data selection.
Is it common to use a systematic selection method like equivalence partitioning or domain testing, or is it mostly ad-hoc (error guessing) methods that apply?
Re: Selecting test data
Björn, I definitely agree with boundary values. Basically breaking up your data into class domains has always been helpful to me - which is pretty much the idea behind boundary values and equivalence partitioning. Personally, I always try to do this rather than relying on "ad-hoc" methods as you state. In my experience, however, most places I have gone were doing just that: "ad-hoc." I do not like this term, however, which is why I quote it, and I prefer the term random. People mistake the definition of ad hoc. It means "for the specific purpose, case, or situation at hand and for no other" and does not mean "random," like many people seem to think.
I usually have to do a big presentation on data integrity issues and what it means to check for data that is valid and invalid. This can apply to simple things like checking whether the number of characters in a limited-length field can be reached and exceeded. But it can also be things like checking one type of credit card number against another type of credit card database; for example, take an American Express Optima and run it against a Discover. It also extends to checking various permutations and combinations of inputs. A good book out there ("Robust Engineering: Learn How to Boost Quality While Reducing Costs & Time to Market") deals with some ways of looking at permutations/combinations. (I would not necessarily recommend this book in general simply because it is more concerned with hardware or assembly-line topics, but it does cover some things, like orthogonal test casing, loss functions, and other general "data" issues, regardless of type.)
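As a small sketch of what combination checking looks like in practice, the snippet below exhaustively enumerates input combinations with `itertools.product`. The parameter names and values are invented for illustration; techniques like orthogonal arrays or pairwise selection exist precisely to cut this count down when full enumeration gets too large:

```python
from itertools import product

# Hypothetical input dimensions for a payment-processing test.
card_types = ["Amex", "Discover", "Visa"]
currencies = ["USD", "EUR"]
amounts = [0.01, 100.00, 9999.99]

# Full cross-product: every combination of every dimension.
combos = list(product(card_types, currencies, amounts))
print(len(combos))  # 3 * 2 * 3 = 18 full combinations

for card, currency, amount in combos:
    pass  # feed (card, currency, amount) to the system under test
```

With more dimensions the product explodes multiplicatively, which is why the orthogonal-array approach mentioned in the book matters.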
Another great book I use for this (and one I might recommend more heartily but with the same caveats as above) is "Improving Data Warehouse and Business Information Quality: Methods for Reducing Costs and Increasing Profits" by Larry English. He does not cover any testing, per se, but rather the benefits of doing good data integrity testing and the general process behind this based not only on his experience but also that of information quality gurus out there.
- - - - -
Erich, regarding exploratory testing I think there can be great benefit to it. A great deal of issues can be found this way. Again, as I mentioned above in my comment to Björn, "ad-hoc" is traditionally defined incorrectly and thus there is a stigma against this that there really should not be. Anyway, my point is that there is, to my way of thinking, benefit to this type of testing and I have found great issues with exploratory types of testing. But I agree with you: both types of testing are needed. A firm base of systematic testing (via test and use cases) is necessary. But exploratory testing can find many little gaps. Also consider that this type of testing can actually help you make your test cases more robust by finding things to add that were not remarked upon in the specification or just did not occur to you at the time of writing the test cases.
Test casing is great but, by its nature, is generally rigid in structure (unless you perform more user-pathing types of test case structures) and does not always accommodate the creativity that a human being brings to a test situation based on events as they happen. But again, those creative human tests that pop up based on circumstance should then be captured as test and/or use cases.
Re: Selecting test data
<BLOCKQUOTE><font size="1" face="Verdana, Arial, Helvetica">quote:</font><HR>Originally posted by Bjorn:
Is it common to use a systematic selection method like equivalence partitioning or domain testing, or is it mostly ad-hoc (error guessing) methods that apply?<HR></BLOCKQUOTE>
I don't have many specifics to add to this; I just wanted to write some comments on the idea of data picking and on testing in general.
With our data-driven tests, that is, test cases that rely on actual data values to operate, we try to take a systematic approach. I find this much more appealing to my "formal" approach to testing. Typical data values are chosen, along with maxima and minima, other possible stress points in the application, and then various normal-usage values.
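A minimal sketch of that data-driven style, with a table supplying the typical, minimum/maximum, and stress values while the test logic stays fixed (the validation function and its length limits are hypothetical):

```python
# Hypothetical system under test: usernames must be 1..20 characters.
def validate_username(name):
    return 1 <= len(name) <= 20

# The data table drives the test; adding a case means adding a row.
test_data = [
    # (input value,   expected, description)
    ("",              False, "minimum length - 1"),
    ("a",             True,  "minimum length"),
    ("typical_user",  True,  "typical value"),
    ("a" * 20,        True,  "maximum length"),
    ("a" * 21,        False, "maximum length + 1"),
]

for value, expected, description in test_data:
    result = validate_username(value)
    assert result == expected, description
print("all data-driven cases passed")
```

The same loop runs unchanged as the table grows, which is what makes this approach appealing for systematic value selection.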
But on a more general note about testing, I'd be interested to know how people feel about a more traditional, or formal, approach to testing versus a more ad hoc type of test methodology like exploratory testing and the like. I personally feel that an application is best tested when subjected to BOTH types of testing, but I have found that this rarely happens. In my experience, test teams tend to favor one or the other.
I feel that our company is a pretty good representative of good overall testing in that we try to do both. First, a couple of background notes. Our company is small, 60 people, 40 of whom are in engineering, and our application is also relatively small. Our QA team does pretty formal testing but again, on a small system, even this kind of testing covers things pretty well. In addition to this, we have "bug hunts," which are company-wide bug parties. The company brings in food and beer for everyone, and we spend the afternoon just attacking the system from every which way, trying new things, submitting bugs, etc. This represents a pretty good "ad hoc" testing scenario, and it has historically resulted in lots of defects being found and subsequently fixed.
Re: Selecting test data
Erich and Jeff,
thanks a lot for your opinions!
I think we totally agree about the importance of using test techniques that focus on both conformance-directed and fault-directed tests. Just like Erich describes, we use both formal techniques like partitioning AND randomly (thanks, Jeff, for the correction; I actually meant random) selected inputs to test both required functionality (including the exceptional flows that are specified) and really weird paths through the application.
We have a well-established process to follow at the system level, and we are rather proud to know that our work is well performed.
The main problem I have been struggling with during the past two years is improving our techniques for unit tests. It takes time to get all developers to understand and use formal techniques. This is where domain selection tests come in. The technique is not as easy as equivalence partitioning, but the results are very good. I have shown on a few software components that it is possible to test all functions and to reach code coverage of about 98-100% with this technique and a few others in combination.
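For readers unfamiliar with domain testing, the core idea is choosing "on points" and "off points" for each boundary of the input domain. Here is a minimal sketch for a hypothetical integer boundary condition `x >= threshold` (the predicate and threshold are made up for illustration, not taken from our components):

```python
# Domain testing's on/off-point idea for a closed boundary x >= t
# over integers: the on point lies exactly on the boundary and
# satisfies the condition; the off point is the closest value
# that violates it.

def on_off_points(threshold):
    on_point = threshold        # satisfies x >= threshold exactly
    off_point = threshold - 1   # closest integer that violates it
    return on_point, off_point

def in_domain(x, threshold=10):
    """The (hypothetical) boundary condition under test."""
    return x >= threshold

on, off = on_off_points(10)
print(in_domain(on), in_domain(off))  # True False
```

Testing exactly at and just off each boundary is what catches off-by-one mistakes in the implemented condition, which is where this technique earns its extra effort over plain partitioning.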
I have based our method descriptions on patterns from the book "Testing Object-Oriented Software" by Robert V. Binder. This book is the best I have read on the subject, and I certainly recommend it to others.
Best regards Björn