  1. #1
    Senior Member
    Join Date: Dec 1999
    Location: Chicago, Illinois, USA
    Posts: 2,537

    Effective Testing/Testing Effectively

    Evaluating Tests and Evaluating Testing

    Note the emphasis in the phrase above: the distinction is between evaluating tests and evaluating testing. To take an example, evaluating one test case is generally extremely easy. The product (or that area of the product) either passes or fails the test case. But how do we know that the test cases we ran satisfied the overall purpose of the requirements? In other words, how do we show that the test cases demonstrated requirements, found faults/failures, and exercised code?

    To evaluate how well the test cases fulfilled this tripartite purpose, we need a measure of overall test quality. But what does that mean? In general, if we want to measure test quality we include three elements in our measure: requirements coverage, failure coverage, and code coverage. Notice I said we "want" to do this; in practice it is often not what gets done. The latter two are frequently left off by many testers, even though they consider what they are doing "testing". I would disagree, at least once we start speaking about how effectively they are testing. Even requirements coverage is often done poorly, which is odd because it is usually the easiest of the three. So let us consider the easiest first and, after that, the two that are less often done.

    Requirements Coverage
    Generally an effective tester will count how many requirements must be validated. Call this VR (Validated Requirements). Then an effective tester will attempt to trace each requirement to all test cases that exercise that particular requirement. After that, effective testers count all of the test cases that have passed. When all test cases that exercise one specific requirement pass, that requirement is said to have passed. The count of all passed requirements can be called RQ (Requirements Passed). As a final action, and to report on requirements coverage, effective testers calculate a requirements coverage factor by dividing VR by RQ. This is an objective measurement of how thoroughly requirements are demonstrated. (Caveat: One would and should expand this metric to include input domain and output domain coverage because while the above tells you the thoroughness of testing, it does not necessarily speak to its actual comprehensiveness.)
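
    To make the tracing and the pass/fail determination concrete, here is a minimal sketch (the requirement IDs, test case names, and results are purely hypothetical):

        # Map each requirement to the test cases that exercise it (the trace).
        trace = {
            "REQ-001": ["TC-01", "TC-02"],
            "REQ-002": ["TC-03"],
        }

        # Latest run results: test case name -> True if it passed.
        results = {"TC-01": True, "TC-02": True, "TC-03": False}

        # A requirement passes only when every test case traced to it has passed.
        passed = [req for req, cases in trace.items()
                  if cases and all(results.get(tc, False) for tc in cases)]

        print(passed)  # ['REQ-001'] -- REQ-002 does not pass because TC-03 failed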

    Fault/Defect Coverage
    Now what about this question: how well did your test cases find failures? Many testers count failures, but that is an easy task; a truly effective tester should also try to estimate (via prediction or forecasting methods) how many defects (i.e., the causes of failures) were in the product before testing. Why is that a good thing? This is a crucial point: when testers do not know how many defects were present to begin with, they cannot make an accurate assessment of how well they have probed for defects or failures. It is such a crucial point that it often surprises me that there is ever debate about it. (Debate about how to do it is one thing and is healthy. Debate about whether it should even be done is another thing entirely, and usually is counterproductive.)

    By a simple extension of this logic, such testers also cannot know how many defects or failures were (or might have been) left in the product to be passed on to the end-users. If testers cannot predict how many failures customers might potentially find, testers simply cannot evaluate quality in terms of failure or defect coverage. The point is simple: if you do not have estimates of how many faults/failures you might find in a given product, you cannot fully determine the reliability of the test effort, because you lack the crucial baseline to compare your results against.
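
    There are several forecasting approaches; one classic (and admittedly simplified) one is error seeding, along the lines of the Mills model. A rough sketch, with invented numbers, just to show the shape of the estimate:

        # Error-seeding estimate of total indigenous (real) defects.
        # All of these numbers are hypothetical.
        seeded_total = 50    # defects deliberately injected before testing
        seeded_found = 30    # injected defects the test cases actually caught
        real_found = 120     # genuine (non-seeded) defects found by the same tests

        # Assume the tests catch seeded and real defects at roughly the same rate.
        estimated_real_total = real_found * seeded_total / seeded_found
        estimated_remaining = estimated_real_total - real_found

        print(round(estimated_real_total))  # ~200 defects estimated to have been present
        print(round(estimated_remaining))   # ~80 defects estimated still latent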

    Code Coverage
    Code coverage is also important, but it is another thing that most testers I encounter think of incorrectly. They tend to treat "code coverage" as synonymous with "code testing". That could not be more incorrect. Code testing is all about creating test cases that exercise the code. Code coverage is all about evaluating test cases. That is an entirely different focus. So the idea is (hopefully) to write test cases from requirements specifications. When you run the requirement-based test cases against actual code (whether at a unit or system level), you then make measurements to determine which statements and branches have been exercised (covered). That is where coverage comes in.
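
    As one way to take those measurements, here is a minimal sketch assuming the coverage.py package; the apply_discount function is just a hypothetical stand-in for code under test:

        # Measure statement and branch coverage while requirement-based tests run,
        # using the coverage.py package (pip install coverage).
        import coverage

        def apply_discount(total, is_member):      # tiny stand-in for code under test
            if is_member:
                return total * 9 // 10
            return total

        cov = coverage.Coverage(branch=True)       # track branches as well as statements
        cov.start()

        assert apply_discount(100, True) == 90     # a requirement-based test case
        # note: the is_member=False branch is never exercised by this test

        cov.stop()
        total_percent = cov.report()               # prints per-file coverage, returns total %
        print(f"Overall coverage: {total_percent:.1f}%")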

    Summation
    Evaluating your tests is very different from evaluating your overall testing. I have found that test professionals tend to concentrate more on the former than on the latter. I think fault and code coverage (by their correct definitions) are often the most overlooked elements of effective testing practice.

    ------------------

  2. #2
    Junior Member
    Join Date: Sep 2001
    Location: France
    Posts: 20

    Re: Effective Testing/Testing Effectively

    Hello Jeff,

    I hope I will not be "counterproductive", but it seems to me that while requirements coverage and code coverage can be supported by scrupulous measurements, an estimate of fault presence for fault coverage has to be handled very cautiously.

    Reading your arguments, I understand you are using "defects" in a broad sense. They can be code defects, design defects, requirement defects, or even environment defects (and probably other types I am not thinking of).

    At least for the first type (code defects in the sense of non-conformance with respect to design or requirement statements), a kind of fault coverage can be achieved using fault injection techniques. Even if in this case "fault coverage" is used synonymously with "test sensitivity", it is still a valuable measure of test quality when done with a real "test the test" objective.
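
    As a tiny sketch of what I mean by test sensitivity (the numbers are invented):

        # "Test sensitivity": the fraction of deliberately injected faults
        # that the existing test cases detect. Numbers are hypothetical.
        faults_injected = 40
        faults_detected = 34

        sensitivity = faults_detected / faults_injected
        print(f"Test sensitivity: {sensitivity:.0%}")  # 85%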

    But of course, it is costly…

    And, for my own part at least, I do not know of any comparable techniques for the other types of faults.

    Well, in the end, even if I was surprised at first, I can see that the approach can be fruitful.

    Thanks,
    Fred


    ------------------
    Why waste your time doing the minimum?

    frederic.r.mercier@wanadoo.fr

  3. #3
    Moderator Joe Strazzere
    Join Date: May 2000
    Location: USA
    Posts: 13,170

    Re: Effective Testing/Testing Effectively

    Interesting thoughts, Jeff.

    Originally posted by JeffNyman:
    "As a final action, and to report on requirements coverage, effective testers calculate a requirements coverage factor by dividing VR by RQ. This is an objective measurement of how thoroughly requirements are demonstrated."

    Not that it matters, but I have almost always seen this expressed as RQ divided by VR, so that 1.0 means tests for all requirements have passed.

    I'm not sure I have ever heard this called a "requirements coverage factor", though. If I have tests to cover ALL of the requirements, I would imagine this would be "complete coverage". But, if some of the tests are failing, this factor would lead one to believe that we have "incomplete coverage".

    In your model, how do you distinguish between "having all the test cases needed to validate requirements" and the state where all those tests have passed?

    ------------------
    - Joe (strazzerj@aol.com)

    [This message has been edited by jstrazzere (edited 05-13-2002).]
    Joe Strazzere
    Visit my website: AllThingsQuality.com to learn more about quality, testing, and QA!

  4. #4
    Senior Member
    Join Date: Dec 1999
    Location: Chicago, Illinois, USA
    Posts: 2,537

    Re: Effective Testing/Testing Effectively

    Frederic:

    Definitely not counter-productive at all. I appreciate your thoughts. I think you and I agree about the scrupulous measurements part. You are also correct: in that post I was not distinguishing between different types of defects, just using the broad notion of fault/failure for the purposes of presenting my point. Also: I do agree about fault injection and error seeding methods but those can be costly as you surmise, although often not as costly as something like mutation testing.

    Joe:
    Regarding the terminology of "requirements coverage factor": that actually comes more from the thinking of John Musa and Capers Jones, both of whom used a similar term (although in different ways, I think). And, to be sure, as if we needed further proof: I am an idiot. I meant it the reverse of how I wrote it, just as you stated it (RQ / VR), not (VR / RQ) as I have it. Chalk that up to me writing in a hurry. Sorry 'bout that.
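
    So, with the correction in place, the factor is just this (a trivial sketch with invented counts):

        # Requirements coverage factor, with the corrected orientation: RQ divided
        # by VR, so that 1.0 means every requirement to be validated has passed.
        VR = 50   # hypothetical: requirements that must be validated
        RQ = 42   # hypothetical: requirements whose traced test cases have all passed

        factor = RQ / VR
        print(f"Requirements coverage factor: {factor:.2f}")  # 0.84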

    As far as where you say "If I have tests to cover ALL of the requirements, I would imagine this would be 'complete coverage'", it may be "complete" in terms of thoroughness relative to the number of requirements, but it might not be the most comprehensive. (I am not saying this is the case for you - just that the distinction is important.) I have seen test cases that cover all the requirements but did not do so in the most comprehensive fashion - the kind that would give greater confidence under a variety of conditions.

    Regarding the comment about "having all the test cases needed to validate requirements": that is partly a matter of test case estimation and prediction techniques. But also understand that I tend to use rule-based requirements documents, such that test case generation can initially be automated (in terms of being derived) from the specification language of the requirements document and then amended, as needed, via techniques such as orthogonal arrays, test clustering, equivalence partitioning, and so on.

    Overall, I am not sure if I am answering your last question because I am not sure if I am understanding you correctly in terms of the distinguishing elements you would be looking for.

    ------------------

  5. #5
    Guest

    Re: Effective Testing/Testing Effectively

    I wonder if it would be productive to have some posts on how to do these things? I'm particularly interested in fault/defect coverage. Everyone always asks for "coverage metrics" but sometimes it's hard to determine whether what you're providing is actually anything worth looking at. I'm also curious about the distinction between thorough coverage of requirements and comprehensive coverage of requirements. I'm not sure you could measure comprehensiveness directly against requirements. Not sure, though.

    ------------------

  6. #6
    Senior Member
    Join Date: Dec 1999
    Location: Chicago, Illinois, USA
    Posts: 2,537

    Re: Effective Testing/Testing Effectively

    Originally posted by AitenB:
    "I wonder if it would be productive to have some posts on how to do these things?"

    Actually I have found it less productive - at least when I do things like this. The thread languishes, stifles, and then pretty much dies in heart-rending agony. I was thinking it might be interesting, however, for some of us to do TUTORIAL threads. Each post could be titled like this: "TUTORIAL: {name}". This would facilitate searching. The {name}, of course, is replaced by whatever the topic is. So you might have: "TUTORIAL: Requirements Coverage" or something like that. This could be a sort of "how-to" post regarding how people actually do these activities.

    Quote:
    "I'm particularly interested in fault/defect coverage. Everyone always asks for "coverage metrics" but sometimes it's hard to determine whether what you're providing is actually anything worth looking at."

    Yep. That is always the biggest problem - and that is the case with any sort of metric, really, not just coverage metrics. If you cannot relate the coverage metric to cost, time, or the basis of effort, I find the metric much less useful.

    Quote:
    "I'm also curious about the distinction between thorough coverage of requirements and comprehensive coverage of requirements. I'm not sure you could measure comprehensiveness directly against requirements. Not sure, though."

    Usually comprehensiveness is assessed in relation to how well you have covered the areas of the application. If you have code coverage and you match it to requirements coverage (usually via the mediating technique of test case coverage), then you can readily apply comprehensiveness metrics. Without that bridge, however, you are correct: the metric would be largely useless and, actually, could be quite misleading.
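
    A very rough sketch of what that mediation looks like (the requirement, test case, and module names are all invented):

        # Requirements trace to test cases; test cases trace to the code areas they touch.
        req_to_tests = {"REQ-001": ["TC-01", "TC-02"], "REQ-002": ["TC-03"]}
        test_to_code = {
            "TC-01": {"billing.py"},
            "TC-02": {"billing.py", "tax.py"},
            "TC-03": {"reports.py"},
        }
        all_modules = {"billing.py", "tax.py", "reports.py", "export.py"}

        # Code areas actually reached while demonstrating the requirements.
        reached = set()
        for tests in req_to_tests.values():
            for tc in tests:
                reached |= test_to_code.get(tc, set())

        print(f"{len(reached) / len(all_modules):.0%} of code areas reached via requirements")  # 75%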

    ------------------

  7. #7
    Guest

    Re: Effective Testing/Testing Effectively

    My goodness! You have been defeatist of late, kiddo! Tell me you're not burning out!

    I guess what I'm asking is how would you relate that comprehensiveness? Somewhere else here I remember you talked about this but I can't find the post. Here's my situation: I've got reports that basically show how many failures we've found. That's great and all, but I'd like reports on how well we've actually shown the requirements and how well we've covered the code. So here's what we do: we graph it so that the x-axis is time and the y-axis is total test case failures. The graph covers a certain test phase cycle. We use the slope of the curve to show the trend of failure finding in this report. If the slope has a sharp rise, we say there are lots of failures in the code and that they're being found quickly. That's how we determine that the end of testing is a ways off yet. If the slope is leveling, obviously few failures are being found. So we breathe easier. I'm not sure we should be breathing easier, though. And I'm not sure how comprehensive we're being.
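
    Here's roughly what the report boils down to (a sketch; the numbers are invented):

        # Cumulative test case failures at the end of each week of the test phase.
        weeks    = [1, 2, 3, 4, 5, 6]
        failures = [12, 30, 44, 52, 55, 56]   # invented counts

        # Slope over the last couple of intervals: new failures per week.
        slope = (failures[-1] - failures[-3]) / (weeks[-1] - weeks[-3])
        print(f"{slope:.1f} new failures/week")  # 2.0 -- the curve is leveling off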

    Any thoughts?

    ------------------

  8. #8
    Senior Member
    Join Date: Dec 1999
    Location: Chicago, Illinois, USA
    Posts: 2,537

    Re: Effective Testing/Testing Effectively

    Originally posted by AitenB:
    "If the slope is leveling, obviously few failures are being found. So we breathe easier. I'm not sure we should be breathing easier, though. And I'm not sure how comprehensive we're being."

    Hmmmm. Well, a slope that is leveling off does require a little thought. I would not necessarily breathe easier. I would ask myself whether the leveling means there are actually few failures (so the code is truly high quality), or whether there are only apparently few failures. Does it, in fact, mean that failures are not being found (and perhaps the test cases are low in quality)? You cannot answer these questions just by looking at a report that shows only total failures over time. However, I think you could apply a metric to help. To wit:

    You need to show how well test cases cover what they are supposed to cover.

    Okay, so perhaps that is not the biggest revelation you have ever heard. But how do you do it? Well, there are like seven billion ways (slight exaggeration) you could use to do this. So consider this one: do a weighted summation for test comprehensiveness. This is similar to what I said in the post I think you are referencing, and it translates my sentence above into an actual operational metric:

    C = w_a f(a) + w_i f(i) + w_o f(o) + w_s f(s)

    C is a measure of comprehensiveness. You are going to have a graph where the x-axis is time (just like in your failures report) and the y-axis is C. The curve you get is thus a plot of C over time - i.e., a measure of comprehensiveness as testing goes on. Look at the equation for how to form the curve. It has four terms that represent action, input, output, and structure coverage. (Except for the coverage aspect, you will see that the first three elements match the notion of your average test case.) Each of the four terms has a weighting value and a ratio. (The w factors are the weighting values.) You can assign any values to the four weights as long as they sum to one. As just one example, suppose functional coverage is very important to the testing team. A large weight can then be assigned to action coverage (which is basically functional coverage) and smaller weights to the other terms.

    The ratio is important. In each case, the bottom of the ratio represents what should be demonstrated by test cases, while the top shows what has been successfully demonstrated by them. So here are your basic ratios:

    f(a) = passed actions/total actions
    f(i) = passed inputs/total inputs
    f(o) = passed outputs/total outputs
    f(s) = exercised paths/total paths

    The first three factors of C equate to specification-based testing, or what you might refer to as requirements coverage. The last term obviously equates to code coverage (and you could use things other than paths if you wanted). When all four terms are reported together, you have a way to tell whether the test cases are sufficient (meaning that they have covered what they should). That is, you have the value C.
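
    A minimal sketch of computing C at a single reporting point (all the counts and weights here are invented; plot C at each point in time to get the curve):

        # Weighted comprehensiveness C for one snapshot. Weights must sum to 1.0.
        weights = {"action": 0.4, "input": 0.2, "output": 0.2, "structure": 0.2}

        coverage = {
            "action":    (38, 50),    # (passed actions,  total actions)
            "input":     (90, 120),   # (passed inputs,   total inputs)
            "output":    (70, 80),    # (passed outputs,  total outputs)
            "structure": (410, 600),  # (exercised paths, total paths)
        }

        assert abs(sum(weights.values()) - 1.0) < 1e-9

        C = sum(weights[k] * passed / total for k, (passed, total) in coverage.items())
        print(f"C = {C:.2f}")  # 0.77 with these numbers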

    ------------------

  9. #9
    Guest

    Re: Effective Testing/Testing Effectively

    Actually, when I play around with that a little it seems to work pretty well. Okay, so we're saying that my flattening curves could mean that code had few failures or that test cases were not finding failures. And we're saying that if my C has high values for some measure of coverage, then I'm covering the right stuff. So it sounds like my report for C serves as the basis for my report for F (failures)? Does that make sense?

    ------------------

  10. #10
    Senior Member
    Join Date: Dec 1999
    Location: Chicago, Illinois, USA
    Posts: 2,537

    Re: Effective Testing/Testing Effectively

    Originally posted by AitenB:
    "Okay, so we're saying that my flattening curves could mean that code had few failures or that test cases were not finding failures."

    Well, I am not going to make any comments on your curves! However, that is one way to consider it - a possible breakdown I would at least entertain, hence my mentioning it. The trick, I think, is to really interpret the metrics and ask what the trend means overall, rather than just looking at the metric as a snapshot view.

    Quote:
    "And we're saying that if my C has high values for some measure of coverage, then I'm covering the right stuff. So it sounds like my report for C serves as the basis for my report for F (failures)? Does that make sense?"

    Yes. Figure it like this: if your graph of C shows high values for test case coverage (one of the factors), then test cases are covering what they are, in fact, supposed to. In that case, the leveling off of the curve in your failure report more than likely means that there are actually few failures in the code itself. (Granted, this is not an absolute. I might apply a confidence interval to this.) Somewhat on the flip side, but using the same logic, if your C graph shows low values, then the leveling off of your curve means test cases are covering the same things repeatedly - without finding failures.

    So, basically, your failure report will show you how many failures were found. The C graph then shows how well requirements were covered and how well code was covered. It is the combination of those two that gives you better insight into your test effort.
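
    As a toy illustration of reading the two together (the thresholds are invented judgment calls, not rules):

        # Combine the failure trend with the current C value.
        recent_failure_slope = 2.0   # new failures per week (curve is leveling)
        current_C = 0.77             # latest value from the C graph

        if recent_failure_slope < 5 and current_C > 0.7:
            print("Leveling likely reflects genuinely few remaining failures.")
        elif recent_failure_slope < 5:
            print("Leveling may just mean the tests keep covering the same ground.")
        else:
            print("Failures are still arriving quickly; the end of testing is a ways off.")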

    ------------------

 

 
