Estimated defects formula
Is there any formula that can estimate the number of defects in software?
Re: Estimated defects formula
Might be more than you're looking for, but check out:
Since these posts I've been trying to read up on this technique. Can't say I have any real practical experience with it. (Some links in those posts point to other papers that talk about defect formulas.)
Re: Estimated defects formula
Hidden-fault analysis, as mentioned in the post by AitenB, is certainly one way to do it, and a very effective way in some cases. The problem is that the method can easily be implemented incorrectly, or applied to systems about which too little is known as a whole to make a reliable prediction or estimation. In truth, of course, there are a lot of ways to do this, and each of them has its own advantages and disadvantages, particularly when considered in the context of a given organization's practices.
You can see one particular example of how people have done this by referencing the article Using Inspection Data for Defect Estimation by Stefan Biffl. (You do have to be a subscriber to IEEE Software to get the paper.) Basically, it tries to use inspection results to provide information on the likely number of defects. There is also a freely available paper called Estimating the Number of Defects after Inspection (direct PDF link) which, to my way of thinking, goes a little overboard, but does highlight some of the same themes. Yet other people use a variation on "reliability techniques" based on a Rayleigh function. If you want an example of this, check out the QSM Reliability Model. (Note that this link is a direct link to a PDF document.)
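To give a feel for the Rayleigh approach, here is a minimal sketch. It is not taken from the QSM paper itself; it just assumes the standard Rayleigh cumulative curve, where discovery rate peaks at some time `t_peak`, and all the numbers in the example are made up for illustration.

```python
import math

def rayleigh_defects(t, total_defects, t_peak):
    """Cumulative defects expected to be found by time t under a
    Rayleigh discovery curve whose rate peaks at t_peak.
    total_defects and t_peak are hypothetical project parameters."""
    # Rayleigh CDF scaled by the total: K * (1 - exp(-t^2 / (2 * t_peak^2)))
    return total_defects * (1 - math.exp(-t**2 / (2 * t_peak**2)))

# Example (invented figures): 500 predicted total defects,
# discovery rate peaking at month 6 of testing
for month in (3, 6, 12):
    print(month, round(rayleigh_defects(month, 500, 6)))
```

The useful property is that once you fit the curve to early defect-discovery data, the tail gives you an estimate of how many defects remain to be found.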
Other people like to use statistical methods to predict or estimate the number of defects with techniques like fault insertion or error seeding. Error seeding involves deliberately introducing defects into the program and then measuring how effective the testing is at discovering them. That gives you a relative idea of how effective your test suites are at finding defects, so you can at least attempt to predict how many unseeded defects are likely still present. Another technique along these same lines is Residual Defect Estimation, which is commonly used to explore the efficacy of the test cases for a given product. Again, the product is seeded with a number of "common" errors and the test cases are then run on that version of the product. The number of detected defects is then used as evidence of the efficacy of the tests, and hence of how many errors remain in the real product. (This is also a slight variation on the concept of mutation testing, which is another technique people use to estimate defects.)
However, a lot of the time all of this is much more than you need and provides relatively little return on investment for the amount of work it generates. There are also potential problems with some of these methods drastically underestimating or overestimating the number of defects. Sometimes you are better off simply keeping historical data on the defect counts for various areas of your application and using that as a rough basis for comparison. So while you asked for formulas, and I have provided some references for you to consider in that regard, realize that it sometimes pays not to reduce certain types of estimation to strictly formulaic approaches unless there is a compelling organizational reason to do so.
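Even the historical-data approach reduces to a one-line calculation: apply the defect density observed on past releases to the size of the new work. This is just a sketch of that idea; the figures are invented, not industry benchmarks.

```python
def predict_defects(historical_defects, historical_kloc, new_kloc):
    """Baseline estimate from historical defect density:
    (defects per KLOC observed on past releases) * size of new code.
    All inputs are hypothetical project figures."""
    density = historical_defects / historical_kloc
    return density * new_kloc

# Example (invented figures): past releases averaged 120 defects
# across 80 KLOC, i.e. 1.5 defects/KLOC; estimate a new 25 KLOC module
estimate = predict_defects(120, 80, 25)
```

It is crude, but it is cheap to maintain and grounded in your own organization's data rather than a borrowed model.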