[TOOLBOX] Six Sigma: Valid for Software?
There has been much talk about this and it is a good topic because Six Sigma is one of those gray areas for a lot of people. So if readers will permit me a somewhat lengthy post, I would like to expound upon some thoughts on this topic. I would encourage others to share their experiences and thoughts.
In order for any process capability to accurately be calculated in a Six Sigma sense, one must properly define and quantify the process defects, units, and opportunities. Every process should have agreed upon definitions for these terms. But, before we go into that, you also need to understand the needs of your customers. During the process of gathering customer information, you will translate those needs into issues, service level agreements, and specifications of various sorts. From these artifacts comes the customer CTQ (Critical To Quality). In Six Sigma this is a product or service characteristic that must be met to satisfy a customer specification or requirement or service level agreement. With that in mind, let us go on to define the terms I just mentioned.
Define Your Product/Service Defects
A defect, based on what I just said above about customer requirements, is probably going to be defined as any part of a product or service that: (1) does not meet customer specifications or requirements, or (2) causes customer dissatisfaction, or (3) does not fulfill the functional requirements. (In hardware you would also include the physical requirements.) It should be noted that the term "customer" refers to both internal and external customers or you can choose to make a distinction between customers and end-users or whatever you want.
Define Your Product/Service Units
A unit is something that can be quantified by a customer. It is a measurable and observable output of your business process. It may manifest itself as a physical unit or, if dealing with a service, it may have specific start and stop points. (When I worked at a company called FullAudio, for example, it was the duration of a subscription. At the Chicago Board of Trade, it was the length of the trading day and the length of the clearing day. It would also apply to specific things, like options and futures, such as corn futures or soybean oil options. These all had durations and attributes in the software.)
Define Your Product/Service Opportunities
Simply stated, opportunities are the total number of chances per unit to have a defect. In Six Sigma in the "manufacturing sense", each opportunity must generally be independent of other opportunities and, like a unit, must be measurable and observable. The final requirement of an opportunity is that it directly relates to the customer CTQ. The total count of opportunities indicates (in one sense) the complexity of a product or service. Note, however, that in the software world some opportunities are not independent of others, hence cascade failures. However, you can break things up into failure and fault (instead of just the term "defect" or "bug") and then apply the same Six Sigma approach in that fashion.
A Simple Example
Let us say your manager comes rampaging over to your desk and, with froth flying from his mouth, bellows: "Our documentation is horrible! There are so many errors that customers do the wrong thing because we told them to. This has to change and has to do so yesterday! Get on it." So here our CTQ is documentation quality. The measure (relative to this CTQ) is number of mistakes. The specification (relative to this CTQ) is zero mistakes. A defect, in this context, would thus be any mistake: i.e., any place where the documentation text describes something that does not exist in the application or describes a faulty way of working with the application. The unit here could be taken as a paragraph. The opportunities are all of the paragraphs in your documentation.
So let us say that you set about testing the documentation and you measure two defects. That means you have found two instances in your documentation where the descriptions of how to use the application are at variance with how the application actually works. As for units (the number of paragraphs), we might have 100,000. (Hefty documentation; but I can assure you that the Board of Trade documentation was actually much higher than this.) The opportunities for problems were one per paragraph. (I am simplifying here. It is possible that a paragraph could contain numerous errors and if that is possible, one has to account for that.) So in this case our sigma is 5.6. (Or 5.607 if you want to be exact.) That assumes a sigma shift of 1.5.
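If you want to sanity-check that 5.607 figure yourself, here is a quick sketch in Python using only the standard library. The function name is mine, and the 1.5 shift is just the conventional assumption I mentioned (more on that below):

```python
from statistics import NormalDist

def process_sigma(defects, opportunities, shift=1.5):
    """Short-term process sigma for a given defect count and opportunity count."""
    dpmo = defects / opportunities * 1_000_000
    # Convert the defect rate into a z-score, then add the assumed sigma shift.
    return NormalDist().inv_cdf(1 - dpmo / 1_000_000) + shift

# 2 defects across 100,000 paragraphs at one opportunity per paragraph:
sigma = process_sigma(2, 100_000)
print(round(sigma, 2))  # 5.61, i.e. the 5.607 in the text
```

Change the `shift` argument and you will see why the shift assumption matters so much to the final number.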
But, Jeff, what the heck does that mean?!?
In Six Sigma the "shift" refers to the amount by which processes or products will "shift" (move) from the desired average. The shift refers to the process drift or product drift that will occur relative to your measures. I could give more details but I am not sure how interested people are plus I still have to make sure I am making sense here. However, keep in mind that with Six Sigma there is a sigma long-term value and a process sigma short-term value. I am referring to the latter here.
Okay, sounds wonderful, but what are we actually measuring here?
Keep in mind that we are looking at a process measure here. In Six Sigma-speak that refers to a measure related to individual steps as well as to the total process. If it helps, think of it as a sort of predictor of final output measures. (Actually, looking at it that way never helps me. But I mention it for those of you who are purists and have gone through Six Sigma training.) The idea is looking at process capability, which is a determination of whether a process, under normal variation, is capable of meeting customer requirements. It is a measure of the degree to which a process is (or is not) meeting customer requirements compared to the distribution of the process by which those requirements are being satisfied. (Read that again and think about what it means.)

Note that I say "normal variation" here. Six Sigma is there to help you calculate normal variation and then calculate your standard deviation (variation) in terms of monitoring the process. In strict Six Sigma what this means is that you have a statistical concept indicating that a process is operating within an expected range of variation and is being influenced mainly by what are often called "common cause" factors. Variation in this context is described as change or fluctuation of a specific measurable characteristic, which determines how stable or predictable the process may be. (The more predictable a process is, the more confidence you will tend to have in it.) This variation may come about because of environmental concerns, people-related concerns, equipment or lack thereof, methods/processes in place or the extent of their use, measurements being taken, etc. Process improvement should reduce variation to acceptable levels (confidence levels) and Six Sigma is a way of helping you determine if that is happening and, if so, to what extent.
Ummm, yeah. But a shift of what?!?
The "shift" is a very important concept in Six Sigma and I think it is, in fact, one element that makes it suited to non-unique environments like software. Mind you, this is a charge that is often leveled against Six Sigma's use in software. So why do I feel this argues for its use?
The "shift" in the process comes from the statistical sense that the process has had an assignable cause of some sort forcing the process to deviate from either its predicated or hoped for average or from its historical mean by one of several recognizable trend patterns that Six Sigma tends to advocate. Even if you do not adhere to the trend pattens, the point is that all of these are a shift from a stable process. The shift that you hear so much about is the natural movement in the process center from its target (goal). This natural movement is not due to any one particular systematic cause(s), but generally due to combinations of small (sometimes contextually random) causes over time that can vary not only from company to company but from project to project and even from process to process within one project. As many have suggested any sort of process will tend to vary or shift from its target over a long period of time by a certain amount. What Six Sigma does is allow you to get a little more granular and try to nail down specifics relative to a specific project so that you can implement corrective actions.
Re: [TOOLBOX] Six Sigma: Valid for Software?
First, I do want to make clear to those who may not be familiar with Six Sigma that it is NOT only a statistical concept, but a complete structured process itself. As I learned it, it has phases of "Design, Measure, Analyze, Improve, and Control". Unfortunately, the conversation (including the rest of this posting) seems to always center on the numbers.
As I have mentioned elsewhere, I believe that there is a place in software development for statistical analysis, but do not believe that following a precise Six Sigma definition is appropriate.
The problems start at the beginning -- what is an "opportunity". In the example noted above, if we chose "pages" instead of lines, our analysis comes out "worse"; if we use "characters", the stats come out more in our favor. There are no tangible widgets in software, so any choice of a definition of an opportunity is arbitrary.
Likewise, what is a defect? With the widgets, there is an absolute, physical measurement that can be made: a widget is definitely inside or outside a spec and acceptable variation. Is a typo less of a variation than flawed logic? Again, there is a sense of being arbitrary.
Also, if we do speak of "bugs", is it not generally accepted that it is impossible to find all of the bugs in any piece of software? If so, are we beginning to speak of probabilities of probabilities? If we are a Six Sigma process, are we saying that we have 3 defects per million opportunities or are we saying that we found the one defect that infers that we probably have three?
Finally, Jeff, please clarify what you mean by "non-unique environments like software". I think I agree with your point about shifts, I just want to make sure.
Re: [TOOLBOX] Six Sigma: Valid for Software?
<font size="2" face="Verdana, Arial, Helvetica">Six Sigma is often called a methodology, one that uses data and statistical analysis to measure and improve an operational performance (for a project, product, or process) by identifying and eliminating defects. I, and many practitioners, prefer to consider it as a framework. In fact, it is a three part framework: metric, methodology, and philosophy. The philosophy is the reduction of variation in your business processes and the adherence to customer-focused, data-driven decisions. A methodology for this is often given as DMAIC (Define, Measure, Analyze, Improve, Control). The metric is often given as DPMO (Defects Per Million Opportunities). However, keep in mind we can define opportunities contextually.
Originally posted by lemboja: First, I do want to make clear to those who may not be familiar with Six Sigma that it is NOT only a statistical concept, but a complete structured process itself.
<font size="2" face="Verdana, Arial, Helvetica">The process controls in Six Sigma are predicated upon measuring the process. Measurements are predicated upon numbers. Practitioners and advocates will tell you that the variation is measured from the mean in Six Sigma. That requires numbers. More specifically, the fundamental and core objective of the Six Sigma methodology, as it is often given, is the implementation of a measurement-based strategy that focuses on process improvement and variation reduction through the application of two concepts: DMAIC (Define, Measure, Analyze, Improve, Control) and DMADV (Define, Measure, Analyze, Design, Verify). Note that these two processes are used at very different times and for different reasons. As I learned it, it has phases of "Design, Measure, Analyze, Improve, and Control". Unfortunately, the conversation (including the rest of this posting) seems to always center on the numbers.
<font size="2" face="Verdana, Arial, Helvetica">You would have to define for me what you mean by a "precise" Six Sigma definition. If you mean applying the exact same models used in manufacturing to software, I would agree. The same arguments were used when Six Sigma moved from manufacturing to transactional services. People said it could not be done. But it was done because people took the core framework and utilized it in the new context. Now it is being applied to software and you have a whole new group of people saying it cannot be done. Perhaps they are right; but I have yet to see proof of that. I would also have to know more about the basis upon which you are making this decision or why you have this belief. You say that you "do not believe that following a precise Six Sigma definition is appropriate." It sounds like you have tried it and found it to not work. If so, I would be very interested in discussing the environments you were in and comparing them with my own experiences. As I have mentioned elsewhere, I believe that there is a place in software development for statisitcal analysis, but do not believe that following a precise Six Sigma definition is appropriate.
<font size="2" face="Verdana, Arial, Helvetica">Of course. Like any metric that can be the case. But one character, in the example I gave, is probably not going to be enough to convey a whole thought that is incorrect regarding what the application does. A page would be too high-level because multiple defects could be on a page. A paragraph would be required to describe things at that level. You could even lower that to a sentence, if you wanted. You could, however, choose a "character" or "word" if you were looking for spelling mistakes. But that is a different measure and that means the number of opportunities goes way up because there are many more opportunities to mistype a word in documentation than there are to convey an incorrect statement about the workings of the application. That is why I was careful to say that people have to agree upon the definitions of the core Six Sigma terms relative to their project. This is something that Six Sigma, as a general framework, advocates. The problems start at the beginning -- what is an "opportunity". In the example noted above, if we chose "pages" instead of lines, our analysis comes out "worse"; if we use "characters", the stats come out more in our favor.
<font size="2" face="Verdana, Arial, Helvetica">Common controls. Internal re-usable code modules. Third-party compiled code. Licensed graphics engines. Firmware. These are all examples of what one could consider "widgets" in software. The definition of opportunity is not arbitrary, necessarily. It is contextual. As I said, opportunities are the total number of chances per unit to have a defect. Now I agree that it can be hard to measure this in terms of software. Because you might say: well, how many defects is it possible for a given module to have? That is where defect prediction models might come in handy or that is where you use historical data that you have at hand. Or that is where you use the very common technique of software defect forecasting. But you could also break things down by function points or by use case points or class points or object points or any of the other "counting" methods and apply things that way. Each of those could be your unit. There are no tangible widgets in software, so any choice of a definition of an opportunity is arbitrary.
<font size="2" face="Verdana, Arial, Helvetica">I defined a defect relative to my example. A defect is any statement (sentence or paragraph) that conveys incorrect information about the application. That is an absolute "measurement" because either the statement in the documentation is correct (i.e., reflects reality) or it is not correct. Here you have very little "acceptable variation" unless you start to consider things like wording that is confusing but not necessarily inaccurate. That is where a confidence level comes in. And that is where applying a strict manufacturing mindset to the problem would do you more harm than good. Likewise, what is a defect? With the widgets, there is an absolute, physical measurement that can be made: a widget is definitely inside or outside a spec and acceptable variation.
<font size="2" face="Verdana, Arial, Helvetica">There is a sense, rather, of being contextual. A "typo" and a "flawed logic" are talking to two different units: documentation and code. I made it clear that you have to make sure your measures are relative to the appropriate unit. And there is nothing wrong with that sort of contextual measuring. Good practitioners in the test industry know how to do problem-focused process development. That means you can tailor processes to suit your needs based on the problems you are dealing with in a given context. But in this case you are not even really doing that. You are simply doing what Six Sigma advocates: defining units, defects, and opportunities. Is a typo less of a variation than flawed logic? Again, there is a sense of being arbitrary.
<font size="2" face="Verdana, Arial, Helvetica">I am not sure how the one thought follows from the other. Yes, it is generally considered impossible to know if you have found all of the bugs in a given piece of software. Also, if we do speak of "bugs", is it not generally accepted that it is impossible to find all of the bugs in any piece of software? If so, are we beginning to speak of probabilities of probabilities?
<font size="2" face="Verdana, Arial, Helvetica">What do you mean if "we are a Six Sigma process"? That wording is a little odd to me. The general process yield is calculated by subtracting the total number of defects from the total number of opportunities, dividing by the total number of opportunities, and finally multiplying the result by 100. What you are talking about is a measure that is contextual based on how you have defined your terms. Quoting from my article: If we are a Six Sigma process, are we saying that we have 3 defects per million opportunities or are we saying that we found the one defect that infers that we probably have three?
<font size="2" face="Verdana, Arial, Helvetica">And this sort of goes to what you said at the beginning here. Six Sigma is more than just numbers: it is more about process capability. But that capability is measured in terms of various numbers (measures). Remember that Six Sigma, as a general framework, is there to reduce process output variation so that plus-or-minus six standard deviations lie between the mean and the nearest specification limit. What this means is that the process will allow no more than 3.4 defects per million opportunities.
When an automobile is described as "Six Sigma", this does not mean that only 3.4 automobiles out of one million will be defective. Six Sigma means that within a single automobile, the average opportunity for a defect of a critical-to-quality characteristic is only 3.4 defects per million opportunities for such a defect to occur. The more complex a product is, the greater the likelihood that a defect will exist somewhere within the product. So while a complex automobile may have more defects per unit than, say, a typical audio speaker, at the "opportunity" level, the audio speaker and the automobile can easily have the same sigma capability. So rather than stating that a product is Six Sigma, we say that the average opportunity for non-conformance within a product is Six Sigma. It is a measure not of the product, but of the performance of the product.
Remember not to confuse a specification limit with a control limit. Specification limits describe what you want the process to produce. You pick these based on your customer needs. You can call that arbitrary if you want; I call it contextual. Control limits describe what the process is actually producing. These are elements that are a function of the process itself. They are what you observe, not what you set or determine.
But note that you can use any defect measure you want. Six Sigma for manufacturing is done with DPMO. But that does not mean you have to do it that way. But also remember that DPMO just indicates how many defects would (probably) arise if there were one million opportunities. If that is too many for you, make it a thousand opportunities. DPMO is just one defect measure; you are free to use your own with Six Sigma. The general formula is:
Number of Defects / (Number of Units * Number of Opportunities)
You can then measure DPO (Defects Per Opportunity) via whatever measure you would like in terms of numbers.
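To make the arithmetic concrete, here is a minimal sketch of that general formula in Python, using the documentation example's counts (the function name and the per-million/per-thousand scaling choices are mine):

```python
def defects_per_opportunity(defects, units, opportunities_per_unit):
    # DPO: Number of Defects / (Number of Units * Number of Opportunities)
    return defects / (units * opportunities_per_unit)

# 2 defects, 100,000 paragraphs, 1 opportunity per paragraph:
dpo = defects_per_opportunity(2, 100_000, 1)
dpmo = dpo * 1_000_000   # scale to defects per million opportunities...
dpto = dpo * 1_000       # ...or per thousand, if a million feels too abstract
print(round(dpmo), round(dpto, 2))  # 20 0.02
```

The point of the sketch is that the scaling factor is your choice; the underlying DPO is the same either way.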
<font size="2" face="Verdana, Arial, Helvetica">I was actually referring to your point from the other thread. You had mentioned that you tended "to agree with Bender that software development is too 'unique'" and you had indicated that "the problem, the developers, the tools, the environments, etc. all change too frequently to make many comparisons valid." So I was speaking to that general point. Software has a lot of non-unique components I would maintain when you look at it from the standpoint of units to be tested or considered, processes to be implemented, measures to be taken, and variations to be analyzed. While elements within an organization will vary and will certainly be unique, much of the Six Sigma framework can be applied in a very similar fashion. (The distinction in Six Sigma is between "common causes of variation" and "special causes of variation".) One could make the same argument for other processes that exist as well. I think part of the problem is that everyone thinks software development is so unique and yet studies like design patterns and software engineering practices suggest that there are a lot of common grounding elements to the discipline that make it amenable to something like Six Sigma. Finally, Jeff, please clarify what you mean by "non-unique environments like software".
Another way to say it, perhaps, is that Six Sigma advocates a framework for the application of both strategy and tactics. The strategy is the plan behind the whole process improvement effort. Tactics refers to the methods for individual improvements at a granular level. While the specifics of the strategy and the tactics are crucial to the outcome, it is the underlying rules that are the key to success. Strategy and tactics must be designed to take advantage of the wisdom embodied in the rules. No single rule applies in every situation; rather the rules offer guidelines for finding solutions in whatever circumstance one finds oneself in. That is the unique aspect. A good practitioner evaluates every new situation and then chooses the rule or rules that must be employed to achieve success/process improvement. That is, to me, the non-unique aspect. A good framework provides the means by which one can choose certain rules that can then be utilized as part of the strategy and tactics. That is what Six Sigma at least potentially offers in my view.
Re: [TOOLBOX] Six Sigma: Valid for Software?
Just a quick response for now.
I am uncomfortable with the idea of using defect prediction models and the like as the source of data for the measures. When the general formula is: defects/(units*opportunities) and we use a prediction model to count the defects and a complexity model to count the opportunities, I'm not sure if we're really saying anything reliable.
Also, lest I inadvertently misrepresent myself, I am a recovering developer struggling in the metamorphosis to a QA person. My QA experience is more academic than practical at this point. I appreciate everyone's patience and willingness to educate and debate.
Re: [TOOLBOX] Six Sigma: Valid for Software?
<font size="2" face="Verdana, Arial, Helvetica">Then you certainly do not have to. You can instead use empirical data you derive or you can use historical data or you can use both in tandem.
Originally posted by lemboja: I am uncomfortable with the idea of using defect prediction models and the like as the source of data for the measures.
<font size="2" face="Verdana, Arial, Helvetica">Okay, but why do you feel that way? It is hard to answer to something if am uncertain of the motivation behind the thought. With that said, however, perhaps you will permit me another somewhat long post. Let me see if I can clarify my thinking more and perhaps answer your concerns, at least to some extent. When the general formula is: defects/(units*opportunities) and we use a prediction model to count the defects and a complexity model to count the opportunities, I'm not sure if we're really saying anything reliable.
= = = = = = = = =
Let us say that an Internet Service Provider measures their performance in uptime of available servers. Every minute of potential uptime ("the service is available") is an opportunity, every minute of downtime ("the service is not available") is a defect in the eyes of a customer. Data is continuously taken (i.e., the server farm is constantly monitored), the process capability (to maintain the server farm) is measured, and the yield is calculated to be 99.9%. The Internet Service Provider is satisfied with their current performance and the customer's needs are being met. (Note here that while one could consider this a "hardware" example, one could equally argue for a "software" example and one could certainly argue it as a service-based example.) But now how do we apply this to Six Sigma? Well, let us ask a couple of questions.
What is an opportunity here?
An opportunity is the lowest defect noticeable by a customer. An opportunity could be defined as sixty seconds of uptime. Say that this was determined to be the lowest (shortest) time period that was noticeable by a customer. Or one might argue instead for twenty seconds. Or even three seconds - about the perceptual limit with most activities on a Web-based platform given other technology limitations, such as perceptions of browser activity.
What is a defect here?
A defect is defined by the customer as sixty seconds (or twenty seconds or three seconds) of downtime. An additional defect would be noticed for every time unit that elapsed where the customer did not have access to the Internet Service Provider's services (i.e., the Web servers).
Now that you have clear definitions of what an opportunity and a defect are in this context, you can measure them. Sometimes you may need to set up a highly formalized data collection plan and organize the process of data collection. Other times the means of collecting data are either automatic or easily demonstrable. Either way, you need to gather reliable and statistically valid data. So the Internet Service Provider may say that the opportunities during the year were 525,600.88 minutes. Or you could measure it as 31,535,992 seconds in a year. In what follows, I will measure via seconds. Let us say that the defects noticed (remember: per time unit) were 172,799.61 seconds. (That would be about the equivalent of two days. So in a year's time the downtime was equivalent to two full days, even though we are not saying it was actually two full days in a contiguous fashion.)
The process yield is calculated by subtracting the total number of defects from the total number of opportunities, dividing by the total number of opportunities, and finally multiplying the result by 100. So you have:
[(31,535,992 - 172,799.61) / 31,535,992] * 100
That is 99.45%. So your sigma is between 4.00 and 4.10. In terms of defects per million opportunities (if you chose to measure that way), this would mean your defects, if you had a million opportunities, would be between 6210 (for sigma 4.00) and 4660 (for sigma 4.10). Your actual values would be a DPMO of 5479, the percentage of defects would be 0.55%, and your actual process sigma would be 4.04. Note, also, that I have assumed here a standard sigma shift of 1.5. Obviously if you use a different shift your results will be different. That brings up a question.
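Running the Internet Service Provider numbers through the same arithmetic is straightforward in Python (standard library only; the 1.5 shift is assumed, as in the text, and the variable names are mine):

```python
from statistics import NormalDist

uptime_opportunities = 31_535_992   # seconds of potential uptime in the year
downtime_defects = 172_799.61       # seconds of downtime (about two days' worth)

# Yield: (opportunities - defects) / opportunities, expressed as a percentage.
yield_pct = (uptime_opportunities - downtime_defects) / uptime_opportunities * 100
dpmo = downtime_defects / uptime_opportunities * 1_000_000
# Short-term process sigma, assuming the conventional 1.5 shift.
sigma = NormalDist().inv_cdf(1 - dpmo / 1_000_000) + 1.5

print(round(yield_pct, 2), round(dpmo), round(sigma, 2))  # 99.45 5479 4.04
```

The three printed values are the yield, DPMO, and process sigma quoted above.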
Why might you use a different shift?
If you are in a highly variable environment, you can increase the shift factor. If your environment is less variable, you can decrease the shift. Every process varies and drifts over time. In true Six Sigma parlance this is referred to as Long-Term Dynamic Mean Variation. Sounds impressive, and it is actually simpler than it sounds, but for right now understand that Six Sigma often says that this variation typically falls between 1.4 and 1.6. Is that true in all cases? Not really. After a process has been improved to some extent using the Six Sigma DMAIC methodology, you can calculate the process standard deviation and sigma value. These are considered to be short-term values because the data only contains "common cause variation", the idea here being that DMAIC projects and the associated collection of process data occur over a period of months, rather than years. Long-term data, on the other hand, contains common cause variation and special (or assignable) cause variation. Because short-term data does not contain this special cause variation, it will typically show a higher process capability than the long-term data. The ostensible difference between the two is the 1.5 sigma shift. That is what that means in terms of Six Sigma. But you can also figure out what this is for your own organization. If you are given adequate process data, you can determine the factor most appropriate for your particular process.
Just keep in mind that the reporting convention of Six Sigma requires the process capability to be reported in short-term sigma, without the presence of special cause variation. Long-term sigma is determined by subtracting 1.5 sigma from the short-term sigma calculation to account for the process shift that is known to occur over time. That applies in manufacturing, transactional, services, and software based settings.
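Incidentally, the 1.5 shift convention is exactly what produces the famous 3.4 defects per million: subtract the shift from a short-term sigma of 6 and look at the tail probability that remains. A quick check in Python (the function name is mine):

```python
from statistics import NormalDist

def long_term_dpmo(short_term_sigma, shift=1.5):
    # Long-term sigma = short-term sigma minus the assumed shift;
    # DPMO is the normal tail probability beyond that z-value, per million.
    long_term_sigma = short_term_sigma - shift
    return (1 - NormalDist().cdf(long_term_sigma)) * 1_000_000

print(round(long_term_dpmo(6), 1))  # 3.4
```

Without the shift, "six sigma" would correspond to roughly 0.002 defects per million, which is why the reporting convention matters.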
Say I have a product going to market and it allows people to search for other people selling cars within a certain radius from their zip code. Anytime the customer is given information that is outside of that zip code, it is invalid and is considered a defect. An opportunity is each instance of a returned entry. Or think of a person using a new Web site that allows them to search for apartments. They can search for certain amenities. Any search that comes up with a place that does not contain those amenities is a defect. (If I only want furnished apartments, but I get a bunch of unfurnished apartments, that makes my search more frustrating.) Each amenity return is an opportunity; the defect is the invalid search return. Each one of these is one more defect. So if one search query turns up six cars outside my zip code or four apartments without my specified amenities, I have six or four defects. (Remember: defects are what the customer considers to be a defect. Each opportunity for a defect to occur is what we are concerned with.) So you measure:
Opportunities: 3,000 searches for apartments each month
Defects: 382 known examples of listings not matching search criteria
So via our previous calculation:
[(3000 - 382) / 3000] * 100 = 87.3%
Your sigma is between 2.60 and 2.70, or 115,000 and 135,000 potential defects per million opportunities. In actuality, your process sigma is 2.64 and your DPMO would be 127,333. But remember that DPMO is just one figure for defect measure. The calculation is:
Defects Per Million Opportunities (DPMO) = [(Total Defects) / (Total Opportunities)] * 1000000
If you want, instead do:
Defects Per Thousand Opportunities (DPTO) = [(Total Defects) / (Total Opportunities)] * 1000
In our case that means our above 127,333 would be 127.3. That means that for every one thousand opportunities for a defect to occur, a defect occurs just over 127 times.
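The apartment-search numbers work out the same way. A short Python sketch (standard library only; the 1.5 shift is assumed, and the variable names are mine):

```python
from statistics import NormalDist

searches = 3_000     # opportunities: searches for apartments each month
bad_results = 382    # defects: listings not matching the search criteria

dpmo = bad_results / searches * 1_000_000  # Defects Per Million Opportunities
dpto = bad_results / searches * 1_000      # Defects Per Thousand Opportunities
# Short-term process sigma under the conventional 1.5 shift.
sigma = NormalDist().inv_cdf(1 - bad_results / searches) + 1.5

print(round(dpmo), round(dpto, 1), round(sigma, 2))  # 127333 127.3 2.64
```

Note that DPMO and DPTO come from the same defect rate; only the scaling differs.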
Keep in mind that the difference here in software is that usually a "defect" was a specific "bug". In other words, it was not related to opportunities but rather treated as a static measure. With the apartment search example, the traditional approach would be to relate the incorrect query results back to some fault in the code base that allowed them to occur, rather than relating a defect to each possible opportunity for occurrence. That is the shift that Six Sigma wants you to make in your thinking.
Also note: what is described here is just a way to determine your current performance and process capability. That is one component of Six Sigma. Another is to measure variation and deviation from that current performance and process capability. Yet another component is the strategic improvement and business transformation elements of Six Sigma. However, QA practitioners and testers rarely have the level of input required for the latter two, though they might for the first.
No problems on this from my point of view. Please understand you are greatly helping me here in terms of getting my own thoughts on this in order. Also, lest I inadvertently misrepresent myself, I am a recovering developer struggling in the metamorphosis to a QA person. My QA experience is more academic than practical at this point. I appreciate everyone's patience and willingness to educate and debate.
Re: [TOOLBOX] Six Sigma: Valid for Software?
Right - and that is contextual. The choice I make will depend on the measure that I want. You had originally asked: what is an opportunity? Then you had said:
Originally posted by lemboja: By using “arbitrary”, I am trying to state that there are no fixed rules on selecting the metric, but the outcome depends on your choice. Again, if you chose “pages” vs. “characters” in your earlier example, there are differences in the results.
Remember that "opportunities" are the total number of chances per unit to have a defect. An opportunity must directly relate to a given customer CTQ (or general requirement). I would choose "words" if my measure was for misspellings because a whole word is something that can be misspelled, whereas a "character" is a single unit. (Even the misuse of a single character in a sentence is best considered at the sentence level because it is only then that the error becomes obvious. Example: "A am happy with this" should be "I am happy with this." But that single character error, "A" for "I", is only meaningful in the context of the sentence as a whole.) Likewise, I would not have chosen "page" as my unit because my goal was to look for incorrect statements about the application in the user documentation. As such, the most granular level I would want is complete sentences or, at most, whole paragraphs. (Not pages, because multiple paragraphs are on a page and they can be wrong in different ways, and I want to maximize the number of opportunities that I consider when I am looking for when things can go wrong.)
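As a quick sketch of the point (in Python, with counts invented purely for illustration), the same twelve misspellings produce very different DPMO figures depending on whether you count words or characters as the opportunity:

```python
def dpmo(defects, opportunities):
    """Defects per million opportunities."""
    return defects / opportunities * 1_000_000

# Hypothetical document: 12 misspelled words, 2,400 words, 13,000 characters.
misspellings = 12
by_words = dpmo(misspellings, 2_400)        # each word is an opportunity
by_characters = dpmo(misspellings, 13_000)  # each character is an opportunity

print(by_words, by_characters)
```

With words as the opportunity, the DPMO is 5,000; with characters, it drops to roughly 923. Same defects, "better" stats - which is exactly why the choice of opportunity has to be tied to the customer CTQ rather than picked to flatter the numbers.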
In the example noted above, if we chose "pages" instead of lines, our analysis comes out "worse"; if we use "characters", the stats come out more in our favor.
Originally posted by lemboja: I am no statistician, but do understand that a random sample of widgets can be selected to provide a statistically appropriate and predictive model of the remaining widgets.

Well, you want to obtain a sample that exhibits characteristics similar to those possessed by the population from which it came, the population about which you wish to make inferences and thus predictions. One way to satisfy this requirement is to select the sample in such a way that every different sample of size n has an equal probability of being selected. This is random sampling. Since statistics vary from sample to sample, any inferences based on them will necessarily be subject to some uncertainty. How, then, do you judge the reliability of a sample statistic as a tool in making an inference about the corresponding population parameter? Well, the uncertainty of a statistic generally has characteristic properties that are known to us and that are reflected in its sampling distribution. Knowledge of the sampling distribution of a particular statistic provides us with information about its performance over the long run. That is exactly what Six Sigma is offering as well. (The key here is representativeness; a random sample by itself can be useless for predictive purposes, such as if it contains outliers or runs.)
Originally posted by lemboja: I do not believe that the same can be said for software bugs: they do not follow a standard distribution curve in their discovery.

But they do not necessarily have to follow this curve in their discovery. First off, many would disagree with you about the standard distribution curve for defect discovery in software, such as those who look at Rayleigh curves, for example; or you can look at the various defect prediction models that are out there. (That said, I think many are severely lacking.) Secondly, with a random sample, you want a sub-collection of a population such that every population member is equally likely to be represented. You can do that with defects if you consider occurrences or defect types, such as data errors. You can also consider cumulative error/defect distribution curves, but that is more getting into reliability models and that would take us too far afield of our discussion.
Originally posted by lemboja: I have been thinking in terms of the software development process, whereas your examples (so far) have been studying the functionality of a software product.

But I could consider both. The functionality of the product is based upon how it is designed, and how it is designed is part of the process.
Originally posted by lemboja: So…let’s say I’m the software development manager for a small (6-10 developers) custom software house, with a 50% annual turnover rate. How can Six Sigma help me deliver a higher quality product?

Well, that is sort of like asking "how can testing help me make a better product?" Answer: it cannot. What it can do is point out deficiencies in a more or less opportune fashion and perhaps suggest areas that need improvement. Likewise with Six Sigma. The basis of it is Statistical Process Control. This involves the measurement and evaluation of variation in a process and the efforts made to limit or control such variation. The idea here is to control the performance of a process, and this is done by identifying possible problems or unusual incidents so that action can be taken to resolve them. "Control" here means keeping a process operating within a predictable range of variation. Here the emphasis is more on "statistical control," and this means you have to measure a process over time and then examine the variation in the data that you have gathered. With enough data you can calculate what are called control limits. (Control limits, in fact, are calculated from actual process data. They can change as the process changes over time. Specification limits, on the other hand, come from the customer. They can change, but only as the customer's requirements change.)
A process's behavior is only predictable if the process is stable or "under control". Statistical methods can help us evaluate whether an underlying process is under control. We can calculate upper control limits and lower control limits. If a process stays within limits and does not exhibit other indications of lack of control, we assume that it is a controlled process (within the tolerance bounds/limits we established). This implies that we can use its past performance to predict its future performance within these limits and can determine its capability relative to some sort of customer specification.
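As a sketch of how control limits fall out of actual process data, here is an individuals (XmR) chart calculation in Python. The weekly defect counts are invented for illustration; the 2.66 constant is the standard one for deriving three-sigma individuals limits from the average moving range:

```python
def xmr_limits(values):
    """Individuals-chart control limits from the average moving range."""
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 = 3 / d2, where d2 = 1.128 for subgroups of size two.
    ucl = center + 2.66 * mr_bar
    lcl = max(0.0, center - 2.66 * mr_bar)  # a defect count cannot go below zero
    return lcl, center, ucl

# Hypothetical defects found per weekly build:
counts = [14, 11, 16, 12, 18, 13, 15, 12]
lcl, center, ucl = xmr_limits(counts)
```

Points falling outside these limits (or showing runs within them) are the "possible problems or unusual incidents" that warrant investigation before you can trust the process to predict its own future performance.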
So you said you have a fifty percent annual turnover rate. However, that by itself tells us nothing. We have no idea of what impact that has on your process. So what Six Sigma can do is help you set up control limits and then look at the variation you have in your process at various points. Those "various points" are when you take measurements. But, again, the example scenario you give is sort of like presenting that same scenario to someone and instead of asking "How can Six Sigma help me deliver a higher quality product?", you might ask: "How can testing help me deliver a higher quality product?" Without more consideration of the context, the question is irrelevant or, at least, poorly formed.
You have to define the process you want to consider. Do I want to consider the process by which we track bugs? Do I want to consider how effective we are at meeting deadlines? Do I want to consider how effective we are at meeting our profit goals? Do I want to consider how effective our processes are for gathering and eliciting customer requirements? Do I want to consider the process by which we produce builds? The goal, remember, is to look for variation in your processes. Your goal is also to consider the impact of that variation on your processes. Can methods other than Six Sigma do this? Certainly and they can do it very well. My point here is that Six Sigma is just another such method and it can be utilized in software. And I base that not only on theoretical considerations of Six Sigma but also empirical observations based upon practical experience that I have had with implementing aspects of Six Sigma in a software-based organization.
Originally posted by lemboja: There is a place in the software development process for statistical analysis, but attempting to define and measure the data necessary to reach a definitive six sigma level is impractical.

Why is it impractical though? I only have your general statement and I have no data points on which you are basing it. Therefore it is hard for me to respond. As I had mentioned, I have used Six Sigma in software organizations and we could reach "Six Sigma levels" based upon how we agreed to define units, defects, and opportunities. The only way I can think to proceed with this statement is for you to answer your own declaration operationally:
(1) Why is defining the data necessary to reach a Six Sigma level impractical?
(2) Why is measuring the data necessary to reach a definitive Six Sigma level impractical?
Re: [TOOLBOX] Six Sigma: Valid for Software?
(Sorry, had actual work to do for a while...)
Let's close a few points:
1) We probably need another thread to discuss "arbitrary" vs. "contextual". I am among those who believe that everything is contextual (or at least should be), so it is a given to me that one needs to understand one’s environment and adapt accordingly. By using “arbitrary”, I am trying to state that there are no fixed rules on selecting the metric, but the outcome depends on your choice. Again, if you chose “pages” vs. “characters” in your earlier example, there are differences in the results.
2) Probabilities of probabilities. I am no statistician, but do understand that a random sample of widgets can be selected to provide a statistically appropriate and predictive model of the remaining widgets. I do not believe that the same can be said for software bugs: they do not follow a standard distribution curve in their discovery. If you can’t provide valid numbers as input, how can you trust the output? Garbage in, garbage out.
The last point highlights where we diverge the most. I have been thinking in terms of the software development process, whereas your examples (so far) have been studying the functionality of a software product. So…let’s say I’m the software development manager for a small (6-10 developers) custom software house, with a 50% annual turnover rate. How can Six Sigma help me deliver a higher quality product?
I’ve been nudged, but I’m not convinced. I will modify my underlying argument to: There is a place in the software development process for statistical analysis, but attempting to define and measure the data necessary to reach a definitive six sigma level is impractical.
With that said, there are plenty of things I agree with in this thread . . . but I’ve run out of time to explain.
Re: [TOOLBOX] Six Sigma: Valid for Software?
Not for nothing, I see a lot of reads, but no one else is tossing in any comments. Any other thoughts out there?
Re: [TOOLBOX] Six Sigma: Valid for Software?
Thought you all might be interested in this article:
"Six Sigma? No Thanks."
By Scott Dalgleish
Re: [TOOLBOX] Six Sigma: Valid for Software?
Originally posted by testgeek: Thought you all might be interested in this article: "Six Sigma? No Thanks."

Thanks for bringing up the article, Tim. It was actually very interesting to read. Unfortunately, for me, the article's content was largely lacking. The author introduces declarative statements that he does not back up. To wit:
"While I believe in Six Sigma’s content, I will also add that as a trend, it has been harmful to our profession."
"This is the case with Six Sigma, as with all previous repackaged quality programs."
"Six Sigma is a trend that has had an especially negative effect on the quality profession."
Wonderfully declarative statements - but not backed up at all with any data points, thus making responses a waste of time as far as I am concerned. (It sounds like he has more trouble with the ASQ and how it perceived Six Sigma rather than Six Sigma itself but he seems to fail to make that distinction.)
The author also says, at one point: "Six Sigma doesn’t help me. I have a good quality culture at my company and a team of managers that use quality theory in all their decisions." My response would be: Fine. Do not use it. Six Sigma is just one tool in a large toolbox. Many proponents of Six Sigma would definitely say that if you do not need it, do not use it.
With this said, I agree with him on one point: "If this is the trend that gets you interested in quality, then you must focus on the content, not on the label." I agree, and that is what I have tried to show here in this thread. Six Sigma can be utilized in different fashions and those elements that you use have probably been around for a while in another guise. So do not get too hung up on the label "Six Sigma" and automatically say that this label can only be applied to manufacturing. Rather, look at Six Sigma as a tool and see if it is workable in your organization, even if only in part. I will also say that I agree with the author that concepts like Six Sigma (or CMM or TMM or TQM or PSP) can all become dangerous if they become non-reflective trends in the industry. As the author states, he likes the content but not how the content has been utilized (such as by some consulting companies). I am certainly okay with that sort of opinion, but then his vehemence against using Six Sigma, even in part, seems to again reflect that lack of distinction between how some proponents viewed Six Sigma and the actual efficacy of Six Sigma itself. I would rather debate something on its merits than on the actions of some (perhaps misguided) practitioners.