
# Thread: Same test different results - How to Handle?

1. ## Same test different results - How to Handle?

Trying to advise a colleague who has an interesting problem in planning some testing for an application he is responsible for. Because of the mathematics used in the solution, it is perfectly normal that if you run the same test twice, inputting the same data, you get two different results. Does anyone have any advice on testing this type of application?

The application in question is a planning tool that generates storage plans for thousands of items stacked together in a confined space, where the expected removal of each item is known. The application plans the storage to reduce the amount of stacking/unstacking required. He is trying to get from the business some definition of what a "correct" or "ok" solution would be.

2. ## Re: Same test different results - How to Handle?

Lol, ouch! Well first, my question would be: why does the mathematics in the application return different answers for the same question? What is the logic behind this? Even if we're talking about maximizing the efficiency of stacking and unstacking, there is obviously going to be one single solution which is optimal. So, if I were a human looking at this problem, would I come up with multiple solutions for it? I just don't get how something can spit out multiple answers for the same question.

So I guess my advice in this case would be to understand the mathematics at the most basic level. Ask questions!!! "Why?" is a powerful question of logic (ask any 3 year old).

Honestly, in my opinion, there should be NO different answer for the same mathematical question. If there are multiple or equivalent options then why doesn't the application just give me both, or all, options and let me choose which I think will work the best for me?

3. ## Re: Same test different results - How to Handle?

It's very easy for a program to return different results each time it's run. There can be more than one optimal solution, but more likely they're looking for a solution that is "good enough" in order to reduce computation time. If the algorithm involves a random search, you can get different results each time, and that approach is perfectly acceptable.

As for how you test it...uh, heh. The only thing I can suggest is to take the results and then simulate stacking/unstacking and see if it's "excessive". If the program really is shooting for "good enough", then the best you can do is look at it and subjectively say, "It's good enough." Maybe run it a few times and see if you end up with a limited set of results. It might not return the same thing every time, but maybe it will return, for example, one of five results.
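The "simulate stacking/unstacking and see if it's excessive" idea can be sketched quite cheaply. The plan representation and the one-move-per-buried-item cost model below are my assumptions for illustration, not anything from the real tool:

```python
def unstack_cost(plan, removal_order):
    """Count how many dig-out moves a plan forces during removal.

    plan: list of stacks, each a list of item ids (end of list = top of stack).
    removal_order: item ids in the order they must be removed.
    An item buried under k others costs k moves to dig out; this crude
    model just charges for uncovering, not for where moved items go.
    """
    stacks = [list(s) for s in plan]
    moves = 0
    for item in removal_order:
        for stack in stacks:
            if item in stack:
                # items sitting on top of this one must be moved aside first
                moves += len(stack) - 1 - stack.index(item)
                stack.remove(item)
                break
    return moves
```

Feed each generated plan through something like this, and "excessive" becomes a number you can compare across runs rather than a gut feeling.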

Other than that, maybe a review of the algorithm to look for holes and possible optimizations?

4. ## Re: Same test different results - How to Handle?

I am certainly glad it's your colleague that has to test this thing and not me! Sounds like an excellent place for automation: run it 100 times as JHollen suggested, then correlate the results. It seems there can't be too many different results that are all optimal, can there? And what happens if 1 out of 100 times there is a result that is not acceptable? You really need some more input from the designer and the expectations of the customer. I would also advise a precisely written disclosure of the limits of your ability to test all permutations, and the risk involved.
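A rough sketch of the run-it-many-times-and-correlate idea; `plan_fn` here is a hypothetical stand-in for whatever call drives the real planner:

```python
from collections import Counter

def survey_outputs(plan_fn, inputs, runs=100):
    """Run a non-deterministic planner repeatedly on the same input and
    tally how many distinct plans it produces.  A small, stable set of
    outcomes is far easier to review than an open-ended one."""
    tally = Counter()
    for _ in range(runs):
        plan = plan_fn(inputs)
        # Use a canonical, hashable form of the plan as the tally key.
        tally[tuple(tuple(stack) for stack in plan)] += 1
    return tally
```

If the tally comes back with, say, five keys after 100 runs, each distinct plan only needs expert review once.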

I wish your colleague good luck!

5. ## Re: Same test different results - How to Handle?

Are you testing this?
http://en.wikipedia.org/wiki/Technology_...obability_Drive

I can accept that there are multiple answers, but if there are multiple answers, I would expect that the system would at least give me the option of selecting the best for me. Otherwise, it is, very much, like the Infinite Improbability Drive. Sometimes something is very close to what you want, but other times it's very far away from what you want.

My question would be, can you even judge which is good and which isn't good manually? Are there instances where certain setups would make sense and others in which it wouldn't?

6. ## Re: Same test different results - How to Handle?

Not quite the improbability drive, but apparently the maths behind the solution is very complex so it may only be a few steps away.

My colleague is trying to work with the customer to quantify what a good test result is. At the moment it requires an expert to really say whether the result is good or not, and we want to reduce the expert time required to verify. Even if we automated the tests to get multiple results, the need to verify each one against a set of criteria still exists, and at the moment those criteria are not very well described.

7. ## Re: Same test different results - How to Handle?

Guesswork:
Is the system (a) calculating all possible outcomes, giving each a score (for illustration purposes out of 10, with 10 as 'best') and then selecting a random result from all the possibilities that score 10? Or (b) does it try to solve in multiple threads and stop when any result scores a 10?
It shouldn't be very difficult to make the first kind of system deterministic.
The second kind is more difficult (and let's assume you want the problem solved as fast as possible, hence you are threading and stopping at the first "best" answer). Assuming it is always going to be non-deterministic, then you are already asking the right questions of the customer:

Is an okay result that it works and comes up with a workable solution? (still have to first find and then check the many results)
Or
Is an okay result only the optimal solution? - in which case you would need a way of deciding which of the multiple outputs are 'better' or if they really are equivalent. (good luck)

Me? I'd try to ensure that the application was made deterministic as it would make testing (and hence the risk management) far easier.
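For what it's worth, making a randomized search reproducible is often just a matter of pinning the RNG seed. A toy illustration, where the shuffle stands in for whatever randomized step the real solver actually uses:

```python
import random

def randomized_search(items, seed=None):
    """Toy stand-in for a randomized planner: returns items in a shuffled
    order.  With a fixed seed, repeated runs reproduce the same result;
    with seed=None, the RNG is seeded from the OS and runs can differ."""
    rng = random.Random(seed)  # private RNG: doesn't disturb global random state
    order = list(items)
    rng.shuffle(order)
    return order
```

Expose the seed as a test hook and the "same input, same output" property comes back for free, at least for option (a)-style systems.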

Of course, there may be an option (c) I'm missing. As I say, all the above is based on speculation and assumptions, which means it's probably useless. ;-)

8. ## Re: Same test different results - How to Handle?

It's not that unusual to have a situation where the same input produces different output.

[ QUOTE ]
He is trying to get from the business some deficition of what a "correct" or "ok" solution would be.

[/ QUOTE ]This, along with working with the client, and the "experts" makes sense.

9. ## Re: Same test different results - How to Handle?

[ QUOTE ]
Is an okay result that it works and comes up with a workable solution? (still have to first find and then check the many results)
Or
Is an okay result only the optimal solution? - in which case you would need a way of deciding which of the multiple outputs are 'better' or if they really are equivalent.

[/ QUOTE ]
It is the former of the two. I just listened in on a teleconference between one of the expert users and my colleague, and here is how they are now going to approach it.

There are a number of factors that can easily be measured for any result. Each on its own does not necessarily indicate whether a solution is good (although there are a few that directly indicate a bad solution). The users are going to provide samples of solution outputs along with how good they consider them to be. We will then look at the measures to see if we can find a correlation, and determine some combined characteristics that would indicate whether a result is good or not. Although we may still need a final opinion from our expert users on whether the result is OK or we have a bowl of petunias, we should at least have a basis for passing outputs to them for checking.
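That screening approach could start as something as simple as hard-limit checks on the directly-bad factors. The metric names and thresholds below are purely illustrative; the real ones would come from the expert-rated sample solutions the users are providing:

```python
# Illustrative metric names and limits -- placeholders, not the real ones.
HARD_LIMITS = {"restack_moves": 500, "wasted_space_pct": 25.0}

def screen_solution(metrics):
    """Return (verdict, reasons).  'bad' if any hard limit is exceeded,
    otherwise 'needs_review' -- an expert still gives the final word,
    but obviously-bad outputs never reach them."""
    reasons = [name for name, limit in HARD_LIMITS.items()
               if metrics.get(name, 0) > limit]
    return ("bad", reasons) if reasons else ("needs_review", [])
```

Even this crude filter cuts the expert's workload to the outputs that are actually in question.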

10. ## Re: Same test different results - How to Handle?

[ QUOTE ]
It's not that unusual to have a situation where the same input produces different output.

[/ QUOTE ]

Sure, because there really is no right answer. I'm sure there is some amount of science or art behind stacking boxes optimally. However, are there any real rules behind it? I think that if the expert was properly qualified then you could still generate a good list of rules from that expert which would produce a single or a few optimal results for each stacking assignment.

However, if you were to talk to a different expert, would there be the same rules? Would there be similarities in the rules? Where are the exceptions and where are the similarities? Lather, rinse, repeat with a few other experts. At best, you've come up with a set of rules that seem to be commonly accepted. At worst, your results are so diluted that everything that every one of the experts said is out the window.

I think that what you might also need to address is what are the common questions that the experts receive? I mean they are probably making decisions based on consumer feedback in realtime. This can be a great influence on the product also.

So, yes, there can be multiple outputs for the same input, but I think that is only the result of a lack of information and less a matter of math. I would still contend that if there were strict rules about stacking, you might find strong similarities between what all experts do, which would, more often than not, translate into a very common, very limited number of possibilities. Without this, we're drawing on the personal experiences of every single person who touches the software. So, unfortunately, without running some sort of detailed personality, experience, and background check on the user, I think it can only be educated guesswork.

I work in an industry that is highly specialized, also. We DO produce software where we make many assumptions for the client. However, there are still at least some strict guidelines, fairly accepted in the industry, that afford us some flexibility. With something that is entirely subjective to the person using the software, it might be more difficult. Although it is always possible to simply provide the individual user with a variety of solutions, the pros and cons of each, along with the information that each expert takes into consideration when performing these calculations themselves.
