CORBA error "OBJECT_NOT_EXIST"
I'm testing a C/S app (CORBA, C++) with a few hundred data-driven testcases. For the last few months all test components (server, client, SilkTest & Agent) ran on the same machine, but we have now moved the server to a remote machine. That in itself is no problem at all, especially in terms of SilkTest.
But since we did that, single testcases randomly crash with a CORBA error saying "OBJECT_NOT_EXIST", which normally means that a server is down. That isn't the case here: the testcase after the one that crashed runs without problems. And if I repeat the crashing testcase afterwards, it runs perfectly too, so the failures look completely arbitrary and are almost impossible to reproduce.
Now we're wondering what causes this behavior: Is the SilkTest script too fast? (I wouldn't want to insert something like a general "sleep(1)" between testcases since this is no solution really.) If so, can I check by means of the script that a server exists but is "not ready" (or whatever)? Or is there something completely different going on?
I am no SW developer, but I have to discuss this topic with some Borland guys, so I'm looking for hints & suggestions. Has anyone ever come across something like that?
Thanks a lot in advance,
Re: CORBA error "OBJECT_NOT_EXIST"
I haven't worked directly with CORBA but I don't think that matters here.
What you've likely run into are timing issues that have been exposed by moving the client and server code to different machines. When they were together, they were all under the same constraint of being multi-tasked through the same processor. Now that they are apart, external interactions, such as other LAN traffic, can come into play and trigger problems. Previous assumptions and expectations no longer accommodate the larger environment.
You are right in your assessment that the key to fixing these kinds of problems is NOT to blindly introduce time delays. As a rule, inserting time delays rarely resolves a problem for long, because the problem often recurs as the test environment changes. Delays also unnecessarily increase full regression test run times, sometimes dramatically.
To make the problem go away permanently you need to consider introducing retry and handshake logic so that occasional timing hiccups are handled in stride. Handshake logic means:
1. waiting for something to become available (such as a link-sensitive menu item to become enabled),
2. performing a desired action (picking the menu),
3. then verifying that the expected action occurred (a dialog appeared) - all within time constraints that you define.
All three of these steps - wait, do it, check it - can include a time element.
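The three steps above can be sketched in a language-neutral way. This is only an illustration in Python, not SilkTest/4Test code: the names `poll`, `handshake` and `HandshakeError`, and the callbacks `is_ready`, `act` and `is_done`, are hypothetical and would have to be mapped onto whatever your test tool offers.

```python
import time


class HandshakeError(Exception):
    """Raised when the wait step or the check step exceeds its time limit."""


def poll(condition, timeout, interval):
    """Return True as soon as condition() is true, False once timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def handshake(is_ready, act, is_done, timeout=5.0, interval=0.01):
    """Wait for readiness, perform the action, then verify the result."""
    # 1. Wait for the precondition (e.g. a menu item becoming enabled).
    if not poll(is_ready, timeout, interval):
        raise HandshakeError("precondition not met within %.2fs" % timeout)
    # 2. Perform the desired action (e.g. pick the menu item).
    act()
    # 3. Verify that the expected outcome occurred (e.g. a dialog appeared).
    if not poll(is_done, timeout, interval):
        raise HandshakeError("expected result not seen within %.2fs" % timeout)
```

A testcase would pass in three small functions for "is it available?", "do it" and "did it happen?", and treat a `HandshakeError` as the signal to log an error and run recovery code.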
Retry logic means setting an upper time limit and a retry increment. The upper limit ensures that testing will not "wait forever" for something to happen; it'll instead report an error and call recovery code. The retry increment ensures that the tests run as quickly and efficiently as possible.
An example: For a particular action you might be willing to wait up to 5 seconds before reporting an error using a "check it again" increment of 1/100th of a second. Worst case: you'd wait the full 5 seconds and report an error. Best case: you wouldn't even wait the 1/100th of a second before moving on to the next action.
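As a rough sketch of that example (again in Python rather than 4Test, with the hypothetical name `wait_until`), a retry loop with a 5 second upper limit and a 1/100th of a second increment might look like this. It checks first and sleeps afterwards, so the best case costs no delay at all.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.01):
    """Poll condition() until it returns True or timeout seconds elapse.

    Returns the seconds spent waiting on success; raises TimeoutError otherwise.
    """
    start = time.monotonic()
    while True:
        # Best case: the condition already holds, so we never sleep.
        if condition():
            return time.monotonic() - start
        elapsed = time.monotonic() - start
        # Worst case: the upper limit is reached and we report an error.
        if elapsed >= timeout:
            raise TimeoutError("condition not met after %.2fs" % elapsed)
        # Sleep one increment, but never past the upper limit.
        time.sleep(min(interval, timeout - elapsed))
```

The caller would catch `TimeoutError`, report the failure and hand over to its recovery code. The two numbers are assumptions to be tuned per action, not recommended values.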
Re: CORBA error "OBJECT_NOT_EXIST"
Quote, originally posted by John J. Miller:
"You are right in your assessment that the key to fixing these kinds of problems is NOT to blindly introduce time delays."
Thanks for all the information, John. I'm especially glad to hear that you share my opinion about not using general sleep() commands. And of course your suggestions about how to handle these timing issues properly seem to me to be the only reasonable way.
Quote:
"1. waiting for something to become available (such as a link-sensitive menu item to become enabled)"
As long as this "something" is a GUI object (like an error message saying that a server is not available), SilkTest will do fine.