We have a connectivity problem when trying to execute more than 10 distributed tests. An "unable to connect to remote machine" error appears at random. When this happens, the machines have to be restarted before testing can continue, which is unacceptable. We need to run 100+ test cases in a distributed environment without interruption. Since our real cases are complicated, we developed the simplest test that reproduces the problem.
Here is the setup.
1 master machine with TestComplete. 3 slave machines with TestExecute and ICQ. All running Windows XP SP1.
The test is simple: run ICQ on each machine and close it after a few seconds. Run 100 such cases.
From time to time the connectivity issue appears, at random, and we need to reboot the machines.
The problem appears in versions 3.05, 3.06, 3.07 and 3.08. We have sent dump files to AutomatedQA, but so far there is no solution. We understand that it is hard to reproduce, but the problem is a showstopper for us.
Has anybody experienced this problem and found a solution? Is there a workaround? Can we force TC to dump some readable information so that we can track the problem down?
Have you tried killing and re-running tcrea.exe on the slave machines instead of rebooting? That's only a workaround, but it is still faster than rebooting. For me, though, the problem seemed to disappear after installing TC 3.06.
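In case it helps, here is a rough Python sketch of the "kill" half of that workaround, run from the master. It is only an illustration under assumptions: it relies on the Windows `taskkill` command being available on the master (it ships with XP Professional, as far as I know), on your account having admin rights on the slaves, and the slave names are made up. It does not restart tcrea.exe; that still has to be done on each slave by whatever means you normally use.

```python
import subprocess


def build_kill_command(host, image="tcrea.exe"):
    """Return the taskkill argument list that force-ends `image` on `host`."""
    # /S = remote system, /IM = image name, /F = force termination
    return ["taskkill", "/S", host, "/IM", image, "/F"]


def kill_on_slaves(hosts, run=subprocess.call):
    """Run taskkill against every slave and return {host: exit code}.

    `run` is injectable so the function can be tried out without
    touching any real machine.
    """
    return {host: run(build_kill_command(host)) for host in hosts}


# Example call (hypothetical machine names, replace with your own):
#   kill_on_slaves(["SLAVE1", "SLAVE2", "SLAVE3"])
```

A non-zero exit code for a host usually means taskkill could not reach the machine or did not find the process, so it is worth checking the returned dictionary before re-running the tests.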
Thanks to the joint effort with AutomatedQA, the "Distributed problem" was resolved. The core of the problem is the Pentium IV Hyper-Threading option. When it is switched off, the problem disappears. The other way is to configure the Process Affinity setting for tcrea.exe.
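For anyone who wants to script the affinity setting rather than click through Task Manager: Windows expresses process affinity as a bit mask, with bit N standing for logical CPU N. The small Python helper below only computes that mask; it is a sketch, and the function name is my own, not anything from TestComplete. On a Hyper-Threading Pentium IV the two logical CPUs are 0 and 1, so pinning tcrea.exe to one of them means a mask with a single bit set.

```python
def affinity_mask(cpus):
    """Build a Windows-style affinity bit mask from logical CPU indices.

    Bit N of the result is set when logical CPU N is in `cpus`.
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU index must be non-negative: %r" % (cpu,))
        mask |= 1 << cpu
    return mask


# Pin to the first logical CPU only:
#   affinity_mask([0])
# Allow both logical CPUs of a Hyper-Threading processor:
#   affinity_mask([0, 1])
```

The resulting value is what the Win32 `SetProcessAffinityMask` function takes as its mask argument; the same choice can be made by hand in Task Manager via "Set Affinity" on the tcrea.exe process.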