stressTests
# Copyright: (C) 2010 RobotCub Consortium
# Author: Lorenzo Natale
# CopyPolicy: Released under the terms of the LGPLv2.1 or later, see LGPL.TXT

Lorenzo Natale's bug hunting:

====================================================================

In $YARP_ROOT/example/stressTests you will find some code plus a couple of scripts that produce some undesired behavior. It is basically a remote_controlboard accessed concurrently from a number of controlboards (stressrpc.cpp).

The code and scripts are simple, so I won't go into much detail. Here is a summary of the behavior I see:

    ./stress.sh    works fine
    ./stress2.sh   works fine, but at the end fails to quit yarpdev; have a look at the output of ps (this is the undefined behavior I see)
    ./stress3.sh   works fine

====================================================================

Lorenzo:

I have uploaded stressrpcMD.cpp (MD stands for "motion done"). Run the test_motor device like this:

    yarpdev --device controlboard --subdevice test_motor --delay 80

and then two instances of stressrpcMD:

    console1: stressrpcMD --id 0
    console2: stressrpcMD --id 1

You *should* see that sometimes one of the two (the second?) hangs within the "checkMotionDone" call (and sometimes even in the port "open" call). The problem is more frequent if you kill and restart one of the two clients. When this happens, the test_motor device does not close properly unless the clients are killed (similarly to the previous bugs I have recently reported).

I believe the problem has to do with timing. In fact, it took me a while to replicate it in the test_motor device. On the robot, the problem was triggered very reliably in the getPid function, which took 70-90 ms to complete. For this reason I have added a Time::delay() to the checkMotionDone function of the test_motor device (this delay can be changed with the --delay parameter).
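The kill-and-restart scenario described above can be sketched as a small shell helper. This is a hypothetical script, not one of the files in the example directory: the names run_bg, DRY_RUN, DELAY, ROUNDS and PAUSE are assumptions made for illustration, and only the yarpdev and stressrpcMD command lines come from the text. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with YARP): start the test_motor device
# and two stressrpcMD clients, then kill and restart client 1 a few times.
# DRY_RUN=1 (the default) only prints the command lines.
DRY_RUN="${DRY_RUN:-1}"
DELAY="${DELAY:-80}"     # value passed to --delay, as in the text
ROUNDS="${ROUNDS:-3}"    # assumed number of kill/restart rounds
PAUSE="${PAUSE:-5}"      # assumed seconds to let client 1 run before killing it

# Print the command in dry-run mode, otherwise start it in the background.
run_bg() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" &
    fi
}

run_bg yarpdev --device controlboard --subdevice test_motor --delay "$DELAY"
# Give the device a moment to open its ports before the clients connect.
[ "$DRY_RUN" = "1" ] || sleep 2
run_bg stressrpcMD --id 0

i=0
while [ "$i" -lt "$ROUNDS" ]; do
    run_bg stressrpcMD --id 1
    if [ "$DRY_RUN" != "1" ]; then
        client_pid=$!          # PID of the client just started by run_bg
        sleep "$PAUSE"
        kill "$client_pid"
    fi
    i=$((i + 1))
done
```

With DRY_RUN=0 the script leaves yarpdev and client 0 running in the background when it exits, so they have to be stopped by hand; checking ps at that point is where a hung client or a yarpdev that refuses to quit would show up.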
Caveats: the problem happens often but with variable likelihood. Using --delay 80 appeared to trigger it more often, but I'm no longer sure. Maybe adding the sleep was enough to make the problem more probable, and I got fooled into thinking the "80" number was more important than it is... I don't know. Anyway, let's first see if you can reproduce the problem; it might be machine dependent.

====================================================================

Paul:

I've added a stress test that doesn't use the controlboard: "smallrpc"

    smallrpc --server
    smallrpc --client --name /client0
    smallrpc --client --name /client1
    smallrpc --client --name /client2
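The smallrpc commands above follow a regular pattern, so the number of clients can be made a parameter. The following is a minimal sketch, not part of the example directory: launch, start_all, DRY_RUN and NCLIENTS are hypothetical names, and only the smallrpc flags shown above are used. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with YARP): one smallrpc server plus
# NCLIENTS clients named /client0, /client1, ...
# DRY_RUN=1 (the default) only prints the command lines.
DRY_RUN="${DRY_RUN:-1}"
NCLIENTS="${NCLIENTS:-3}"   # assumed default, matching the three clients in the text

# Print the command in dry-run mode, otherwise start it in the background.
launch() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" &
    fi
}

# Start (or print) the server first, then each client in turn.
start_all() {
    launch smallrpc --server
    n=0
    while [ "$n" -lt "$NCLIENTS" ]; do
        launch smallrpc --client --name "/client$n"
        n=$((n + 1))
    done
}

start_all
# When actually running, block until all background processes exit.
[ "$DRY_RUN" = "1" ] || wait
```

Raising NCLIENTS is a cheap way to increase contention on the server if the three-client run does not show the problem on a given machine.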