run bpipe in parallel #58
Comments
From [email protected] on 2013-09-28T22:27:34Z
I have seen the same error; this appears to be due to status UNKNOWN being returned from the polling.
.bpipe/logs/2832.bpipe.log (excerpt):
.bpipe/logs/2832.log (excerpt):
Starting Pipeline at 2013-09-28 23:36
=========================================== Stage hello ============================================
From [email protected] on 2013-09-29T04:15:08Z
Thanks for posting the stack trace. I've fixed the problem that occurred there. I'm not sure whether it will address this issue, because the error was in the handling of the UNKNOWN status, so the fix does not prevent UNKNOWN from being returned. However, the intent of the code is to retry when UNKNOWN is returned, so it may behave better now that it can retry. I will be releasing a new version with this fix soon.
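To illustrate the retry-on-UNKNOWN behaviour described above, here is a minimal Python sketch. It is not bpipe's actual code: the function name `wait_for_job`, the `poll` callable, the `max_unknown` limit, and the status strings `COMPLETE` and `FAILED` are all assumptions made for the example. Only the `UNKNOWN` status comes from this thread.

```python
import time
from typing import Callable


def wait_for_job(poll: Callable[[], str],
                 max_unknown: int = 3,
                 interval: float = 0.0) -> str:
    """Poll until a terminal status is seen.

    Up to `max_unknown` consecutive UNKNOWN replies are tolerated and
    retried; one more than that and the function gives up and returns
    "UNKNOWN". Status names other than UNKNOWN are hypothetical.
    """
    unknown_seen = 0
    while True:
        status = poll()
        if status == "UNKNOWN":
            unknown_seen += 1
            if unknown_seen > max_unknown:
                return "UNKNOWN"
        else:
            # Any definite answer resets the UNKNOWN counter.
            unknown_seen = 0
            if status in ("COMPLETE", "FAILED"):
                return status
        time.sleep(interval)
```

A transient UNKNOWN between two definite answers is then absorbed rather than failing the pipeline, while a scheduler that only ever answers UNKNOWN still terminates the wait.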
From [email protected] on 2013-09-29T10:12:46Z
In our case the jobs are removed from the cluster after completion, so qstat no longer reports them, which in the bpipe-torque.sh script triggers the UNKNOWN status. I can check on alternatives with our cluster admins. We have a similar issue with job submission: if we run 'bpipe run' as a job itself (e.g. from a worker node), it fails because our worker nodes are not configured to allow job submission. We have a script that works around this, but it requires modifying the bpipe-torque.sh script. Is there a way to make the two steps (job submission and job polling) more generic or flexible?
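One way to picture the flexibility asked for above is to treat submission and polling as two independently configurable commands. The Python sketch below is purely illustrative and does not reflect how bpipe is implemented: the `SchedulerCommands` class, its fields, and the convention that each command prints a job id or a status on stdout are assumptions.

```python
import shlex
import subprocess
from dataclasses import dataclass


@dataclass
class SchedulerCommands:
    """Hypothetical pair of site-specific commands.

    submit: command that receives a script path and prints a job id.
    status: command that receives a job id and prints its status.
    """
    submit: str
    status: str

    def submit_job(self, script_path: str) -> str:
        result = subprocess.run(shlex.split(self.submit) + [script_path],
                                capture_output=True, text=True, check=True)
        job_id = result.stdout.strip()
        if not job_id:
            # Exit code 0 but empty output: the situation in the original report.
            raise RuntimeError("submit command succeeded but printed no job id")
        return job_id

    def job_status(self, job_id: str) -> str:
        result = subprocess.run(shlex.split(self.status) + [job_id],
                                capture_output=True, text=True)
        # No output (e.g. the job has already left the queue) maps to UNKNOWN.
        return result.stdout.strip() or "UNKNOWN"
```

With this split, a site where worker nodes cannot submit jobs could point `submit` at its own wrapper script, and a site where finished jobs vanish from qstat could point `status` at a command that consults another source, without editing the shared script.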
From [email protected] on 2012-08-20T12:09:09Z
I tried to run bpipe on a cluster with qsub support. I tested it on a simple pipeline but got this error message:
...
Pipeline failed!
Job runner bpipe.TorqueCommandExecutor failed to return a job id despite reporting success exit code for command:
bash /mnt/Home/zhuw/prj/dn/bpipe/bpipe-0.9.5.3/bin/../bin/bpipe-torque.sh start
Raw output was:[
]
...
Could any of you provide a good example, including the config settings, that I could use as a test?
Thanks,
Wei
Original issue: http://code.google.com/p/bpipe/issues/detail?id=58