deadlock unique every job #66

Open
jvdgrift opened this issue Nov 23, 2016 · 2 comments
@jvdgrift

I'm currently facing a deadlock issue in our production environment: a single unique job is scheduled to run every day at 03:30 AM. The schedule is triggered, but all 6 of our workers report a lock error (deadlock) and the job is not run on any of the Node servers.

03:30:01.143 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.132 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.132 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.130 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.089 Lock error: LockError: Exceeded 0 attempts to lock the resource
03:30:01.087 Lock error: LockError: Exceeded 0 attempts to lock the resource

I also switched to the kue-scheduler master branch (with the retry count of 3), but got the same result.
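
For context, "Exceeded 0 attempts to lock the resource" is the error node-redlock raises when its retryCount is 0 and the resource is already locked on the first attempt, which matches the log lines above. A minimal sketch of that behaviour; the lock name, TTL and runJob are illustrative, not kue-scheduler's actual internals:

```js
const Redis = require('ioredis');
const Redlock = require('redlock');

const redlock = new Redlock([new Redis()], { retryCount: 0 });

function runJob() {
  console.log('running the 03:30 job');
}

// All six workers fire at 03:30 and race for the same lock. With
// retryCount: 0, any worker that finds the resource already held fails
// immediately with "LockError: Exceeded 0 attempts to lock the resource".
// If the lock key is stale (left over from a previous run), every worker
// fails and the job never runs: the deadlock described above.
redlock.lock('locks:unique-job', 5000)
  .then(function (lock) {
    runJob();
    return lock.unlock();
  })
  .catch(function (err) {
    console.error('Lock error:', err.message);
  });
```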

We use ioredis and Redis with Sentinels in production...

Running the code on localhost with 6 Node instances and 1 Redis (no Sentinels) starts the unique job once (no deadlock).

Any ideas what could be wrong?

The job was once non-unique and we needed that fixed... but now we have this deadlock issue. Could it be that the information stored in Redis is mixed up? What needs cleaning up?
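
As a starting point, here is a sketch of how the relevant Redis keys could be inspected. The key names assume the default kue prefix "q:" and are only guesses at how kue-scheduler stores its data; verify them with a SCAN against your own instance before deleting anything:

```js
const Redis = require('ioredis');

const client = new Redis();

async function inspect() {
  // keys kue-scheduler creates for schedules/expiry (assumed pattern)
  console.log(await client.keys('q:scheduler:*'));
  // assumed hash mapping unique job names to stored job ids; a stale entry
  // pointing at a removed job would show up here
  console.log(await client.hgetall('q:unique:jobs'));
  client.quit();
}

inspect();
```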

Side note: when I tested on localhost with 6 Node processes and the job finished very quickly (in less time than the lock acquisition window; it just logged a statement and returned), the unique job got executed 4 times instead of once. The lock got released while other acquire attempts were still running, I suppose, so they got hold of the lock in turn and also executed the job.
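
That duplicate-run behaviour is consistent with a lock that is released (or expires) before the exclusivity window is over: workers that are still retrying then acquire it and run the job again. A sketch of the idea with node-redlock directly; the window and TTL values are illustrative, not kue-scheduler's defaults:

```js
const Redis = require('ioredis');
const Redlock = require('redlock');

const redlock = new Redlock([new Redis()], { retryCount: 3, retryDelay: 200 });

// How long no other worker may run the same unique job, regardless of how
// quickly the job body itself returns.
const EXCLUSIVITY_WINDOW = 60 * 1000;

redlock.lock('locks:unique-job', EXCLUSIVITY_WINDOW)
  .then(function (lock) {
    console.log('job body');   // returns in a few milliseconds
    // Intentionally NOT calling lock.unlock() here: releasing the lock as
    // soon as the job returns lets workers that are still retrying acquire
    // it and execute the job again, which is the 4-executions symptom above.
  })
  .catch(function (err) {
    console.error('Lock error:', err.message);
  });
```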

@jvdgrift
Author

OK... the unique value of the job in question was pointing to a non-existing job. Removing the unique entry solved it... I think we removed the completed job in kue-ui, and that probably doesn't clean up kue-scheduler's unique data? Perhaps the code could check for this and, if the unique reference points to a non-existing job, just create a new job?
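
A sketch of what that checkup could look like from the outside: look up the job id stored for the unique name and, if kue no longer knows that job, drop the stale entry so a fresh job can be created. Both the hash key "q:unique:jobs" and the assumption that it stores kue job ids are guesses about kue-scheduler's internals; kue.Job.get is the documented kue API used here.

```js
const kue = require('kue');
const Redis = require('ioredis');

const queue = kue.createQueue();   // kue needs a queue/redis connection
const client = new Redis();

function checkUnique(uniqueName, done) {
  client.hget('q:unique:jobs', uniqueName, function (err, jobId) {
    if (err || !jobId) { return done(err); }
    kue.Job.get(jobId, function (err, job) {
      if (err || !job) {
        // stored id points at a removed job: clear the stale mapping so the
        // scheduler can create (and lock) a brand new unique job
        return client.hdel('q:unique:jobs', uniqueName, done);
      }
      return done(null, job);
    });
  });
}

checkUnique('my-unique-job', function (err) {
  if (err) { console.error(err); }
  client.quit();
});
```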

@lykmapipo
Owner

@jvdgrift I would appreciate a pull request to add that checkup.
