Task issues #71
Yeah, there's still possibly an issue with creating tasks from signals which I can't identify, as it seems intermittent and I've never seen it myself in testing or in my own (admittedly smaller) deployments. The "reset tasks" button can only partially work due to the timeouts in the web interface: it has to trigger every signal again in a loop, so on a larger install it basically can't finish before the request times out. Can you try the reset tasks CLI script that's in https://github.com/meeb/tubesync/blob/main/docs/reset-tasks.md?
That's why I made the log request :). I've been trying to troubleshoot, and since I've only had a small portion of my list added, I've wiped the DB and started fresh twice but keep encountering the same issue. I've been adding smaller channels this time around instead of the 1000+ video channels and just hit the task problem again. I ran the CLI command and hit an error.
It hangs for quite a while before throwing it. I attempted the CLI command again and got the same error as above, except I'm still waiting for it to finish; it appears completely frozen again. Edit: after 5 minutes of hanging it has produced only 5 more of the same messages.
Ah, that's a db write occurring in the container and then via the CLI script at the same time, which SQLite can't handle. Try stopping the container first and then running:

docker run \
  -ti \
  -v /path/to/your/config:/config \
  -e PUID=1000 \
  -e PGID=1000 \
  --entrypoint=/usr/bin/python3 \
  ghcr.io/meeb/tubesync:latest \
  /app/manage.py reset-tasks

Edit as required. That will run the reset-tasks command in a one-off container while nothing else is writing to the database. You just need to adjust the volume path and IDs to match your setup.
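For reference, the full sequence around that one-off command might look something like the sketch below; the container name tubesync is an assumption, so substitute whatever yours is called.

```bash
# Stop the running TubeSync container so nothing else writes to the SQLite database
docker stop tubesync

# ...run the docker run command above and wait for it to finish...

# Start the normal container back up
docker start tubesync
```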
Thinking about it, this should probably be in the advanced docs as well.
How long should it run? I've not gotten any stdout messages for a while now.
It's not indexing or doing any requests work, it's just looping over all the media items one by one and recreating all the expected tasks, so if you have a very large number of media items it might take quite a while. It's safe to just leave it running until it finishes with a "Done" though. It takes about 30 seconds with my 5000 test media items, so I guess you have quite a lot more than that.
https://github.com/meeb/tubesync/blob/main/tubesync/sync/management/commands/reset-tasks.py ... if you're curious as to the steps. It's slow because, to trigger the required signals that create the tasks, you have to save each media item individually.
All together it took 36 minutes to execute. I currently have 3318 media items indexed and 1002 tasks were created. I can't check anything further from the task page because it just shows "Failed to retrieve tasks. Database unreachable."
Seems there's still some workers tripping over each other?
It then recovers and moves on to the next channel, but keeps tossing the DB errors every 10 tasks or so.
Is your config directory on a network share (NFS/SMB), and have you changed the number of workers from the default?
I had upped workers to 4 on a previous DB and had issues, but this iteration is purely default. No NFS/SMB, just an SSD passed through to a VM with a directory passed to Docker. I had read that issue/enhancement previously and I guess I missed the relevant part.
Yes, that "Failed to retrieve tasks. Database unreachable." error is thrown by https://github.com/arteria/django-background-tasks/blob/master/background_task/tasks.py#L254. Since the change to flat indexing, which is much faster, you really probably don't need more than one worker, so setting it to 1 is probably fine.
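For anyone following along, the worker count is just an environment variable on the container. A minimal sketch of pinning it to 1, assuming the standard single-container setup from the README (the /downloads mount and the 4848 port mapping are taken from that example and may differ on your install):

```bash
# TUBESYNC_WORKERS=1 limits TubeSync to a single background worker;
# paths, port and container name below are placeholders
docker run \
  -d \
  --name tubesync \
  -e PUID=1000 \
  -e PGID=1000 \
  -e TUBESYNC_WORKERS=1 \
  -v /path/to/your/config:/config \
  -v /path/to/your/downloads:/downloads \
  -p 4848:4848 \
  ghcr.io/meeb/tubesync:latest
```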
After some testing with 50k generated media items, SQLite can't properly manage the concurrent writes required. This could be mitigated with very careful thread management, but it's probably easier to just use the right tool for the job. I'll work on it in #72. I would guess your very large deployment will continue to be twitchy until that's complete, unfortunately.
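Once #72 lands, moving off SQLite should just be a database URL in the container environment. A sketch for an external Postgres server, assuming the DATABASE_CONNECTION variable described in the TubeSync docs (treat the exact variable name and URL format as something to verify against the current README; a MariaDB/MySQL URL should work the same way, as used later in this thread):

```bash
# DATABASE_CONNECTION is the assumed variable name from the TubeSync docs;
# the host, credentials and database name are placeholders
docker run \
  -d \
  --name tubesync \
  -e TUBESYNC_WORKERS=1 \
  -e DATABASE_CONNECTION=postgresql://tubesync:yourpassword@192.168.1.10:5432/tubesync \
  -v /path/to/your/config:/config \
  -v /path/to/your/downloads:/downloads \
  -p 4848:4848 \
  ghcr.io/meeb/tubesync:latest
```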
Gotcha, I was wondering about that myself. I'm nowhere near 50k, still at 3400 media items tracked over 20 sources, and I'm hitting the performance issues described above. Just wait until I add TED, with 100k+ videos itself.
It seems from my testing that once I get into the 2k-ish media item range it starts hitting this hard.
I have 5-6k media items in my personal install with a single worker and SQLite, with no issues whatsoever, but from the issues I can see it's common to find errors in some deployments as well. I'm not entirely sure how people are hitting these database locking issues (assuming their SQLite databases aren't on network shares etc.), given I can't trigger them with a single worker even when trying to, but generally I expect the advice in the future is just going to be "got database locking issues? migrate to postgres" or similar. It'll be fixed before TubeSync comes out of pre-release at any rate. Thanks for being the early testers.
@meeb This is a really great app, which is why I keep banging on it. Sure, I can do all the same stuff using scripts, but that's not very high on the GF acceptance factor or ease of use.
Thanks! Feel free to keep mentioning issues if you find them, that's what the pre-release is for, providing you're fine with it being occasionally broken until v1.0.
I'm totally OK with that, that's why it's alpha.
Thanks for the anecdotal confirmation of the issue. I suspect this is going to fix itself once a bunch of other issues get worked on (like #93 and #72, etc.). Obviously, setting up TubeSync somewhere with 50 large channel sources and waiting until something breaks isn't a very efficient method of testing, so I've still not seen this directly myself yet :)
That's what I'm here for, haha!
Confirming that this is happening for me as well. I have 767 tasks scheduled to run immediately and no running tasks. I have tried various worker values (1, 2, and 4), but it doesn't seem to have changed anything. I have 34 sources and am running on Docker on Unraid. Thanks for building this, it seems super useful!
Wanted to add that I've experienced this a few times as well, running on a deployment that had both an external DB server (MariaDB) and TUBESYNC_WORKERS=1 set up before ever adding any media. As discussed earlier, I think each time I actually triggered the issue by trying to use the Reset Tasks button in the GUI. It seems like after the timeout, some sources get into a broken state where they no longer even generate a task to index, etc. I've been able to resolve it each time by getting onto a terminal session in the container and running the reset-tasks management command manually. Since it seems like there are quite a few of us running TubeSync with large amounts of sources/media, and task-related issues are the biggest problem I've had, these are my suggestions based on my experience:
- Use an external database server (MariaDB in my case) rather than SQLite, set up before adding any media.
- Keep TUBESYNC_WORKERS=1.
- Avoid the Reset Tasks button in the GUI; run the reset-tasks management command from a terminal in the container instead.
Hopefully in the future these tweaks won't be necessary, but I thought I'd share my experience for anyone playing with the current builds using lots of sources with a lot of media.
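For reference, running that command against an already-running container might look like the sketch below (the container name tubesync is an assumption; note meeb's earlier caveat that doing this against a live SQLite database can collide with the container's own writes, which doesn't apply here since this deployment uses an external MariaDB):

```bash
# Run the reset-tasks management command inside the running container;
# "tubesync" is a placeholder for your container name
docker exec -ti tubesync /usr/bin/python3 /app/manage.py reset-tasks
```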
Thanks for the report and testing. I'm fiddling with a full migration of the background tasks system to Celery, but it's a big change and taking a while, primarily around making sure I don't break everyone's installs with a botched automated data migration. That Postgres issue should be solved sooner though! There is, alas, very little that can be done about the quite high memory usage while indexing large channels, as the upstream libraries the worker uses store a lot of information in memory. However, the workers do free this memory after indexing with the new workers, so that should eventually improve to burst usage rather than permanent usage.
Did you manage to run the reset-tasks command (CLI) on unRAID? Just asking because I'm not really sure how to call it from the container without using the web GUI button (Reset Tasks).
Currently running latest, but had similar issues in v0.9.
As mentioned elsewhere, I've got a massive list of channels to add and tasks keep seeming to freeze for me. There seem to be two different issues.
The 2 playlists were queried, but when I added the channel it is again listed as "running" with nothing being executed.