Replies: 1 comment 23 replies
Is the issue that messages like "dispatched branch data_object_dbbcba117fd6da13" and "dispatched branch data_object_7fff03fa2ec02af0" creep along and are slow to print to your R console, or that the underlying work is slow to complete? In the former case, profiling may give us a better idea of what exactly is slowing down execution: https://books.ropensci.org/targets/performance.html#profiling. In the latter case, it could be that your SLURM cluster is busy and all your jobs are waiting in a queue. Monitoring can tell you whether your jobs are actually running: https://wlandau.github.io/crew.cluster/index.html#monitoring
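The two diagnostics above could be sketched in R roughly as follows. This is a minimal sketch rather than a definitive recipe: it assumes the targets, proffer, and crew.cluster packages are installed, that the code runs from the project root, and that your installed crew.cluster version provides `crew_monitor_slurm()`. The wrapper names `profile_dispatch()` and `list_slurm_jobs()` are hypothetical and exist only for illustration.

```r
# Hypothetical helper: profile the main R process while the pipeline runs.
# callr_function = NULL runs the pipeline in the current R session instead of
# a fresh callr process, so the profiler can observe the dispatching overhead.
profile_dispatch <- function() {
  proffer::pprof(targets::tar_make(callr_function = NULL))
}

# Hypothetical helper: list your SLURM jobs to see whether the workers are
# actually running or still waiting in the queue.
list_slurm_jobs <- function() {
  monitor <- crew.cluster::crew_monitor_slurm()
  monitor$jobs()
}
```

Calling `profile_dispatch()` addresses the "slow to print" case, and `list_slurm_jobs()` addresses the "busy cluster" case. Restart R after profiling, since running the pipeline in the current session can leave objects behind in the global environment.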
Help
Description
I have approximately 24k files. Below is a simplified version of the pipeline: it reads the file names from an RDS file, loads the files dynamically, and then branches over them in subsequent targets. Initially, I ran the pipeline in batches of 2,500 samples each (for instance, the RDS file first lists 2,500 samples, then 5,000, and so on). However, during the second batch, dispatching the data_object branches took an unusually long time, even though this step typically takes just a few seconds. I'm not sure whether this is related to how crew.cluster handles large increments of targets, or whether there is a limit on the number of targets that targets can dispatch.
This screenshot shows that many branches have been dispatched, but none have completed yet.
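The same "dispatched but not completed" picture can also be checked numerically instead of from a screenshot. The sketch below is an assumption-laden illustration, not part of the original pipeline: `count_progress()` is a hypothetical helper that tallies the `progress` column returned by `targets::tar_progress()`, and it assumes a targets version that reports statuses such as "dispatched" and "completed".

```r
# Hypothetical helper: count targets and branches by progress status.
# By default it reads the progress log of the pipeline in the current
# working directory via targets::tar_progress().
count_progress <- function(progress = targets::tar_progress()) {
  table(progress$progress)
}

# Illustrative data only, standing in for real tar_progress() output.
example_progress <- data.frame(
  name = c("data_object_a", "data_object_b", "files"),
  progress = c("dispatched", "dispatched", "completed")
)
print(count_progress(example_progress))
```

In a real project, calling `count_progress()` with no arguments from a second R session while the pipeline runs would show whether the "completed" count is moving at all.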