We've recently been using the MultiSyncDataCollector and encountered this error (raised here in collectors.py):
RuntimeError: This process waited for 1000.0 seconds without receiving a command from main. Consider increasing the maximum idle count if this is expected via the environment variable MAX_IDLE_COUNT (current value is 1000).
In our case, this error is actually not an error. We're doing some other tasks in between collecting data, which ends up triggering this.
The error message itself points to the solution (setting MAX_IDLE_COUNT). However, we feel like this is not a very ergonomic interface.
To the questions:
Out of curiosity, what is the motivation behind this error?
Could this be potentially replaced by a warning instead of an error?
Would you be open to changing this interface? For instance, the timeout value could be passed as an argument, with an option to disable it altogether.
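To make the last question concrete, here is a minimal sketch of the kind of interface we have in mind. It is not the library's code: `wait_for_command`, `idle_timeout`, and `poll_interval` are hypothetical names, and a plain `queue.Queue` stands in for whatever channel the worker actually listens on. Passing `idle_timeout=None` disables the check entirely.

```python
import queue
import time
from typing import Optional


def wait_for_command(
    cmd_queue: "queue.Queue[str]",
    idle_timeout: Optional[float] = None,  # hypothetical argument; None disables the check
    poll_interval: float = 0.01,
) -> str:
    """Block until a command arrives.

    Raises RuntimeError only if idle_timeout is set and that many seconds
    pass without a command.
    """
    start = time.monotonic()
    while True:
        try:
            # Poll in short intervals so the idle check can run between polls.
            return cmd_queue.get(timeout=poll_interval)
        except queue.Empty:
            if idle_timeout is not None and time.monotonic() - start > idle_timeout:
                raise RuntimeError(
                    f"Waited {idle_timeout} seconds without receiving a command from main."
                )
```

With something like this, a caller that does long-running work between batches could pass `idle_timeout=None`, while tests could pass a small value to kill stalled workers quickly.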
Out of curiosity, what is the motivation behind this error?
In the early days of the lib we were very concerned about an env erroring and a worker mysteriously hanging. This was intended as a safeguard (mainly for the tests, I must admit!).
Now that more and more of us are doing agentic stuff where the time it takes to get a batch out of a collector is nondeterministic, this makes less and less sense.
I'd be open to setting that time to infinity by default and to a short span for the tests. That way we get to quickly kill stalling processes, but users won't be impacted as much.
(in the meantime you can do MAX_IDLE_COUNT=A_SUPER_BIG_NUMBER python myscript.py)
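If changing the launch command is inconvenient, the same workaround can probably be applied from inside the script. This is a sketch under two assumptions that are not confirmed in this thread: that the library reads `MAX_IDLE_COUNT` once at import time, and that worker processes inherit the parent's environment. The value `10**9` is an arbitrary large number chosen for illustration.

```python
import os

# Set the variable before importing the collector module, on the assumption
# that the value is read once at import time. setdefault() keeps any value
# already exported in the shell.
os.environ.setdefault("MAX_IDLE_COUNT", str(10**9))

# Import the collector only after this point, e.g.:
# from torchrl.collectors import MultiSyncDataCollector
```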