Two closely related issues here. I started using plugin :solid_queue in my service to have puma manage the Solid Queue worker process, and it has broken phased restarts, causing an outage on every deployment because I have a single VPS serving my site. This seems like a design issue, contrary to the project's goal of enabling straightforward single-server use.
We do phased restarts whenever possible (deploys that don't touch the bundle): puma replaces its workers one at a time, which gives us zero-downtime deploys.
The first issue is that the plugin crashes on start because it assumes the Rails app is preloaded. I found a workaround (our commit) that might be working, but I suspect it breaks phased restarts in a way that's currently masked by the second issue.
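For context, a minimal sketch of the kind of puma config this setup implies. This is not our actual file: the worker count is a placeholder, and the explicit `preload_app! false` is an assumption added to make the point visible, since puma cannot do phased restarts when the app is preloaded in the master process.

```ruby
# config/puma.rb (illustrative sketch, not the real config from this service)

# Cluster mode: phased restarts only apply when puma runs multiple workers.
# The count here is a hypothetical value.
workers 2

# Phased restarts are not available when the app is preloaded, so preloading
# is switched off explicitly. As a result the app is loaded in each worker,
# not in the puma master process where plugins are started.
preload_app! false

# Have puma start and supervise the Solid Queue process.
plugin :solid_queue
```

A phased restart is then requested with `pumactl phased-restart` or by sending SIGUSR1 to the puma master process.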
The second issue is that the plugin deliberately crashes puma during phased restarts. In the logs, I see:
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Starting phased worker restart, phase: 1
May 15 15:31:53 lobste.rs puma[282083]: [282083] + Changing to /srv/lobste.rs/http
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Stopping 282297 for phased upgrade...
May 15 15:31:53 lobste.rs puma[282083]: [282083] - TERM sent to 282297...
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Stopping 282297 for phased upgrade...
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Stopping 282297 for phased upgrade...
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Stopping 282297 for phased upgrade...
May 15 15:31:53 lobste.rs puma[282083]: [282083] - Stopping 282297 for phased upgrade...
May 15 15:31:55 lobste.rs puma[282083]: [282083] Detected Solid Queue has gone away, stopping Puma...
May 15 15:31:55 lobste.rs puma[282083]: [282083] - Gracefully shutting down workers...
May 15 15:32:24 lobste.rs puma[282083]: [282083] === puma shutdown: 2025-05-15 15:32:24 +0000 ===
Solid Queue doesn't seem to understand a phased restart and is inappropriately halting the puma supervisor.
Then, a few seconds later, systemd notices that the puma service has crashed and cold starts it. A single worker takes 30+ seconds to start, and by then the nginx queue has filled up, so we throw a lot of 502s and limp back into normal service while the workers get hammered.