Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shutdown (without persist) hangs in Github actions #9

Open
nihohit opened this issue Jul 9, 2024 · 21 comments
Open

Shutdown (without persist) hangs in Github actions #9

nihohit opened this issue Jul 9, 2024 · 21 comments

Comments

@nihohit
Copy link

nihohit commented Jul 9, 2024

I can't repro this locally, so I don't know what the source of the issue is. The temporary DB, created with PgTempDB::async_new().await (so persist isn't set), hangs for minutes after a call to shutdown. Maybe there are some existing connections to the DB, but according to the documentation, this shouldn't block shutdown when persist isn't set.

This happens on a runner running Ubuntu noble, with postgresql-16 installed.

@boustrophedon
Copy link
Owner

Could you post a code sample? I briefly looked at the docs and it says "SIGKILL kills the postgres process without letting it relay the signal to its subprocesses, so it might be necessary to kill the individual subprocesses by hand as well." which I didn't know - maybe in my testing I only ever tried to kill it before doing work and the subprocesses were never spawned.

@nihohit
Copy link
Author

nihohit commented Jul 10, 2024

This sample completes ok locally, but in my Github Actions it hangs indefinitely after printing "closing temp DB"

#[tokio::test]
async fn pgtemp_test() {
    use pgtemp::PgTempDB;
    use sqlx::postgres::PgPoolOptions;

    let temp_db = PgTempDB::async_new().await;
    println!("got  temp DB");
    let address = format!(
        "postgresql://{}:{}@localhost:{}",
        temp_db.db_user(),
        temp_db.db_pass(),
        temp_db.db_port()
    );

    let pool = PgPoolOptions::new()
        .connect(&address)
        .await
        .expect("Can't connect to postgres");

    let conn = pool.acquire().await.unwrap();
    drop(conn);

    println!("starting teardown");
    pool.close().await;
    println!("pool2: size: {} idle: {}", pool.size(), pool.num_idle());
    drop(pool);
    println!("closing temp DB");
    drop(temp_db);
    println!("teardown complete");
}

@boustrophedon
Copy link
Owner

Per #10 the issue is I was dumb with how I designed the shutdown function - probably what happened was that I wrote shutdown as fn shutdown(mut self) but then I wanted to use it inside drop, which is implemented with fn drop(&mut self) so I thought "oh I'll just change the signature of shutdown". However, it's obvious in hindsight that shutdown will then be called twice.

There are a lot of options here to fix this and pretty much all of them are breaking changes (which tbh isn't that big a deal since it's not like I have a ton of users). Probably simplest solution (and also doesn't change the API) is the one you provided but give me a bit to think about it.

@nihohit
Copy link
Author

nihohit commented Jul 11, 2024 via email

@boustrophedon
Copy link
Owner

So the PR doesn't solve the hanging issue?

@nihohit
Copy link
Author

nihohit commented Jul 11, 2024

nope, it solves a different issue.

@boustrophedon
Copy link
Owner

Taking another look at this I wonder if the older version of pg in CI is different than what we have locally and maybe the behavior on sigkill is different, and a newer pg does better cleanup?

I can't get this to race locally with 1 or 100 connections.

@nihohit
Copy link
Author

nihohit commented Aug 17, 2024

I don't think it's a version mismatch - checked this with CI & local versions both 16.3.
It could be an OS difference - CI is Ubuntu, local is MacOS.

@boustrophedon
Copy link
Owner

I'm also on linux with 16.3. If you run the test in release mode does it hang in CI still?

@nihohit
Copy link
Author

nihohit commented Aug 18, 2024 via email

@boustrophedon
Copy link
Owner

The newest release, v0.5.0, uses fast shutdown instead of sending SIGKILL. Could you check and see if this issue still occurs with the newest version?

@jacobsvante
Copy link

Did you get this to work?

@nihohit
Copy link
Author

nihohit commented Feb 5, 2025

No, I tried it once, the issue seemed to have persisted, but I didn't have the time to investigate.
Since then I've moved to another team, so I probably won't have another opportunity to test this, so @boustrophedon I guess the issue can be closed.

@jacobsvante
Copy link

jacobsvante commented Feb 5, 2025

Thanks @nihohit for the quick reply. I'm receiving the same error as you originally posted, so I don't think the issue should be closed.

@boustrophedon Have you been able to get it working in Github actions?

Update: Issue persists if I change to PgTempDB::new() instead of PgTempDB::async_new().

@boustrophedon
Copy link
Owner

@jacobsvante This is still happening on pgtemp 0.5?

@jacobsvante
Copy link

Yes, it appears so.

Do you have any integration tests?

@boustrophedon
Copy link
Owner

Yes, they're in the tests/ directory. I just added the sample test from this issue so we'll see what happens: https://github.com/boustrophedon/pgtemp/actions/runs/13167858114

@boustrophedon
Copy link
Owner

boustrophedon commented Feb 5, 2025

 Running tests/shutdown.rs (target/debug/deps/shutdown-9f5c6b24a836d5ff)

running 1 test
test pgtemp_shutdown_test ... ok

The sample provided passed in CI (and then the build failed due to a linting error in the python client tests)

@jacobsvante
Copy link

The issue was that I had not read the requirements section of the readme, sorry a bit much on my plate lately...

I had forgot to add these lines:

      - name: Install postgres
        run: sudo apt-get install postgresql postgresql-client
      - name: Update path
        run: find /usr/lib/postgresql/ -type d -name "bin" >> $GITHUB_PATH

I'll see if I can find some time to add a check for this when running pgtest. Would make it less likely for stressed people like me being confused about why the tests are running for a long time and then just fail with a vague error message 😅

What is causing the issue btw?

@boustrophedon
Copy link
Owner

I don't know what would have caused your tests to hang - If I remove the postgres install and PATH lines, the tests simply error out. https://github.com/boustrophedon/pgtemp/actions/runs/13188950120/job/36817698923

@jacobsvante
Copy link

Strange. Maybe it was two tests running simultaneously. I'll see if I can find what was causing it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants