Our GitHub workflow is configured to run tests automatically on push.
Recently, I switched our test backend database on GitHub Actions from SQLite to Postgres. Since then we get errors like this, often but not every time:
psycopg2.OperationalError: could not connect to server: Connection refused
Is the server running on host "db" (172.19.0.2) and accepting
TCP/IP connections on port 5432?
Sometimes the jobs work the first time, or I can re-run the same failing job some hours later and it will succeed. The failures are intermittent.
The YAML for the GH workflow is here: failng-pg-jobs - Pastebin.com
I found a discussion here: python 3.x - Django was unable to create a connection to the 'postgres' database and will use the default database instead - Stack Overflow, where someone had a similar issue and was advised to introduce a package called wait-for-it to ensure the db was fully initialized.
I can try this, and will if I don’t find any better ideas, but I’m already using --health-cmd pg_isready (see the YAML above) and I’m not crazy about just throwing more tools for delay into the mix.
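For reference, as far as I understand it, a tool like wait-for-it mostly just polls a TCP port until something accepts connections. A minimal sketch of that idea in plain Python, so no extra package would be needed, might look like the following. The function name wait_for_port and the timeout values are hypothetical; the host "db" and port 5432 in the usage note are taken from the error message above.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: the name and default values are illustrative only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, name not resolvable yet, or connect timed out.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

This could be called as wait_for_port("db", 5432) in a step before the tests start. Note that an open port only shows that something is listening; it does not by itself prove Postgres is ready to serve queries, which is the part pg_isready is meant to cover.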
Anybody have ideas or suggestions?