Random IO failures during nightly actions

We’re seeing random IO failures in our GitHub Actions workflows on both Windows and Linux containers each night. This is a four-job matrix (JDK 8 and 11 on Windows and Linux). It isn’t always the same job that fails, and sometimes two jobs fail. However, pretty much every night a run dies because some IO operation fails.

https://github.com/MegaMek/megameklab/actions/runs/108718941
https://github.com/MegaMek/megamek/actions/runs/108710481
https://github.com/MegaMek/mekhq/actions/runs/104131369

Plenty more examples if needed.
Ideas?

Hi @sixlettervariables,

Thank you for reaching out!

The failures always occur during the Gradle build, so I recommend raising a ticket in the Gradle issue list for confirmation.

Meanwhile, you can try restarting the workflow. Please see my answer on how to automatically restart a failed workflow.

Regards.

What issue are you seeing with Gradle specifically? Reviewing the logs, I see failures to copy files and failures to delete files. That points to some limitation or problem with the underlying container image.

If I post on the Gradle tracker that my build dies sometimes due to random IO failures, they’re going to ask me to evaluate my underlying hardware.

Hi @sixlettervariables,

Thanks for your quick reply! I can see the IO errors in your workflow logs. Since the issue is intermittent, I assume it could be related either to Gradle or to the hosted runner, which you can confirm in the GitHub virtual environment issue list.

For now, restarting the workflow is the way to work around the error.
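
The same retry idea can also be applied at the level of a single file operation rather than the whole workflow. The following is only a rough sketch in Python, not something taken from the failing builds: `retry_io`, `attempts`, and `delay_seconds` are hypothetical names, and it assumes the failing copy or delete step surfaces as an `OSError` that can succeed on a later attempt.

```python
import time


def retry_io(operation, attempts=3, delay_seconds=0.5):
    """Run operation(); on OSError, wait and try again.

    Hypothetical helper for illustration only. It gives up and
    re-raises the last error after `attempts` failed tries.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                time.sleep(delay_seconds * attempt)
    raise last_error
```

A call such as `retry_io(lambda: shutil.copy(src, dst))` would wrap one copy step (with `shutil` imported and `src` and `dst` defined by the caller). This only masks a transient failure and does not explain why the IO operation failed in the first place.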

Regards.

Random IO failures across different actions are not related to Gradle. The main MegaMek repo executes identical Gradle actions and never fails in that manner. We have a substantially similar workflow that executes on every PR and never has those failures either.

This is related to GitHub Actions, perhaps the runner as you suggest. I’ve opened a bug report with them.

At this point there is not an answer, so we’ll hold this open until an answer is available.