Copilot Lvl 2
Message 1 of 5

Random IO failures during nightly actions

We're seeing random IO failures in our nightly actions on both Windows and Linux containers. This is a four-job matrix (JDK 8 and 11 on Windows and Linux). It isn't always the same job that fails, and sometimes two jobs fail. However, pretty much every night the run dies because some IO operation fails.

https://github.com/MegaMek/megameklab/actions/runs/108718941
https://github.com/MegaMek/megamek/actions/runs/108710481
https://github.com/MegaMek/mekhq/actions/runs/104131369

Plenty more examples if needed.
Ideas?

4 Replies
GitHub Partner
Message 2 of 5

Re: Random IO failures during nightly actions

Hi @sixlettervariables,

Thank you for reaching out!

The failures always occur during the Gradle build, so I recommend raising a ticket in the Gradle issue list for confirmation.

In the meantime, you can try restarting the workflow. Please check my answer about how to automatically restart a failed workflow.

Regards.

Copilot Lvl 2
Message 3 of 5

Re: Random IO failures during nightly actions

What issue are you seeing with Gradle specifically? Reviewing the logs, I see failures to copy files and failures to delete files. That points to some limitation of, or problem with, the underlying container image.

If I post on the Gradle tracker that my build dies sometimes due to random IO failures, they're going to ask me to evaluate my underlying hardware.
GitHub Partner
Message 4 of 5

Re: Random IO failures during nightly actions

Hi @sixlettervariables,

Thanks for your quick reply! I see the IO errors in your workflow log. Since the issue is intermittent, I assume it could be related either to Gradle or to the hosted runner, which you can confirm in the GitHub virtual environment issue list.

For now, restarting the workflow is the way to get past the error.

Regards.

Copilot Lvl 2
Message 5 of 5

Re: Random IO failures during nightly actions

Random IO failures across different actions are not related to Gradle. The main MegaMek repo executes identical Gradle actions and never fails in that manner. We also have a substantially similar workflow that executes on every PR, and it never has those failures either.

This is related to GitHub Actions, perhaps the runner as you suggest. I've opened a bug report with them.

There is no answer yet, so we'll keep this thread open until one is available.
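
For anyone who lands here with similar intermittent copy or delete failures: until the root cause is known, one possible stopgap is to retry the failing file operation a few times before giving up, rather than restarting the whole workflow. The following is only a minimal sketch in Python, not something taken from the MegaMek builds. The names `retry_io`, `copy_with_retry`, and `delete_with_retry`, along with the attempt count and delay, are hypothetical choices for illustration.

```python
import shutil
import time
from pathlib import Path


def retry_io(operation, attempts=3, delay_seconds=0.5):
    """Call operation(); on OSError, wait and try again, up to `attempts` calls in total."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Linear backoff: wait a little longer after each failed attempt.
            time.sleep(delay_seconds * attempt)


def copy_with_retry(src, dst, attempts=3, delay_seconds=0.5):
    """Copy a file, retrying if the copy raises a (possibly transient) OSError."""
    return retry_io(lambda: shutil.copy2(src, dst), attempts, delay_seconds)


def delete_with_retry(path, attempts=3, delay_seconds=0.5):
    """Delete a file, retrying if the delete raises a (possibly transient) OSError."""
    return retry_io(lambda: Path(path).unlink(), attempts, delay_seconds)
```

A retry like this only hides the symptom. If the failures come from the hosted runner, as suspected above, the bug report is still the real fix.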