-
Hi, all of a sudden one of my GitHub workflow runners is receiving the following strange error message:
The strange thing is that the other build jobs in the matrix I use are running completely fine, without any issues. See, for example: https://github.com/jens-maus/RaspberryMatic/actions/runs/38504999
As you can see, only the “ova” job is actually causing this strange “No space left on device” error, which then ends up as a CI error for the whole project. Of course, I also tried to discuss this in the issue tracker of the actions/upload-artifact action (actions/upload-artifact#9 (comment)) but haven’t received any response from the GitHub authors yet.
In addition, I somehow have the feeling that this might point to a completely different failure within the GitHub Actions framework, as only some days ago the same upload-artifact action worked fine even for the “ova” matrix build. See here: https://github.com/jens-maus/RaspberryMatic/actions/runs/36918297
So now I wonder if someone here might have an idea what could be the reason for this strange “No space left on device” error? Please note that I of course made sure that old build artifacts are deleted automatically, so I guess this error does not come from some project-wide space limitation within the artifact storage.
Any help is highly appreciated!
Replies: 13 comments
-
I have reported this issue to the appropriate engineering team on your behalf; they will evaluate and investigate it. If they have any update, I will notify you promptly, or the engineers may reply to you directly here.
-
Thanks for the reply. I just returned from my vacation and noticed that my nightly builds still fail with these strange “No space left on device” errors. See for example here: https://github.com/jens-maus/RaspberryMatic/actions/runs/43667982 And this time it seems to be unrelated to the actions/upload-artifact action. So what is the current state of the GitHub engineers' investigation into these “No space left on device” errors?
-
@brightran Also, compare the following two GitHub Actions runs on the very same commit id: the first one ran without the “No space left on device” error and the second one failed.
Succeeded without any errors: https://github.com/jens-maus/RaspberryMatic/actions/runs/40438818
Failed on the same commit id (only one day later), and I can’t see any reason why it might have failed: https://github.com/jens-maus/RaspberryMatic/actions/runs/40977055
-
Hi @jens-maus, It looks like you might be running out of disk space on the hosted machine. We guarantee 10 GB on our runners. Does it seem likely that you could be exceeding this? You could add a step that checks the free space between existing steps to get a sense of this (“df -h” would work for Linux).
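A minimal sketch of such a check, assuming a Linux runner where `df` and `awk` are available. The helper name `free_gb` and the `MIN_FREE_GB` threshold are illustrative assumptions, not part of any GitHub-provided tooling; the script body could be placed in a `run:` step between existing steps.

```shell
# Print the free space (whole GiB, rounded down) on the filesystem
# that holds the given path. `free_gb` is a hypothetical helper name.
free_gb() {
  # -P: POSIX output format (one line per filesystem), -k: 1 KiB blocks.
  # Column 4 of the second line is the available space in KiB.
  df -Pk "${1:-/}" | awk 'NR==2 { print int($4 / 1024 / 1024) }'
}

# Human-readable overview of all mounted filesystems.
df -h

# Warn when less than MIN_FREE_GB is available (10 is an assumed default).
MIN_FREE_GB="${MIN_FREE_GB:-10}"
avail="$(free_gb /)"
echo "Free space on /: ${avail} GB"
if [ "${avail}" -lt "${MIN_FREE_GB}" ]; then
  echo "WARNING: less than ${MIN_FREE_GB} GB free on /" >&2
fi
```

Running this before and after the build step shows roughly how much space the build itself consumes.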
-
Hi @elbrenn, that’s actually a good hint. In fact, my build jobs could easily occupy ~20GB of temporary disk space until the build is finished. And when I look at the disk space before my build step I can see the following:
So there seem to be around 20GB of free disk space available, right? However, the final question remains why this was previously working without any issues. Is this a somewhat new limitation? And is there a way to get more temporary disk space on the GitHub runners so that my build jobs can succeed?
-
In practice the runners have higher specs than we guarantee, to make sure we are delivering on our promises. It’s possible that a previous version had more than 20GB or that your other runs used less for some other reason. Here is a doc about the resources on our hosted runners: https://help.github.com/en/actions/reference/virtual-environments-for-github-hosted-runners#supported-runners-and-hardware-resources. Have you looked into using self-hosted runners for your workflow?
-
Thanks for the links. It’s a pity that the GitHub runners apparently do not support more than 20GB of disk space, because apart from that problem I am quite happy with GitHub Actions. Hopefully the disk space limit will be increased at some point, because I would prefer to stay with GitHub-provided runners rather than setting up self-hosted runners: my projects are small-scale open source projects where I cannot afford to have a self-hosted runner ready at all times to build nightly build archives, etc.
-
As you have seen in the docs, currently the space is less than 20 GB. If your projects really need more space on the runner, you can also report a feature request here. That will allow you to interact directly with the appropriate engineering team, and make it more convenient for them to collect and categorize your suggestions.
-
@brightran, thanks for the hint and link, I will try to create an appropriate ticket ASAP.
However, I was actually able to work around the disk space issue by ensuring that the runner OS environment is cleaned up as much as possible right before the actual build step is executed. After that cleanup the runners seem to have ~33GB of disk space available (compared to ~20GB) and thus my build jobs now proceed without any issues.
See here for the relevant workflow steps I added to clean up the whole environment:
https://github.com/jens-maus/RaspberryMatic/blob/d5044bef3307bc61166377c162569de1a61cf332/.github/workflows/ci.yml#L34-L40
This includes:
All these cleanup steps seem to free about 13GB of disk space, which is then enough for my build jobs to finish correctly.
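A hedged sketch of what a pre-build cleanup of this kind could look like. The command list below is an assumption based on typical candidates discussed in this thread (swap, package cache, Docker images); it is not a copy of the steps in the linked workflow, and the `/swapfile` path in particular is assumed and may differ on a given runner image. The names `cleanup_commands`, `run_cleanup` and `DRY_RUN` are hypothetical.

```shell
# Assumed cleanup candidates for a Linux runner, one command per line.
# Adjust this list to whatever the job really does not need.
cleanup_commands() {
  cat <<'EOF'
sudo swapoff -a
sudo rm -f /swapfile
sudo apt-get clean
docker image prune --all --force
EOF
}

# Execute each command, or only print it when DRY_RUN=1 (the default here,
# so that nothing is deleted by accident).
run_cleanup() {
  cleanup_commands | while IFS= read -r cmd; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would run: ${cmd}"
    else
      echo "+ ${cmd}"
      # A failing cleanup command should not fail the whole job.
      sh -c "${cmd}" </dev/null || echo "failed, continuing: ${cmd}" >&2
    fi
  done
}

df -h /       # free space before the cleanup
run_cleanup
df -h /       # free space after the cleanup
```

Setting `DRY_RUN=0` in the workflow step would actually execute the commands; comparing the two `df -h /` outputs then shows how much space was reclaimed.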
-
You don’t have to remove Docker. Without removing Docker, maybe the following is a good solution:
-
If you want to keep swap and only need Docker and some basics, you may want to remove a bunch of packages.
-
Thanks for posting the solutions in this thread. I just hit this issue today. It manifested as a stalled runner, with no output in the web interface. The only way to determine that something was wrong was when I went to reboot the self-hosted runner (SHR) and it didn’t start the runner service. When I ran I solved got here by way of this commit. I just ran those commands manually and rebooted, and the runner went into idle status again. I’m curious what kind of cache performance hit is likely if these were run at each build.