So I looked at our telemetry and I couldn’t find anything that would indicate a problem on our end.
Generally, even if some degradation is reported on https://www.githubstatus.com/, your workflow should behave the same as long as it starts successfully (during degradation you would normally see delays or problems when starting workflows, or issues streaming logs). Every run gets a fresh VM with the same pre-installed software, so once the job starts the environment is identical, and issues happening on github.com should not affect things such as deployments to AWS.
I’m not familiar with AWS at all, but our images do have a preinstalled version of the CLI. I would suggest removing the third-party action that sets it up and trying the preinstalled one (sometimes extra installs can cause conflicts). If the preinstalled one needs to be updated, you can file an issue in actions/virtual-environments, but I had a quick look and didn’t see any issues related to the AWS CLI on Ubuntu. Up-to-date info on what is pre-installed on our runners can be found here: https://help.github.com/en/actions/reference/software-installed-on-github-hosted-runners
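Something like this minimal sketch (the job name, checkout step, and secret names are assumptions on my part; `make deploy` is taken from your description):

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # No third-party aws-cli setup action; the runner image already ships one.
      - name: Check the preinstalled AWS CLI
        run: aws --version
      - name: Deploy
        run: make deploy
        env:
          # The AWS CLI reads credentials from these standard env vars;
          # this assumes you store them as repository secrets.
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
```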
Turning on step debugging can also display some extra information: https://github.com/actions/toolkit/blob/master/docs/action-debugging.md#how-to-access-step-debug-logs
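Concretely, you enable it by adding a repository secret named `ACTIONS_STEP_DEBUG` with the value `true`. You can also emit your own debug-only lines from a step, e.g.:

```yaml
steps:
  - name: Emit a debug-only message
    # This line only shows up in the logs when step debug logging is enabled.
    run: echo "::debug::about to run make deploy"
```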
It’s a bit hard to debug the `make deploy` script past this point. With our first-party actions we tend to split up the steps, so we do something like `npm install` first, then `npm test`, `npm run format`, etc., rather than doing everything in a single step (it makes the logs nicer too). Maybe that would help with debugging.
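Something along these lines (a sketch only; I’m guessing at what `make deploy` wraps, so substitute your actual commands):

```yaml
steps:
  - uses: actions/checkout@v2
  - name: Install dependencies
    run: npm install
  - name: Run tests
    run: npm test
  - name: Check formatting
    run: npm run format
  - name: Deploy
    run: make deploy
```

With this layout each step gets its own collapsible log section, so a failure points at one command instead of one big script.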
Another suggestion: if it can’t find certain files, I would double-check that the correct paths are being used. You can try the `working-directory` YAML parameter: https://help.github.com/en/actions/reference/workflow-syntax-for-github-actions#defaultsrun
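For example (the `./app` subdirectory is hypothetical, stand in your own path):

```yaml
# On a single step:
- name: Deploy
  run: make deploy
  working-directory: ./app

# Or as a default for every run step in the job:
defaults:
  run:
    working-directory: ./app
```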
It might also be worth checking the AWS CLI repo for open issues; it’s always possible others are hitting something similar: https://github.com/aws/aws-cli
It also seems a little odd that after an error such as

> The operation was canceled

the script continues, effectively wasting time. It might be waiting for something to finish, but I don’t know. If you do something like `return 1` in a shell script, it should fail immediately and the job should terminate, so I would investigate why that isn’t the case.
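A sketch of what I mean, assuming the step runs under bash:

```yaml
- name: Deploy
  # `set -e` aborts the script on the first command that exits non-zero,
  # which fails the step and terminates the job instead of letting it
  # keep running past the error. `pipefail` does the same for pipelines.
  run: |
    set -eo pipefail
    make deploy
```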
Overall though, I don’t think this is indicative of a problem on our side. If you frequently see network issues between the VM and AWS, or generic configuration problems with the VM, then we can take a look, but otherwise I don’t think there is much more we can do.