Hi, I need some help, please.
I have been pushing Docker images to GitHub Packages and pulling them from my Kubernetes cluster for a couple of months now, and everything worked fine.
However, for about the last 24 hours (starting around Thursday, 10 December 2020) the cluster has been pulling images from GitHub Packages with random success. The pull succeeds roughly 70% of the time, but the rest of the time it fails like this:
[Pulling] Pulling image "docker.pkg.github.com/account/my/image:latest"
[Failed] Failed to pull image "docker.pkg.github.com/account/my/image:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://docker.pkg.github.com/v2/docker.pkg.github.com/account/my/image/manifests/latest: EOF
[Failed] Error: ErrImagePull
[Pulled] Container image "jenkins/inbound-agent:4.3-4" already present on machine
[Created] Created container jnlp
[Started] Started container jnlp
[Failed] Error: ImagePullBackOff
[BackOff] Back-off pulling image "docker.pkg.github.com/account/my/image:latest"
[Failed] Error: ImagePullBackOff
ERROR: Unable to pull Docker image "docker.pkg.github.com/account/my/image:latest". Check if image tag name is spelled correctly.
[Pipeline] // node
[BackOff] Back-off pulling image "docker.pkg.github.com/account/my/image:latest"
But then I run the same job again and it gets the image just fine.
I checked the Kubernetes nodes in case a single node in the cluster was failing, but I noticed that a job can fail on one node and then succeed a couple of minutes later on the same node, in the very next run.
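For anyone who wants to repeat the per-node check, here is a rough sketch of how it could be scripted. It assumes the events have been exported with something like `kubectl get events --all-namespaces -o json` and that each event carries the node name in `source.host`, which is what kubelet events normally do; the function name `failures_per_node` is just illustrative.

```python
import json
import sys
from collections import Counter


def failures_per_node(event_list):
    """Count 'Failed to pull image' events per node.

    event_list is the parsed JSON document produced by
    `kubectl get events -o json` (assumed shape: {"items": [...]}).
    """
    counts = Counter()
    for event in event_list.get("items", []):
        message = event.get("message") or ""
        if "Failed to pull image" not in message:
            continue
        # Assumption: kubelet events store the node name in source.host.
        node = (event.get("source") or {}).get("host") or "unknown"
        counts[node] += 1
    return counts


# Possible usage (not executed here):
#   kubectl get events --all-namespaces -o json > events.json
#   with open("events.json") as fh:
#       print(failures_per_node(json.load(fh)))
```

If the failures are spread evenly across nodes, as they were in my case, that points away from a single misconfigured node.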
I also replaced the nodes with new ones in case there was some misconfiguration, but the result was the same.
Note that the image already exists on the nodes, so Docker only needs to check whether a new version is available. It doesn’t really need to pull new images… but it still fails.
Also note that in the lines above it does check the image jenkins/inbound-agent:4.3-4 from Docker Hub and that step succeeds, so I don’t think the problem is the Docker agent. In addition, when the nodes restarted they all set up their images from Docker Hub without any problem. So Docker Hub works fine.
The only conclusion I can reach is that something is wrong with GitHub Packages. Something that was working fine does not start failing with the errors above out of nowhere, failing one second and succeeding the next. Also, all push and pull commands to and from Docker Hub are OK, so why does the problem appear only with GitHub Packages?
So please tell me how we can debug this. The GitHub Status page says GitHub Packages is working fine, but these sudden failures are causing us (me) a lot of trouble.
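One thing that might help narrow it down is to probe the registry endpoint repeatedly from a node and tally how often the connection is dropped. Below is a minimal sketch, not a definitive test: the function names `probe_once` and `probe_many` are made up for illustration, and it deliberately sends no credentials. Any HTTP answer (even a 401) is counted as "the server responded", while a dropped connection or timeout is counted under its exception name, which is the case that would correspond to the EOF in the log above.

```python
import time
import urllib.error
import urllib.request
from collections import Counter


def probe_once(url, timeout=10.0):
    """Classify one GET request against url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"http-{resp.status}"
    except urllib.error.HTTPError as err:
        # The server answered (e.g. 401 without credentials),
        # so the connection itself was fine.
        return f"http-{err.code}"
    except Exception as err:
        # Connection closed early, reset, timeout, DNS failure, etc.
        return type(err).__name__


def probe_many(url, attempts=20, delay=1.0, probe=probe_once):
    """Run probe(url) several times and tally the outcomes."""
    results = Counter()
    for _ in range(attempts):
        results[probe(url)] += 1
        time.sleep(delay)
    return results


# Possible usage (not executed here):
#   print(probe_many("https://docker.pkg.github.com/v2/"))
```

A tally that mixes HTTP responses with connection errors from the same node within a few seconds would support the idea that the intermittent failure is on the registry side or somewhere on the network path, rather than in the cluster configuration.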