ImagePullBackOff error pulling from GitHub Packages on Azure AKS

I am deploying my services to Azure AKS and am running into an ImagePullBackOff error. Here is some context.
I have two node pools: one created with the --enable-node-public-ip option and one without it. I am trying to deploy a DaemonSet. The container image is hosted on the GitHub Package Registry. The nodes that do not have a public IP enabled pull the image successfully, whereas the nodes that do have a public IP enabled fail.
Here is the error:
Failed to pull image "docker.pkg.github.com/xyz": rpc error: code = NotFound desc = failed to pull and unpack image "docker.pkg.github.com/xyz"
I would appreciate help on this.

ImagePullBackOff and ErrImagePull indicate that the image used by a container cannot be pulled from the image registry. Make sure there is no typo in the image definition.
First try docker pull docker.pkg.github.com/OWNER/REPOSITORY/IMAGE_NAME:TAG_NAME by hand. If that works, use the same full reference, docker.pkg.github.com/OWNER/REPOSITORY/IMAGE_NAME:TAG_NAME, in the DaemonSet definition.
It is not an authentication issue.
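A quick way to catch a typo before deploying is to check the shape of the reference. The following is a minimal sketch, assuming the OWNER/REPOSITORY/IMAGE_NAME:TAG_NAME layout quoted above. The helper name is_github_pkg_ref is hypothetical, and the check only validates the format, not whether the image actually exists.

```shell
# Hypothetical helper: succeeds only if the reference looks like
# docker.pkg.github.com/OWNER/REPOSITORY/IMAGE_NAME:TAG_NAME.
# It does not contact the registry.
is_github_pkg_ref() {
  printf '%s\n' "$1" |
    grep -Eq '^docker\.pkg\.github\.com/[^/:]+/[^/:]+/[^/:]+:[^/:]+$'
}

# Usage:
#   is_github_pkg_ref "docker.pkg.github.com/owner/repo/image:1.0" && echo "shape ok"
```

A reference such as docker.pkg.github.com/xyz, as it appears in the error message, would fail this check because the repository, image name, and tag parts are missing.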

Related

ImagePullBackOff: minikube k8s

I'm trying to add Spark to minikube, using this blog post as a tutorial: https://medium.com/#rewelle/d%C3%A9ploiement-dune-architecture-compl%C3%A8te-big-data-avec-kubernetes-570eaa0e627
I get ImagePullBackOff as the status when the pod spark-standalone-worker-1 is added.
This is what I get when I run kubectl describe pods spark-standalone-worker-1:
The status ImagePullBackOff means that a Pod couldn’t start, because Kubernetes couldn’t pull a container image. The ‘BackOff’ part means that Kubernetes will keep trying to pull the image, with an increasing delay (‘back-off’).
Hitting a registry rate limit has been a very common reason for ImagePullBackOff since Docker introduced rate limits on Docker Hub. Once you reach your download limit on Docker Hub, you are blocked, and this can cause the ImagePullBackOff error. You'll either need to sign in with an account or find another place to get your image from.
This error can also happen if your registry uses SSL/TLS but its certificate is not trusted. Make sure you follow the registry's instructions for setting up TLS.
Failed to pull image ... authentication required
Either you need to create a Secret for the Docker registry, or you have been temporarily rate-limited from pulling images.
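If the registry does need credentials, the Secret also has to be referenced by the pod or by its service account. The following is a minimal sketch, assuming a docker-registry Secret already exists in the pod's namespace. The helper name pull_secret_patch and the secret name regcred are placeholders, not names from the posts above. The function only builds the JSON patch; the kubectl call is shown as a comment.

```shell
# Build the JSON patch that lists an image pull secret on a service account.
pull_secret_patch() {
  printf '{"imagePullSecrets": [{"name": "%s"}]}\n' "$1"
}

# Usage (assumes a Secret named "regcred" exists in the current namespace):
#   kubectl patch serviceaccount default -p "$(pull_secret_patch regcred)"
```

Pods created after the patch pick up the secret; existing pods need to be recreated.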

Failed to pull image - Azure AKS

I have been following this guide to deploy an application on Azure using AKS.
Everything was fine until I deployed: one node is in a not-ready state with ImagePullBackOff status.
kubectl describe pod output
Running the command below succeeds, so I am sure authentication is working:
az acr login --name siddacr
and this command lists the image that was uploaded:
az acr repository list --name <acrName> --output table
I figured it out.
The error was in the name of the image in the deployment.yml file.
ImagePullBackOff can be caused by any of the following:
The image or tag doesn’t exist
You’ve made a typo in the image name or tag
The image registry requires authentication
You’ve exceeded a rate or download limit on the registry
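The causes listed above can often be told apart from the message that kubectl describe pod reports. The following is a rough sketch, assuming typical wording from container runtimes and registries; the function name classify_pull_error and the category labels are hypothetical, and real messages vary, so treat the result as a hint rather than a diagnosis.

```shell
# Hypothetical helper: map a pull error message to a likely cause.
# The substrings are common examples, not an exhaustive list.
classify_pull_error() {
  case $1 in
    *"authentication required"*|*unauthorized*) echo "auth" ;;
    *toomanyrequests*|*"rate limit"*)           echo "rate-limit" ;;
    *x509*)                                     echo "tls" ;;
    *NotFound*|*"not found"*|*"manifest unknown"*) echo "name-or-tag" ;;
    *)                                          echo "unknown" ;;
  esac
}

# Usage:
#   classify_pull_error "unauthorized: authentication required"
```

A name-or-tag result points at a missing image, a missing tag, or a typo, which was the cause in this thread.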

Pull images from an Azure container registry to a Kubernetes cluster

I followed this tutorial (microsoft_website) to pull images from an Azure container registry. My YAML successfully creates a pod job that can pull the image, but only when it runs on the agentpool node in my cluster.
For example, adding nodeName: aks-agentpool-33515997-vmss000000 to the YAML works fine, but with a different node name, e.g. nodeName: aks-cpu1-33515997-vmss000000, the pod fails. The error message I get from describe pods is Failed to pull image, followed by kubelet Error: ErrImagePull.
What am I missing?
Create secret:
kubectl create secret docker-registry <secret-name> \
--docker-server=<container-registry-name>.azurecr.io \
--docker-username=<service-principal-ID> \
--docker-password=<service-principal-password>
As @user1571823 said, the solution is to delete the old image from the ACR and build and push a new one.
The problem was related to some sort of corruption in the image stored in the Azure Container Registry (ACR). One agent pool appeared to pull the image only because the image already existed on its VM.
As @andov said, it is also a good option to open a support case with Azure support for AKS from the subscription where AKS is deployed. The support team has full access to the AKS service backend and can tell you exactly what caused the problem.
Four things to check:
Is it a subscription issue? Are the nodes in different subscriptions?
Is it a rights issue? Does the service principal of the node have rights to pull the image?
Is it a network issue? Are the nodes on different subnets?
Is there something about the image size or configuration that means it cannot run on the other cluster?
Edit
New-AzAksNodePool has a parameter -DefaultProfile, which can be an AzContext, AzureRmContext, or AzureCredential.
If this differs between your nodes, it would explain the error.

Azure App Service not picking up GitLab Container Registry configuration as private repository

I have an application container pushed to a GitLab container registry. I am trying to deploy it to Azure Web App Service as a container. I did the configuration as best I could based on the Azure documentation, but I don't understand what I am missing, because the Azure logs show that Azure is still trying to connect to the Docker Hub registry.
In the logs I get the following
2019-05-13 09:21:49.741 ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://registry-1.docker.io/v2/library/<image-name>/manifests/latest: unauthorized: incorrect username or password"}
2019-05-13 09:21:49.743 ERROR - Pulling docker image <image-name> failed:
2019-05-13 09:21:49.743 INFO - Pulling image from Docker hub: library/<image-name>
2019-05-13 09:21:50.795 ERROR - DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for <image-name>, repository does not exist or may require 'docker login'"}
2019-05-13 09:21:50.797 ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
Can anyone tell me what I might be doing wrong here? I believe the problem is the registry url config. Any help would be appreciated.
For anyone else who faces the same problem: I was giving only the image name. I had to use the full name including the image registry, registry.gitlab.com/<group name>/<image name>.
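This matches the log line "Pulling image from Docker hub: library/<image-name>": a reference without a registry host is resolved against Docker Hub. The following is a simplified sketch of that defaulting rule, under the assumption that only the host and the library/ prefix matter here (tags and digests are passed through untouched). The function name resolve_image is hypothetical.

```shell
# Hypothetical helper: show where a bare image reference would be pulled from.
# A first path component containing "." or ":" (or equal to "localhost")
# is treated as a registry host; otherwise Docker Hub is assumed.
resolve_image() {
  ref=$1
  case $ref in
    */*)
      first=${ref%%/*}
      case $first in
        *.*|*:*|localhost) printf '%s\n' "$ref" ;;
        *) printf 'docker.io/%s\n' "$ref" ;;
      esac
      ;;
    *) printf 'docker.io/library/%s\n' "$ref" ;;
  esac
}

# Usage:
#   resolve_image myapp
#   resolve_image registry.gitlab.com/group/myapp
```

With only the image name, the reference resolves to Docker Hub's library namespace; with the registry.gitlab.com prefix, it stays on the GitLab registry.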
Since it took me a while to piece this together, here is what helped me most. (Thanks @Hassaan for the answer that pointed me in the right direction.)
GitLab
You need to be aware of your container registry located in GitLab.
Check image naming conventions for that.
If you are using GitLab Cloud, the registry URL is: https://registry.gitlab.com
Once you are logged in to GitLab, open your project, navigate to "Packages & Registries", and click on "Container Registry".
If you have already published an image, it will be shown in the list there. What you see there is what I will call the "full-image-name".
It will most likely be <namespace>/<project>/<image-name>.
We need that full-image-name later. Click on the image in the list to see its image tags. You will have to select a tag to use later.
If you did not publish an image to that registry yet, check the docs to get started.
Azure Web App
Navigate to your App Service and find "Container settings (Classic)" in the left-side menu.
Then fill in the required data:
Server URL: https://registry.gitlab.com
Full Image Name and Tag: <full-image-name>:<tag> (see above)
If you are not sure what to enter for the login and password, have a look at how to authenticate with the container registry.

Unable to pull image from Azure Container Registry

We recently had an issue with our Azure Kubernetes cluster not reporting any data through the Azure Portal. To fix this, I upgraded Kubernetes to the latest version, as was recommended on GitHub. After the upgrade completed, we were able to view logs and monitoring data through the portal, but one of the container images stored in our Azure Container Registry cannot be pulled by the Kubernetes cluster.
The error I see on the Kubernetes management page is:
Failed to pull image "myacr.azurecr.io/container:190305.191": [rpc error: code = Unknown desc = Error response from daemon: Get https://myacr.azurecr.io/v2/mycontainer/manifests/190305.191: unauthorized: authentication required, rpc error: code = Unknown desc = Error response from daemon: Get https://myacr.azurecr.io/v2/mycontainer/manifests/190305.191: unauthorized: authentication required]
My original setup used the first script provided in this document and it worked correctly without issue. Once I started receiving the error, I ran it again just to make sure.
Once I saw that it had failed, I deleted the account from the permissions on both the ACR and the AKS cluster. Again, it failed to pull the image.
After that, I tried the second method, creating a Kubernetes secret, and received the same error.
At this point, I'm unsure what else to check. I've verified that I can run docker pull on my machine and pull the image, but there seems to be a breakdown between AKS and ACR that I cannot sort out.
It's been a while since I originally posted this, but I did stumble across a currently stable solution to the problem.
The service principal, for whatever reason, is not able to maintain a connection to the ACR. So if your cluster ever goes down, you lose the ability to pull from the ACR. I had this happen multiple times over the last year and as I moved more of my Kubernetes deployment to Azure, it became a bigger and bigger issue.
I stumbled across this Microsoft doc and noticed the mention of the --attach-acr flag.
This is what the full command looks like:
az aks create -n myAKSCluster -g myResourceGroup --generate-ssh-keys --attach-acr $MYACR
Since setting the cluster up with that flag, I have had no issues with it.
knock on wood
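For a cluster that already exists, the same integration can be added with az aks update instead of recreating the cluster. The following is a sketch, assuming the Azure CLI is installed and logged in. The run and attach_acr helpers are hypothetical wrappers, and the cluster, resource group, and registry names are placeholders in the style of the command above.

```shell
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Attach a container registry to an existing AKS cluster.
# Arguments: cluster name, resource group, ACR name.
attach_acr() {
  run az aks update -n "$1" -g "$2" --attach-acr "$3"
}

# Usage (prints the command without running it):
#   DRY_RUN=1 attach_acr myAKSCluster myResourceGroup myacr
```

Running with DRY_RUN set first makes it easy to review the exact command before it touches the cluster.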
