Azure Container Instances cannot access to docker images - azure

I just recently had a problem when trying to deploy the new docker images to Azure Container Instances. The Azure portal return this message when I was trying to restart the container groups:
'BadRequest':'InaccessibleImage':'The image 'xxxx' in container group 'xxx' is not accessible. Please check the image and registry credential.
The same error showed up when I used the "az container create" command. I am sure that the credential is correct and everything worked fine a couple of days ago.
I am not sure if there is a connection problem from Azure to Docker or if there are any breaking changes in Azure Container Instances. I would greatly appreciate it if someone could help me out.

I've been having the same problem for the last 6 hours as well. I believe there is an issue with Azure but I do not know how to verify this at the moment.

Related

Azure App service Keeps pulling docker image from docker hub

I have a azure app service to host a docker image from out Azure Container Registry.
The full process is as follow:
Run Pipeline
Run Release pipeline
Azure app pulls the latest release from azure container registry
But what happen is that after Each realise, for some reason, the app service tries to pull the image from Docker Hubinstead of pulling from azure Container Registry.
Can somebody help to understand where is the issue here?
For your issue, I can guess the problem you made, you must set the image with the tag as, for example, nginx:latest. But if you push the image in the ACR and need to pull it from the ACR, you must set the image with the tag as myacr.azurecr.io/nginx:latest. In addition, you also need to configure the credential for your ACR.

Pull images from an Azure container registry to a Kubernetes cluster

I have followed this tutorial microsoft_website to pull images from an azure container. My yaml successfully creates a pod job, which can pull the image, BUT only when it runs on the agentpool node in my cluster.
For example, adding nodeName: aks-agentpool-33515997-vmss000000 to the yamlworks fine, but specifying a different node name, e.g. nodeName: aks-cpu1-33515997-vmss000000, the pod fails. The error message I get with describe pods is Failed to pull image and then kubelet Error: ErrImagePull.
What I'm missing?
Create secret:
kubectl create secret docker-registry <secret-name> \
--docker-server=<container-registry-name>.azurecr.io \
--docker-username=<service-principal-ID> \
--docker-password=<service-principal-password>
As #user1571823 told solution to the problem is deleting the old image from the acr and creating/pushing a new one.
The problem was related to some sort of corruption in the image saved in the azure container registry (acr). The reason why one agent pool could pulled the image was actually because the image already existed in the VM.
Henceforth as #andov said it is good option to open an incident case to Azure support for AKS from your subscription, where AKS is deployed. The support team has full access to the AKS service backend and they can tell exactly what was causing your problem.
Four things to check:
Is it a subscription issue? Are the nodes in different subscriptions?
Is it a rights issue? Does the service principle of the node have rights to pull the image.
Is it a network issue? Are the nodes on different subnets?
Is there something with the image size or configuration, that means that it cannot run on the other cluster.
Edit
New-AzAksNodePool has a parameter -DefaultProfile
It can be AzContext, AzureRmContext, AzureCredential
If this is different between your nodes it would explain the error

Docker fails to pull the image from within Azure App Service

The Container Setting on the App Service it self look solid:
But the log pane shows errors:
2020-02-11 06:31:40.621 ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
2020-02-11 06:31:41.240 INFO - Stoping site app505-dfpg-qa2-web-eastus2-gateway-apsvc because it failed during startup.
2020-02-11 06:36:05.546 INFO - Starting container for site
2020-02-11 06:36:05.551 INFO - docker run -d -p 9621:8081 --name app505-dfpg-qa2-web-eastus2-gateway-apsvc_0_a9c8277e_msiProxy -e WEBSITE_SITE_NAME=app505-dfpg-qa2-web-eastus2-gateway-apsvc -e WEBSITE_AUTH_ENABLED=False -e WEBSITE_ROLE_INSTANCE_ID=0 -e WEBSITE_HOSTNAME=app505-dfpg-qa2-web-eastus2-gateway-apsvc.azurewebsites.net -e WEBSITE_INSTANCE_ID=7d18d5957d129d3dc3a25d7a2c85147ef57f1a6b93910c50eb850417ab59dc56 appsvc/msitokenservice:1904260237
2020-02-11 06:36:05.552 INFO - Logging is not enabled for this container.
Please use https://aka.ms/linux-diagnostics to enable logging to see container logs here.
2020-02-11 06:36:17.766 INFO - Pulling image: a...cr/gateway:1.0.20042.2
2020-02-11 06:36:17.922 ERROR - DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for a...cr/gateway, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
2020-02-11 06:36:17.923 ERROR - Pulling docker image a...cr/gateway:1.0.20042.2 failed:
2020-02-11 06:36:17.923 INFO - Pulling image from Docker hub: a...cr/gateway:1.0.20042.2
2020-02-11 06:36:18.092 ERROR - DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for a...cr/gateway, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
2020-02-11 06:36:18.094 ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
2020-02-11 06:36:19.062 INFO - Stoping site app505-dfpg-qa2-web-eastus2-gateway-apsvc because it failed during startup.
The Service Principal used to deploy the App Service has AcrPush access to the parent resource group of the container registry:
The setting are present:
I did az login with that service principal and then tried az acr login to the registry. It works fine. So what am I missing here?
EDIT 1
I know the credentials are correct, because I tested them like this:
Where I just copied the values from the app service configuration and pasted on the console. docker has no problem logging in.
It must be something else.
EDIT 2
However, I also get this:
C:\Dayforce\fintech [shelve/terraform ≡]> docker pull a...r/gateway
Using default tag: latest
Error response from daemon: pull access denied for a...r/gateway, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
So, I can login, but not pull. Very strange, because the account is configured to have AcrPush access to the container, which includes AcrPull:
EDIT 3
I was able to pull successfully when using the FQDN for the registry:
I updated the pipeline, but I still get the same errors:
2020-02-11 16:03:50.227 ERROR - Pulling docker image a...r.azurecr.io/gateway:1.0.20042.2 failed:
2020-02-11 16:03:50.228 INFO - Pulling image from Docker hub: a...r.azurecr.io/gateway:1.0.20042.2
2020-02-11 16:03:50.266 ERROR - DockerApiException: Docker API responded with status code=InternalServerError, response={"message":"Get https://a...r.azurecr.io/v2/gateway/manifests/1.0.20042.2: unauthorized: authentication required"}
2020-02-11 16:03:50.269 ERROR - Image pull failed: Verify docker image configuration and credentials (if using private repository)
2020-02-11 16:03:50.853 INFO - Stoping site app505-dfpg-qa2-web-eastus2-gateway-apsvc because it failed during startup.
EDIT 4
The only way that I found working was to enable the Admin User on the ACR and pass its credentials in the DOCKER_... variables instead of credentials of the Service Principal.
This is frustrating, I know the Service Principal can login and pull when ran locally, it is a mystery why it does not work for docker running on an App Service Host. We have another team here which faced the same issue and they have not found any solution, but enable the Admin User.
EDIT 5
The entire process runs as part of the Azure DevOps on-prem release pipeline using a dedicated Service Principal. Let me call it Pod Deploy Service Principal or just SP for short.
Let DOCKER_xyz denote the three app settings controlling the docker running on the App Service host:
DOCKER_REGISTRY_SERVER_URL
DOCKER_REGISTRY_SERVER_USERNAME
DOCKER_REGISTRY_SERVER_PASSWORD
I think we need to distinguish two parts here:
App Service needs to talk to the ACR in order to pull from it the details about the image and present them in this GUI - For that to work, the SP must have the AcrPull role in the ACR. Failure to do so results in the GUI presenting a spinning icon for the Image and Tag rows. I stumbled on it before - How to configure an Azure app service to pull images from an ACR with terraform? Now the answer to that question suggests that I have to assign the AcrPull role and set the DOCKER_xyz app settings. I think that the DOCKER_xyz app settings are not for that, but for the second part.
It seems to me that when an App Service is started, the host uses docker to actually pull the right image from the ACR. This part seems to be detached from (1). For it to work, the app settings must have the DOCKER_xyz app settings.
My problem is that part (1) works great, but part (2) does not even if DOCKER_xyz app settings specify the credentials of the SP from part (1). The only way I could make it work if I point DOCKER_xyz at the Admin User of the ACR.
But that why on Earth the DOCKER_xyz app settings cannot point to the pipeline SP, which was good enough for the part (1)?
EDIT 6
The current state of affairs is this. Azure App Service is unable to communicate with an ACR except using ACR admin user and password. So, even if the docker runtime running on the App Service host machine may know how to login using any service principal, the App Service would not use any identity or Service Principal to read metadata from the ACR - only admin user and password. The relevant references are:
https://feedback.azure.com/forums/169385-web-apps/suggestions/36145444-web-app-for-containers-acr-access-requires-admin#%7btoggle_previous_statuses%7d
https://github.com/MicrosoftDocs/azure-docs/issues/49186
On a personal note I find it amazing that Microsoft recommends not to use ACR admin user, yet a very core piece of their offering, namely Azure App Service, depends on it being enable. Makes me wonder whether different teams in Microsoft are aware of what others are doing or not doing...
App service started pulling after doing these steps for me. :D
Enable Admin Access in Azure Container Registry
In the App service configuration, provide container registry admin credentials
DOCKER_REGISTRY_SERVER_PASSWORD(admin enabled password),
DOCKER_REGISTRY_SERVER_USERNAME(crxxxxxx),
DOCKER_REGISTRY_SERVER_URL (https://crxxxxxx.azurecr.io)
Go to your app service and select identity section on the left, and click on system assigned - change status to On.
Now go to IAM Control container registry, add ACR pull role to App Service system assigned identity enabled on step 3.
Restart your App Service and wait .Changes will take few minutes to reflect so refresh your logs. (10 minutes or more)
Good luck :)
After a lot of research I figured out a way to resolve this without enabling Admin user
Create an app registration using Azure Active Directory and store the secret somewhere.
Go to the Azure container registry and add role assignment to this newly created app with permissions of AcrPush (which also contains AcrPull).
In the App service configuration, replace the variables .
DOCKER_REGISTRY_SERVER_PASSWORD with Client Secret of app registration which was saved in the first step
DOCKER_REGISTRY_SERVER_USERNAME with client Id of App registration
This should solve the Docker Api exception.
It's baffling that this is not mentioned in any Azure Container Registry documentation. Although I think it is mentioned somewhere in AAD documentation indirectly 😐.
From the message I got of the talk, let me solve your puzzle about the error.
I guess you deploy the image in ACR to the Web App through the Azure portal. When you use the Azure portal to deploy the Web App from the ACR, it only lets you select the ACR and image and tag, but do not let you set the credential. In this way, Azure will set it itself with the admin user and password if you enable the admin user. If you do not enable it, the error you got happens.
And if you want to use the service principal, I recommend you use the other tools, such as Azure CLI. Then you can set the docker registry credential yourself with the command az webapp config container set.
Here is the example and it works fine on my side:
With the Azure CLI, you can follow the steps here.
Update:
Here are the screenshots of the test on my side:
Found the answer by setting "acrUseManagedIdentityCreds" to True. The second command in this comment: https://stackoverflow.com/a/69120462/17430834
Edit 1: Adding the command
Here is the command that you will need to run to make this change.
az resource update --ids /subscriptions/<subscription-id>/resourceGroups/myResourceGroup/providers/Microsoft.Web/sites/<app-name>/config/web --set properties.acrUseManagedIdentityCreds=True
I was trying to do the same from Azure DevOps pipelines and got the same problem.
I didn't find out how to make it work using the ACR name, but it works if you use your_acr_name.azurecr.io instead.
If you go to the Access Keys page of your ACR you will find two values
Registry name: MyCoolRegistry (doesn't work if you use this one)
Login server: mycoolregistry.azurecr.io
The login server is working - just put it as the containerRegistry in your Pipeline without creating a service connection.
Just in case someone is struggling with that one.
Just to add to mark's amazing job of working it all through and for the fast readers: for everything to work, one of course also has to enable the admin user (who by default is disabled). For example by issuing:
az acr update -n <your-azureregistry-name> --admin-enabled true
on the console.
I experienced this same issue when trying to deploy an Docker application to Azure Web Apps for containers.
When I deployed the application I will get the error:
DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for a..my-repo/image, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}.
Here's how I solved it:
The issue was that I was not specifying the full path to the image. I was supposed to include my-registry-url in the docker image-name. That is instead of just image-name I was supposed to use my-registry-url/image-name, since I am trying to pull from a private repository.
So say these are variables:
docker image name is promiseapp
docker-registry_url is promisecicdregistry.azurecr.io
resource-group is dockerprojects
app-service-plan is dockerlinuxprojects
azure-web-app name is promiseapptest
docker-registry-user is test-user
docker-registry-password is 12345678
Then my command will be:
az webapp create --resource-group dockerprojects --plan dockerlinuxprojects --name promiseapptest --deployment-container-image-name promisecicdregistry.azurecr.io/promiseapp
az webapp config container set --resource-group dockerprojects --name promiseapptest --docker-custom-image-name promisecicdregistry.azurecr.io/promiseapp --docker-registry-server-url https://promisecicdregistry.azurecr.io --docker-registry-server-user test-user --docker-registry-server-password 12345678
In my case, I fixed the error by using the fully qualified Azure Container Registery name like this:
xwezi.azurecr.io
The previous value was
xwezi
When I deploy manually to App Services, I wouldn't get that error.
But, when I used Azure App Service deploy task to deploy the container to the App Service, the service won't work correctly.
And, the log stream will show the above errors.
Unfortunately, the error messages weren't helpful for me to find this out. But I hope this will save your time :)

Azure Container Registry in Azure Web App for Containers across subscriptions

I'm currently trying to set up an Azure Web App for Containers, linking it to a Azure Container Registry that lives inside a different subscription. That's why my initial thought was to use the Private Registrytab inside the Web apps Container Settings to enter the credentials of said Registry.
However when I save and reload the page the settings of the Azure Container Registry tab are now populated and the Private Registry tab is empty. The issue is, that I get now get following error:
2020-01-21 21:51:12.951 ERROR - DockerApiException: Docker API responded with status code=NotFound, response={"message":"pull access denied for cliswebapi, repository does not exist or may require 'docker login': denied: requested access to the resource is denied"}
I assume because no password was stored. How do I configure this properly?
While you use the private registry, the Azure Container Registry is also a private registry, and deploy to Web App for Containers, you need to set the environment variables here:
DOCKER_REGISTRY_SERVER_USERNAME - The username for the ACR server.
DOCKER_REGISTRY_SERVER_URL - The full URL to the ACR server. (For example, https://my-server.azurecr.io.)
DOCKER_REGISTRY_SERVER_PASSWORD - The password for the ACR server.
See more details in If you're using Azure Container Registry, you need to set some app settings.
And if you create multiple containers, all the images must be in the same registry. All in Docker Hub or Azure Container Registry. See more details in All images must use the same registry.
Update:
With the message that you deploy the Web App using the image in the ACR in a different subscription. It seems it's a bug in Web App and you can see the issue in the Github. And the suggestion is that maybe you can use the service principal for the ACR to authenticate and the steps here.
I have spend some time on this issue and figured it out. Here is my solution:
Assuming we are having two subscriptions, let's call them SUB-A and SUB-B, where we are having an Azure Container Registry in SUB-A (called azurebluedev in my example).
Now we'd like to create an App Service in SUB-B that pulls its image of our container registry by using the admin username.
It's critical that you use the correct format under Image and tag in the docker blade when creating the app service. It must follow the format url/image:tag (without https) otherwise you will run into the described problem. I was using image:tag format beforehand which didn't work.
This worked for me!

How to deploy pgadmin4 docker image on azure web app?

I am unable to run docker image dpage/pgadmin4 on azure web app (Linux) which is available on docker hub.
I have installed Docker in my Linux machine and was able to run that docker image locally. Then I created Web app in Azure with options as given below:
OS: Linux
Publish: Docker Image
App service plan: Linux app service
After creating web app, I added two env variables in App Settings section:
PGADMIN_DEFAULT_EMAIL : user#domain.com
PGADMIN_DEFAULT_PASSWORD : SuperSecret
Finally login screen is visible but when I enter above credentials, it doesn't work and keeps redirecting to login page.
Update: If login is working properly, screen appears as shown below.
!(pgadmin initial screen)
After several retries i once got an message (CSRF token invalid) displayed in the right-top corner of the login screen.
For CSRF to properly work there must be some serverside state? So I activated the "ARR affinity" in the "General Settings" on the azure "Configuration".
I also noticed in the explamples on documentation the two environment-variables PGADMIN_CONFIG_CONSOLE_LOG_LEVEL (which is in the example set to '10') and PGADMIN_CONFIG_ENHANCED_COOKIE_PROTECTION (which is in the example set to 'True').
After enabling "ARR" and setting PGADMIN_CONFIG_ENHANCED_COOKIE_PROTECTION to False the login started to work. I have no idea what PGADMIN_CONFIG_ENHANCED_COOKIE_PROTECTION is actually doing, so please take that with caution.
If thats not working for you, maybe setting PGADMIN_CONFIG_CONSOLE_LOG_LEVEL to 10 and enabling console debug logging can give you a clue whats happening.
For your issue, I do the test and find that it's really a strange thing. When I deploy the docker image dpage/pgadmin4 in Azure service Web App for Container through Azure CLI and set the app settings, there is no problem to log in with the user and password. But when I deploy it through the Azure portal, then I meet the same thing with you.
Not sure what is the reason, but the solution is that set the environment variables PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD through the Azure CLI like below:
az webapp config appsettings set --resource-group <resource-group-name> --name <app-name> --settings PGADMIN_DEFAULT_EMAIL="user#domain.com" PGADMIN_DEFAULT_PASSWORD="SuperSecret"
If you really want to know the reason, then you can make feedback to Microsoft. Maybe it's a bug or some special settings.
Update
The screenshot of the test on my side here:

Resources