Update image of existing Azure registry does not work

I am new to Docker and I am trying to update an existing web service on an Azure website. After building the image, this is what I did:
docker login <regname>.azurecr.io # Successfully logged in
docker tag <myimage> <regname>.azurecr.io/<servicename>
docker push <regname>.azurecr.io/<servicename>
And this is what I get:
C:\Users\user> docker push <regname>.azurecr.io/<servicename>
The push refers to repository [<regname>.azurecr.io/<servicename>]
8338876046a2: Preparing
9b4cb369a379: Preparing
769a276cd781: Preparing
486305c59459: Preparing
c36e2873b733: Preparing
130ae36f8cc8: Preparing
bc6b4902b79e: Preparing
f3d44e887388: Preparing
4a39ef7ed1bb: Preparing
4c5aab3548b9: Preparing
ec348085b0e6: Preparing
c2be8853e0b2: Preparing
0f1151f5fc99: Preparing
00399b079947: Preparing
c82d454eb914: Preparing
b25487d1db04: Preparing
e367fb455ccf: Preparing
bc6b4902b79e: Pushed
57df5852e66c: Layer already exists
d788ea03fce1: Layer already exists
1ffa9e6f04f1: Layer already exists
377e5b96eca6: Layer already exists
90dd0108373f: Layer already exists
eb8fe74986a4: Layer already exists
e2a005b711f9: Layer already exists
3a29b9e0627a: Layer already exists
ca4c28881d11: Layer already exists
33614d3265ba: Layer already exists
270f4d759cc3: Layer already exists
0fa80309f3d6: Layer already exists
4e1d0b4d1868: Layer already exists
910d7fd9e23e: Pushed
4230ff7f2288: Pushed
2c719774c1e1: Layer already exists
ec62f19bb3aa: Layer already exists
f94641f1fe1f: Layer already exists
latest: digest: sha256:5d2729ae576349b158acc6c480acdde3899e2c6a9445966bb7e8d291677e11dd size: 7866
Note: The 'Layer already exists' lines come from a previous push. I had to push twice because some layers kept retrying, then reached EOF and stopped. So the first push uploaded most of the layers, and the second push uploaded the layers that failed the first time. Could the problem lie here?
The new image I want to push is completely different from the old one (they are both Flask apps).
After the above, I went to the Azure portal and restarted the service for this resource, but nothing happened. The Azure service remains the same and the new functionality hasn't been added.
I've read other posts suggesting that the problem lies in the tag names. I can't find a way around this, since I want to update the existing image in the Azure registry (does that mean that the tag names would be the same?).
Has anyone else encountered this problem or maybe has an idea about what I am doing wrong?

Pushing a new image to the registry does not update the Web App by itself; something has to notify the Web App that the image changed. To do that, create a webhook for your Web App before pushing the updated image. For more details, see Push an updated container image to a geo-replicated container registry for regional web app deployments.
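As a rough sketch of the webhook step, the helper below assembles an `az acr webhook create` command that fires on `push` events. The helper name, the webhook name, and the service URI are placeholders invented for illustration; the real URI has to come from your Web App's deployment settings, and the sketch assumes the Azure CLI is installed and logged in.

```python
import shlex


def build_webhook_command(registry, webhook_name, service_uri, scope=None):
    """Return the argument list for `az acr webhook create` (hypothetical helper)."""
    cmd = [
        "az", "acr", "webhook", "create",
        "--registry", registry,
        "--name", webhook_name,
        "--uri", service_uri,      # placeholder: take the real URI from the Web App
        "--actions", "push",       # notify only when an image is pushed
    ]
    if scope:
        # Optional: limit the webhook to one repository:tag
        cmd += ["--scope", scope]
    return cmd


if __name__ == "__main__":
    cmd = build_webhook_command(
        "myregistry", "webappdeploy", "https://example.invalid/hook",
        scope="myservice:latest",
    )
    # Print the command for review instead of running it
    print(shlex.join(cmd))
```

Once the printed command looks right, it could be executed with `subprocess.run(cmd, check=True)`. Pushing to the same tag afterwards should then trigger the webhook.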

Related

Azure : Error 404: AciDeploymentFailed / Error 400 ACI Service request failed

I am trying to deploy a machine learning model through an ACI (Azure Container Instances) service. I am working in Python and followed the code from the official documentation (https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=azcli).
The entry script (score.py) is the following:
import json
import os
import dill
import joblib
def init():
    global model
    # Get the path where the deployed model can be found
    model_path = os.getenv('AZUREML_MODEL_DIR')
    # Load existing model from the deployed model directory
    model = joblib.load(os.path.join(model_path, 'model.pkl'))
# Handle request to the service
def run(data):
    try:
        # Pick out the text property of the JSON request
        # Expected JSON details {"text": "some text to evaluate"}
        data = json.loads(data)
        prediction = model.predict(data['text'])
        return prediction
    except Exception as e:
        # Return the error message so the caller can see what failed
        error = str(e)
        return error
And the model deployment workflow is as follows:
from azureml.core import Workspace
# Connect to workspace
ws = Workspace(subscription_id="my-subscription-id",
               resource_group="my-ressource-group-name",
               workspace_name="my-workspace-name")
from azureml.core.model import Model
model = Model.register(workspace=ws,
                       model_path='model.pkl',
                       model_name='my-model',
                       description='my-description')
from azureml.core.environment import Environment
# Name environment and call requirements file
# requirements: numpy, tensorflow
myenv = Environment.from_pip_requirements(name='myenv', file_path='requirements.txt')
from azureml.core.model import InferenceConfig
# Create inference configuration
inference_config = InferenceConfig(environment=myenv, entry_script='score.py')
from azureml.core.webservice import AciWebservice  # AksWebservice
# Set the virtual machine capabilities
deployment_config = AciWebservice.deploy_configuration(cpu_cores=0.5, memory_gb=3)
# Deploy ML model (Azure Container Instances)
service = Model.deploy(workspace=ws,
                       name='my-service-name',
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)
I succeeded once with the previous code. I noticed that during the deployment, Model.deploy created a container registry with a specific name (6e07ce2cc4ac4838b42d35cda8d38616).
The problem:
The API was working well and I wanted to deploy another model from scratch. I deleted the API service and model from Azure ML Studio and the container registry from Azure resources.
Unfortunately, I am no longer able to deploy anything.
Everything goes fine until the last step (the Model.deploy step), where I get the following error message:
Service deployment polling reached non-successful terminal state, current service state: Unhealthy
Operation ID: 46243f9b-3833-4650-8d47-3ac54a39dc5e
More information can be found here: https://machinelearnin2812599115.blob.core.windows.net/azureml/ImageLogs/46245f8b-3833-4659-8d47-3ac54a39dc5e/build.log?sv=2019-07-07&sr=b&sig=45kgNS4sbSZrQH%2Fp29Rhxzb7qC5Nf1hJ%2BLbRDpXJolk%3D&st=2021-10-25T17%3A20%3A49Z&se=2021-10-27T01%3A24%3A49Z&sp=r
Error:
{
    "code": "AciDeploymentFailed",
    "statusCode": 404,
    "message": "No definition exists for Environment with Name: myenv Version: Autosave_2021-10-25T17:24:43Z_b1d066bf Reason: Container registry 6e07ce2cc4ac4838b42d35cda8d38616.azurecr.io not found. If private link is enabled in workspace, please verify ACR is part of private link and retry..",
    "details": []
}
I do not understand why a new container registry was created the first time, while now an existing one is being looked up (the message says that the container registry named 6e07ce2cc4ac4838b42d35cda8d38616 is missing). I never found a way to force the creation of a new container registry resource in Python, nor to specify a name for it in AciWebservice.deploy_configuration or Model.deploy.
Could anyone help me move forward with this? I think the best solution would be to remove every trace of this 6e07ce2cc4ac4838b42d35cda8d38616 container registry, but I can't find where the reference is stored, so Model.deploy always fails to find it.
Another solution would be to force Model.deploy to generate a new container registry, but I could not find how to do that.
I have been stuck on this for 2 days and I really need your help!
PS: I am not at all a DevOps/MLOps person. I do data science and build good models, but infrastructure and deployment are not really my thing, so please be gentle on this part! :-)
What I tried
Creating the container registry with same name
I tried to create the container registry by hand, but this time it is the container that cannot be created. The Python output of Model.deploy is the following:
Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-10-25 19:25:10+02:00 Creating Container Registry if not exists.
2021-10-25 19:25:10+02:00 Registering the environment.
2021-10-25 19:25:13+02:00 Building image..
2021-10-25 19:30:45+02:00 Generating deployment configuration.
2021-10-25 19:30:46+02:00 Submitting deployment to compute.
Failed
Service deployment polling reached non-successful terminal state, current service state: Unhealthy
Operation ID: 93780de6-7662-40d8-ab9e-4e1556ef880f
Current sub-operation type not known, more logs unavailable.
Error:
{
    "code": "InaccessibleImage",
    "statusCode": 400,
    "message": "ACI Service request failed. Reason: The image '6e07ce2cc4ac4838b42d35cda8d38616.azurecr.io/azureml/azureml_684133370d8916c87f6230d213976ca5' in container group 'my-service-name-LM4HbqzEBEi0LTXNqNOGFQ' is not accessible. Please check the image and registry credential.. Refer to https://learn.microsoft.com/azure/container-registry/container-registry-authentication#admin-account and make sure Admin user is enabled for your container registry."
}
Setting admin user enabled
I tried to follow the recommendation in the last message and enabled the Admin user for the container registry. All I saw in the Azure interface is that a username and password appeared once the Admin user was enabled.
Unfortunately, the same error message appears again when I rerun my code, and I am stuck here...
Changing name of the environment and model
This does not produce any change. Same errors.
As you describe, your first attempt worked. After deleting the API service and model from Azure ML Studio and the container registry from Azure resources, you are not able to redeploy.
My assumption is that your first attempt already registered the environment, so registering it again under the same name while deploying produces the error.
Thanks @anders swanson, your solution worked for me.
If you have already registered your env, myenv, and none of the details of your environment have changed, there is no need to re-register it with myenv.register(). You can simply get the already registered env using Environment.get(), like so:
myenv = Environment.get(ws, name='myenv', version=11)
My suggestion is to give your environment a new name, for example "model_scoring_env". Register it once, then pass it to the InferenceConfig.
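The "register once, then reuse" advice above can be sketched as a small helper. This is only an illustration under assumptions: the helper name and its `env_cls` parameter are made up for this example, and it assumes that a failed `Environment.get()` lookup means the environment has not been registered yet.

```python
def get_or_register_environment(ws, name, requirements_path, env_cls=None):
    """Return the registered environment `name`, registering it only if missing.

    Hypothetical helper: `env_cls` exists only so the logic can be exercised
    without azureml-core installed; by default the real Environment is used.
    """
    if env_cls is None:
        # Imported lazily so the module loads even without azureml-core
        from azureml.core.environment import Environment as env_cls
    try:
        # With no version given, the latest registered version is returned
        return env_cls.get(ws, name=name)
    except Exception:
        # Assumption: the lookup failed because the environment is not registered
        env = env_cls.from_pip_requirements(name=name, file_path=requirements_path)
        return env.register(workspace=ws)
```

A call such as `myenv = get_or_register_environment(ws, "model_scoring_env", "requirements.txt")` could then replace the `Environment.from_pip_requirements(...)` line before the environment is passed to `InferenceConfig`.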

How to solve 'GitRepository not found' error in FluxCD?

I am trying to use an Azure Kubernetes cluster and FluxCD to connect to a repository named realtimeapp-infra in GitLab. I created the source and kustomization .yaml files in another repo, training-setup, but I get the following error when I run flux get kustomizations in cmd. I was getting the same error with GitHub too. (I am new to both FluxCD and Kubernetes.)
EDIT: The problem was solved. It was due to no master branch in the repository, and I did not have access to create the master branch. After the owner created it, the issue was resolved.
Did you connect the repository realtimeapp-infra as a GitRepository inside Flux, with username and credentials? GitRepository is a custom resource (CRD) type that comes with Flux; you can list these resources with kubectl get gitrepository -A

Cloud foundry "cf create-service" appends nonsense to "xsappname"

Trying to create an XSUAA service in the cloud fails because a service supposedly already exists. But no such service actually exists there.
-> cf create-service xsuaa application xsuaa-authentication-newsletter -c security/xs-security.json
Creating service instance xsuaa-authentication-newsletter in org CF_Dev_DP / space Customer as email.email#domain.com...
Service broker error: Service broker xsuaa failed with: org.springframework.cloud.servicebroker.exception.ServiceBrokerException: Application with xsappname com-fressnapf-microservices-newsletter!t36296 already exists. To create a new service instance, ensure that the xsappname specified in your application's xs-security.json file together with the selected service plan of the UAA service broker lead to a new appid. To update an existing service instance, use the update-service command instead.
FAILED
The error states that there is a service with the name "com-fressnapf-microservices-newsletter!t36296". The contents of xs-security.json are as follows:
{
    "xsappname": "com-fressnapf-microservices-newsletter",
    ...
}
cf appends a weird "!t36296" to the end of the name.
All of the following deletion-attempts result in a "does not exist":
-> cf delete -f 'com-fressnapf-microservices-newsletter!t36296'
App com-fressnapf-microservices-newsletter!t36296 does not exist.
-> cf delete -f 'com-fressnapf-microservices-newsletter'
App com-fressnapf-microservices-newsletter does not exist.
-> cf delete-service -f 'com-fressnapf-microservices-newsletter!t36296'
Service com-fressnapf-microservices-newsletter!t36296 does not exist.
-> cf delete-service -f 'com-fressnapf-microservices-newsletter'
Service com-fressnapf-microservices-newsletter does not exist.
-> cf delete-service -f 'xsuaa-authentication-newsletter'
Service xsuaa-authentication-newsletter does not exist.
Clearly none of these apps or services exist, yet the service cannot be created because one supposedly already exists. I could not find any similar problem on the web. I would appreciate any help or hint you could provide.
This error indicates that an XSUAA service instance with the same xsappname already exists. However, it could be in a different Cloud Foundry space / org / subaccount that you don't have access to, so you can neither view nor delete that instance.
I would recommend adding a prefix / suffix to the xsappname to make it unique. As an example, you can use a prefix of org-space- so that your xsappname looks like this: org-space-com-fressnapf-microservices-newsletter. You can try some other prefix / suffix as well; just make sure that it makes the xsappname unique.
The weird thing appended to the xsappname, let's call it the suffix, can be broken down into three components. The first component, !, is just a delimiter added by XSUAA to separate the suffix from the xsappname. The second component, t, identifies the service plan; in your case it is the tenant service plan (you may have b for broker, etc.). The third and last component, 36296, is just a running index added by XSUAA. Overall, this suffix is added by XSUAA and used by XSUAA for internal purposes. You can safely ignore it.
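To make the breakdown concrete, here is a small sketch that splits such an identifier into the three parts described above and builds a prefixed xsappname. Both function names are invented for this illustration, and the parser simply assumes the shape "name, then !, then one plan letter, then digits"; it does not map plan letters to plan names.

```python
def split_xsuaa_appid(appid):
    """Split 'name!t36296' into (xsappname, plan_letter, index)."""
    # Split on the last '!' so the xsappname itself is left untouched
    name, sep, suffix = appid.rpartition("!")
    if not sep or len(suffix) < 2 or not suffix[1:].isdigit():
        raise ValueError(f"no XSUAA suffix found in {appid!r}")
    return name, suffix[0], int(suffix[1:])


def unique_xsappname(org, space, base):
    """Prefix the base name with org and space, as suggested above."""
    return f"{org}-{space}-{base}"


print(split_xsuaa_appid("com-fressnapf-microservices-newsletter!t36296"))
print(unique_xsappname("org", "space", "com-fressnapf-microservices-newsletter"))
```

The first call prints the tuple `('com-fressnapf-microservices-newsletter', 't', 36296)`; the second prints the prefixed name used as the example in the answer.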

IoTEdge sometimes re-creates the container

We're running IoT Edge modules. Inside our module, we update a bunch of files. We noticed that most of the time, if the host is restarted, the container is restarted and the files we updated still exist.
On rare occasions, however, we noticed that after a host restart the container was re-created from the original image, so all data changes were lost.
Our understanding is that IoT Edge uses the Docker restart policy "always", which should always keep the data of the container.
I have the following suggestions:
Do not store important data in the container's writable layer, and do not rely on the restart policy to preserve it.
The container may have been re-created because a new version of your module image was deployed; in that case the container is re-created from the new image.
Set up your module deployment manifest properly: use the module's container createOptions to attach a local volume to the container (createOptions->HostConfig->Binds) and store your data there. Data stored there survives any re-creation of the module container. Something like:
"createOptions": {
    "HostConfig": {
        "Binds": [
            "/app/db:/app/db"
        ]
    }
}
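If the manifest is generated by a script, the fragment above can be produced programmatically. The sketch below is illustrative only: the helper name is made up, and it assumes the target manifest expects createOptions as a JSON string, which is how deployment manifests commonly carry it; if yours uses a nested object, use the dictionary directly.

```python
import json


def build_create_options(binds):
    """Return createOptions as a JSON string with 'host_path:container_path' binds."""
    for bind in binds:
        # Minimal sanity check; real bind strings may also carry options like ':ro'
        if ":" not in bind:
            raise ValueError(f"bind must look like 'host_path:container_path': {bind!r}")
    return json.dumps({"HostConfig": {"Binds": list(binds)}})


# Same bind as in the fragment above: host /app/db mounted at /app/db
print(build_create_options(["/app/db:/app/db"]))
```

Files written under the bound path live on the host, so they are still there when the module container is re-created from a new image.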

Azure function upload failing with "A task was canceled"

I am getting the following error when using the following command to upload my function app:
func azure functionapp publish FuncAppName
I ran this from both the parent directory of the function app and the function app directory itself, and got the same error. It looks like some task in the upload times out after a minute or so:
Publish C:\Users\username\Documents\visual studio 2017\Projects\AzureFuncApp contents to an Azure Function App. Locally deleted files are not removed from destination.
Getting site publishing info...
Creating archive for current directory...
Uploading archive...
A task was canceled.
Any idea how to solve this/get more debugging info?
The function in question already exists on Portal and is running. I was previously able to upload it successfully.
Please refer to this GitHub issue:
https://github.com/Azure/azure-functions-cli/issues/147
A change has been made to address this issue and will be included in the next CLI release.

Resources