Deploying Model to Kubernetes - azure

I am trying to deploy a model to Kubernetes in Azure Machine Learning Studio. It was working for a while, but now it fails during deployment with the following error message:
Deploy: Failed on step WaitServiceCreating. Details: AzureML service API error.
Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance: pipeline-created-on-07-28-2020-r.
From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
You can also try to run image viennaglobal.azurecr.io/azureml/azureml_6ae744633f749472feb283065055dc2c:latest locally.
Please refer to http://aka.ms/debugimage#service-launch-fails for more information.
{
"code": "KubernetesDeploymentFailed",
"statusCode": 400,
"message": "Kubernetes Deployment failed",
"details": [
{
"code": "CrashLoopBackOff",
"message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance: pipeline-created-on-07-28-2020-r. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image viennaglobal.azurecr.io/azureml/azureml_6ae744633f749472feb283065055dc2c:latest locally. Please refer to http://aka.ms/debugimage#service-launch-fails for more information."
}
]
}

I know this question is about a different error, but the debugging steps should still be the same.
AML - Web service TimeoutError
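As a first triage step, it can help to pull the error codes out of the JSON payload before digging into the logs. This is only a sketch: the sample below is a trimmed copy of the payload from the question with the long message shortened, and summarize_deployment_error is a hypothetical helper name, not part of the AML SDK.

```python
import json

# Trimmed copy of the error payload from the question; the long
# message text is shortened here for readability.
SAMPLE_ERROR = """
{
  "code": "KubernetesDeploymentFailed",
  "statusCode": 400,
  "message": "Kubernetes Deployment failed",
  "details": [
    {
      "code": "CrashLoopBackOff",
      "message": "Your container application crashed."
    }
  ]
}
"""


def summarize_deployment_error(payload):
    """Return (code, first line of message) for the top-level error and each detail."""
    error = json.loads(payload)
    entries = [error] + list(error.get("details", []))
    summary = []
    for entry in entries:
        message = entry.get("message", "")
        first_line = message.splitlines()[0] if message else ""
        summary.append((entry.get("code", ""), first_line))
    return summary


if __name__ == "__main__":
    for code, message in summarize_deployment_error(SAMPLE_ERROR):
        print(f"{code}: {message}")
```

A CrashLoopBackOff entry in the details means the container started and then exited repeatedly, so the container logs are the next place to look.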

It seems it was a bug that corrected itself today. Closing this question now.

Related

Unable to Deploy Flatcar OS on Azure

I was trying to deploy a Flatcar image on Azure, but I am not able to deploy it. These are the steps I performed:
I downloaded the latest Azure-supported VHD from https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_azure_image.vhd.bz2.
I uploaded this VHD to an Azure storage blob and converted it to an image, as recommended by the Azure guides.
I tried creating a VM from this image. The VM is created successfully, but an error is reported during creation and the VM creation is shown as failed (even though it actually succeeds). This is the error I see:
{
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
"details": [
{
"code": "VMExtensionHandlerNonTransientError",
"message": "The handler for VM extension type 'Microsoft.Azure.Diagnostics.LinuxDiagnostic' has reported terminal failure for VM extension 'LinuxDiagnostic' with error message: '[ExtensionOperationError] Non-zero exit code: 1, /var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic-3.0.141/diagnostic.py -install\n[stdout]\n\n\n[stderr]\n File \"/var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic-3.0.141/diagnostic.py\", line 54\n print 'A local import (e.g., waagent) failed. Exception: {0}\\n' \\\n ^\nSyntaxError: invalid syntax\n'.\r\n \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/VMExtensionLinuxDiagnosticsTroubleshoot'"
}
]
}
I tried going through the link provided, but it didn't help much.
I also tried another approach:
Deployed a Flatcar VM through the Azure Marketplace
Captured a generalized image from this VM
Deployed a VM using the image created in the step above
Even with this approach I get the same error.
For now, waagent (the Azure Linux agent) does not support Python 3.x, hence this syntax error: the print statement shown in the traceback is Python 2 syntax. You need to have Python 2.x on your OS to avoid this issue.
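The traceback in the question points at a Python 2 style print statement. As a small illustration (not part of the extension itself), the sketch below uses the built-in compile() to show that such a line is rejected when the interpreter is Python 3. The snippet strings and the helper name are made up for the example.

```python
def compiles_under_this_python(source):
    """Return True if `source` parses under the interpreter running this script."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Python 2 style print statement, similar in form to the line in the traceback.
PY2_SNIPPET = "print 'A local import failed.'\n"
# The call syntax, which Python 3 accepts.
PY3_SNIPPET = "print('A local import failed.')\n"

if __name__ == "__main__":
    print("print statement compiles:", compiles_under_this_python(PY2_SNIPPET))
    print("print() call compiles:", compiles_under_this_python(PY3_SNIPPET))
```

Run under Python 3, the first snippet fails to compile with a SyntaxError, which is the same class of failure the extension's install step reports.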

Azure Function publish - "Timed out waiting for SCM to update the Environment Settings"

I've deployed and published several Function Apps without issues over the last 12 months. However, as of this week, when publishing a Function App using the following PowerShell script:
func azure functionapp publish <functionAppName> --java
I receive the following error after a few minutes: "Timed out waiting for SCM to update the Environment Settings"
Similarly, I'm unable to deploy any Function Apps using:
mvn azure-functions:deploy
In the Function App activity log, the following error is logged for both cases:
Operation name: Sync Web Apps Function Triggers.
Status: Failed.
Error code: BadRequest (HTTP Status Code: 400)
Message: Encountered an error (InternalServerError) from host runtime.
So far I've created the Application setting WEBSITE_WEBDEPLOY_USE_SCM (value: true) based on feedback in another topic, which unfortunately hasn't helped. Other than that I've not been able to find much other information on this issue.
Does anyone have any thoughts?
I resolved this issue myself. The Application Setting WEBSITE_CONTENTAZUREFILECONNECTIONSTRING contained an outdated storage account key.
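If you want to check for this condition yourself, one option is to compare the AccountKey inside the setting with the keys currently listed for the storage account. The following is a minimal sketch, assuming the usual "Name=Value;Name=Value" storage connection string layout; the helper names and all values are hypothetical, and you would paste in the setting value and current keys from your own account.

```python
def parse_connection_string(conn_str):
    """Split 'Key=Value;Key=Value' pairs into a dict.

    Values may themselves contain '=' (base64 padding in AccountKey),
    so each pair is split on the first '=' only.
    """
    parts = {}
    for pair in conn_str.split(";"):
        if not pair.strip():
            continue
        key, _, value = pair.partition("=")
        parts[key.strip()] = value.strip()
    return parts


def key_is_current(conn_str, current_keys):
    """Return True if the AccountKey in conn_str matches one of the current keys."""
    return parse_connection_string(conn_str).get("AccountKey") in current_keys


if __name__ == "__main__":
    # Hypothetical values, not real credentials.
    setting = (
        "DefaultEndpointsProtocol=https;AccountName=examplestorage;"
        "AccountKey=b2xkLWtleQ==;EndpointSuffix=core.windows.net"
    )
    current = ["bmV3LWtleS0x", "bmV3LWtleS0y"]
    print("AccountName:", parse_connection_string(setting)["AccountName"])
    print("Key still valid:", key_is_current(setting, current))
```

If the check comes back False, updating the Application Setting with a current key is the fix described above.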

Azure Kubernetes deployment fails. Error: Received bad response from Model Management Service: Response Code: 400

Azure Kubernetes deployment fails:
WebserviceException:
Message: Service deployment polling reached non-successful terminal state, current service state: Failed
Operation ID: cf9db31f-0466-41dd-b70f-fe5a9
More information can be found using '.get_logs()'
Error:
{
"code": "KubernetesDeploymentFailed",
"statusCode": 400,
"message": "Kubernetes Deployment failed",
"details": [
{
"code": "CrashLoopBackOff",
"message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: aks-service-fa2. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image c377cabf339b45c71.azurecr.io/azureml/azureml_bd83accc12:latest locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information."
},
{
"code": "DeploymentFailed",
"message": "Your container endpoint is not available. Please follow the steps to debug:\n1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
As the error clearly suggests,
Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: aks-service-fa2. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image c377cabf339b45c71.azurecr.io/azureml/azureml_bd83accc12:latest locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information.
You need to check the application log (inspect the Docker log) to find out what the error is, and then fix it.
# if you already have the service object handy
print(service.get_logs())
# if you only know the name of the service (note there might be multiple services with the same name but different version number)
print(ws.webservices['mysvc'].get_logs())

Unable to deploy VMSS in combination to ARM deployment

I hope somebody can guide me with this issue. I do not have issues deploying resources via the web interface. This time I am trying to automate my infrastructure, so I am deploying via ARM. All the resources for the Service Fabric cluster I am trying to create deploy without issue, except for the VMSS, which throws this error:
{
"status": "Failed",
"error": {
"code": "LinkedAuthorizationFailed",
"message": "The client has permission to perform action 'Microsoft.KeyVault/vaults/deploy/action' on scope '/subscriptions/xxxxxx/resourcegroups/AllyStage-v2/providers/Microsoft.Compute/virtualMachineScaleSets/StageNode', however the linked subscription 'xxxxx' was not found. "
}
}
Thanks.
It would have helped to look at the ARM template that you're trying to deploy. However, I suspect the problem is that the resource ID for a resource or subscription isn't resolving correctly. Here is a similar issue from the past.
Also, if you are deploying the ARM template from within another bash/PowerShell script, I suggest you ensure that you have the correct context/subscription set before initiating the template deployment, and verify the scope of permissions of the principal performing the deployment.
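To check whether a resource ID in the template points at the subscription you expect, you can split the ID into its segments and compare. This is only a sketch: the helper names are hypothetical, the IDs are placeholders, and it assumes the common "/subscriptions/<id>/resourceGroups/<name>/providers/..." layout.

```python
def parse_resource_id(resource_id):
    """Turn '/subscriptions/<id>/resourceGroups/<name>/...' into a dict.

    Keys are lower-cased so that 'resourcegroups' and 'resourceGroups'
    compare equal.
    """
    segments = [s for s in resource_id.split("/") if s]
    # Pair up name/value segments: subscriptions/<id>, resourcegroups/<name>, ...
    return {
        segments[i].lower(): segments[i + 1]
        for i in range(0, len(segments) - 1, 2)
    }


def subscription_matches(resource_id, expected_subscription):
    """Return True if the ID's subscription segment equals the expected one."""
    found = parse_resource_id(resource_id).get("subscriptions", "")
    return found.lower() == expected_subscription.lower()


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    vault_id = (
        "/subscriptions/00000000-0000-0000-0000-000000000001"
        "/resourceGroups/example-rg/providers/Microsoft.KeyVault/vaults/example-vault"
    )
    deployment_subscription = "00000000-0000-0000-0000-000000000002"
    print(parse_resource_id(vault_id))
    print("Same subscription:", subscription_matches(vault_id, deployment_subscription))
```

A mismatch between the Key Vault's subscription and the one the deployment runs in would be consistent with the "linked subscription was not found" message, though the template itself is needed to confirm that.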

Azure Websites Deployment - Cannot find user- ExtendedCode 09004

I'm trying to deploy a simple AspNetCore web application to Azure Websites using the following process:
https://docs.asp.net/en/latest/tutorials/publish-to-azure-webapp-using-vs.html
I keep getting the error message:
"ErrorEntity": {
"Code": "NotFound",
"Message": "Cannot find user.",
"ExtendedCode": "09004",
"MessageTemplate": "Cannot find user.",
"Parameters": [],
"InnerErrors": null
}
I have tried this from a new project and existing project and both give the same error.
Googling for the issue turns up the following:
https://social.msdn.microsoft.com/Forums/en-US/7743aca4-1a88-4ef5-ab74-98992f2bbf22/cannot-find-user-error-when-creating-new-app-service-plan?forum=windowsazurewebsitespreview
However I haven't been able to get a solution so far.
Anyone else had this issue or managed to find an answer?
Thanks
Microsoft stated it was an intermittent issue. They provided a workaround, but it's no longer an issue for me, so I can't test it. It might help someone, so I've included it below for the sake of completeness. If, like me, you no longer have the issue, I suggest this question be closed.
Go to portal.azure.com -> Browse -> App Services -> (open any web app) -> Settings -> Deployment Credentials
Enter a new password and save (please ignore the validation error after saving)
After the update succeeds, try creating a new App Service Plan, or Web App with a new App Service Plan.
