Unable to Deploy Flatcar OS on Azure

I was trying to deploy a Flatcar image on Azure, but the deployment is reported as failed. These are the steps I performed:
1. I downloaded the latest Azure-supported VHD from https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_azure_image.vhd.bz2.
2. I uploaded this VHD to an Azure storage blob and converted it to an image, as recommended by the Azure guides.
3. I tried creating a VM from this image. The VM is created successfully, but VM creation is shown as failed (even though it actually succeeded). This is the error I see:
{
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
"details": [
{
"code": "VMExtensionHandlerNonTransientError",
"message": "The handler for VM extension type 'Microsoft.Azure.Diagnostics.LinuxDiagnostic' has reported terminal failure for VM extension 'LinuxDiagnostic' with error message: '[ExtensionOperationError] Non-zero exit code: 1, /var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic-3.0.141/diagnostic.py -install\n[stdout]\n\n\n[stderr]\n File \"/var/lib/waagent/Microsoft.Azure.Diagnostics.LinuxDiagnostic-3.0.141/diagnostic.py\", line 54\n print 'A local import (e.g., waagent) failed. Exception: {0}\\n' \\\n ^\nSyntaxError: invalid syntax\n'.\r\n \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/VMExtensionLinuxDiagnosticsTroubleshoot'"
}
]
}
I went through the link provided, but it didn't help much.
I also tried another approach:
1. Deployed a Flatcar VM through the Azure Marketplace
2. Captured a generalized image from this VM
3. Deployed a VM using the image created in the previous step
Even with this approach I get the same error.

For now, waagent (the Azure Linux Agent) does not support Python 3.x, hence this syntax error: the traceback shows a Python 2 style print statement failing to parse. You need Python 2.x on your OS to avoid this issue.
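The syntax error in the traceback can be reproduced without Azure. The sketch below assumes only a Python 3 interpreter. The helper name parses_under_this_interpreter is illustrative and is not part of waagent or the diagnostics extension. It compiles a Python 2 style print statement, similar in shape to the failing line in diagnostic.py, alongside the call form that Python 3 accepts.

```python
def parses_under_this_interpreter(source: str) -> bool:
    """Return True if `source` compiles under the running interpreter."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Python 2 print statement, similar in shape to the line the extension fails on.
py2_style = "print 'A local import failed.'"
# The function-call form that Python 3 accepts.
py3_style = "print('A local import failed.')"

if __name__ == "__main__":
    print("print statement parses:", parses_under_this_interpreter(py2_style))
    print("print() call parses:", parses_under_this_interpreter(py3_style))
```

Run under Python 3, the first check reports that the statement does not parse, which matches the SyntaxError the extension handler reported.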

Related

Deploy image built with buildah to Azure container instance from Azure container registry

I have built an image using Buildah and pushed it to ACR (Azure Container Registry), but whichever method I try (Azure CLI, portal, Terraform), the deployment to ACI (Azure Container Instances) fails after 30 minutes with a timeout. The ACI is created successfully, the image can be pushed to and pulled from the ACR, and the image runs locally using podman. The ACI hangs while trying to create the container from the image.
Error displayed
Deployment to resource group '<my-resource-group>' failed.
Additional details from the underlying API that might be helpful: At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details
Raw Error
{
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
"details": [
{
"message": "Subscription deployment didn't reach a successful provisioning state after '00:30:00'."
}
]
}
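The outer "DeploymentFailed" wrapper says little on its own; the useful part is in "details". As a minimal sketch, the helper below (inner_errors is a hypothetical name) pulls the inner code and message pairs out of a payload shaped like the raw error above. The outer message in the sample is abbreviated for brevity.

```python
import json


def inner_errors(raw_error: str) -> list:
    """Return (code, message) pairs from the 'details' list of the error payload."""
    payload = json.loads(raw_error)
    pairs = []
    for detail in payload.get("details", []):
        # Some details, like the timeout above, carry only a message and no code.
        pairs.append((detail.get("code"), detail.get("message", "")))
    return pairs


# Same shape as the raw error above; the outer message is shortened here.
RAW_ERROR = """
{
  "code": "DeploymentFailed",
  "message": "At least one resource deployment operation failed.",
  "details": [
    {"message": "Subscription deployment didn't reach a successful provisioning state after '00:30:00'."}
  ]
}
"""

if __name__ == "__main__":
    for code, message in inner_errors(RAW_ERROR):
        print(code, "->", message)
```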
Any suggestions as to what could be the issue?
The resolution
Even though the documentation states that ACR and ACI can use OCI images, it seems that ACI still requires images in the Docker format. When building an image with Buildah, pass the --format docker flag (buildah bud --format docker) so that the image can be pulled from ACR and deployed to an ACI.
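One way to check which format an already-pushed image uses is to look at the mediaType field of its manifest. The sketch below assumes you have the raw manifest JSON as a string (for example from a tool such as skopeo inspect --raw, which is an assumption and not something the original poster used). The function name manifest_format is illustrative. The two media type strings are the standard OCI and Docker v2 schema 2 manifest types.

```python
import json

OCI_MANIFEST = "application/vnd.oci.image.manifest.v1+json"
DOCKER_V2_MANIFEST = "application/vnd.docker.distribution.manifest.v2+json"


def manifest_format(manifest_json: str) -> str:
    """Classify a raw image manifest as 'oci', 'docker', or 'unknown'."""
    media_type = json.loads(manifest_json).get("mediaType", "")
    if media_type == OCI_MANIFEST:
        return "oci"
    if media_type == DOCKER_V2_MANIFEST:
        return "docker"
    # The field can be absent or hold another type (e.g. a manifest list).
    return "unknown"


if __name__ == "__main__":
    sample = json.dumps({"schemaVersion": 2, "mediaType": OCI_MANIFEST})
    print(manifest_format(sample))
```

If the result is "oci" for an image that ACI cannot start, rebuilding with --format docker as described above is the change to try.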

Azure - Enable Backup on VM with Windows Server 2019 Core server, D4s_v3 sku, is failing with code BMSUserErrorContainerObjectNotFound

Azure VM Details :
OS : Windows Server 2019 Datacenter Core
Size: Standard D4s v3 (4 vcpus, 16 GiB memory)
Location: Australia East
VM generation: V1
Agent status: Ready
Agent version: 2.7.41491.1010
Azure disk encryption: Not Enabled
Extensions already installed :
DependencyAgentWindows
IaaSAntimalware
MDE.Windows
MicrosoftMonitoringAgent
I have an existing Recovery Services vault with tens of other VMs being backed up.
I am trying to enable backup for this VM from the Azure portal (VM blade > Operations > Backup), but it fails with the error below.
I have tried multiple times.
Provisioning state: Failed
Duration: 1 minute 3 seconds
Status: Conflict
{
"code": "DeploymentFailed",
"message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",
"details": [
{
"code": "BMSUserErrorContainerObjectNotFound",
"message": "Item not found"
}
]
}
All the troubleshooting information for backup-related issues at https://learn.microsoft.com/en-us/azure/backup/backup-azure-vms-troubleshoot covers problems that occur after the "Enable Backup" step.
I have also tried to enable the backup using azure cli:
az backup protection enable-for-vm --vm "/subscriptions/xxx/resourceGroups/yyy/providers/Microsoft.Compute/virtualMachines/vm_name" -v vaultname -g vault_resourcegroup -p backuppolicy_name
It throws the following error:
The specified Azure Virtual Machine Not Found. Possible causes are
1. VM does not exist
2. The VM name or the Service name needs to be case sensitive
3. VM is already Protected with same or other Vault.
Please Unprotect VM first and then try to protect it again.
Please contact Microsoft for further assistance.
None of points 1, 2, or 3 is true.
VM exists, the name is used as shown in the portal, no other VM protection service is in use.
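Cause 2 in the CLI message is about casing, which is easy to misjudge by eye. As a rough sketch, the helpers below (both names are hypothetical) split a VM resource ID like the one passed to --vm and report any segment that differs from the portal value only by case. They assume the standard eight-segment VM resource ID layout and do not call Azure.

```python
def parse_vm_id(resource_id: str) -> dict:
    """Split a VM resource ID into subscription, resource group, and VM name."""
    parts = resource_id.strip("/").split("/")
    if len(parts) != 8 or parts[0].lower() != "subscriptions":
        raise ValueError("not a VM resource ID: " + resource_id)
    return {"subscription": parts[1], "resource_group": parts[3], "vm_name": parts[7]}


def case_only_mismatches(resource_id: str, portal_resource_group: str, portal_vm_name: str) -> list:
    """List the fields that match the portal values only when case is ignored."""
    parsed = parse_vm_id(resource_id)
    expected = {"resource_group": portal_resource_group, "vm_name": portal_vm_name}
    return [
        field
        for field, want in expected.items()
        if parsed[field] != want and parsed[field].lower() == want.lower()
    ]


if __name__ == "__main__":
    vm_id = "/subscriptions/xxx/resourceGroups/yyy/providers/Microsoft.Compute/virtualMachines/vm_name"
    print(case_only_mismatches(vm_id, "YYY", "vm_name"))
```

An empty list means the ID and the portal values agree exactly, which is consistent with ruling out cause 2.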
Note: I have faced this issue a few days back on another subscription, but luckily no one was yet using that VM, so I destroyed and re-deployed the VM, and the error went away.
I can't do the same for this VM as it's already in use.
Any help/guidance will be appreciated.
This looks like a portal error, or the VM is not able to communicate with the Azure platform. I would suggest trying the "Reapply" feature to update the platform status.
Otherwise, you can try initiating a backup from the "Recovery Services vaults" blade and adding the VM to it.
The solution was to contact Microsoft support. After some analysis (back-and-forth emails, screenshots, etc.), their engineer replied with:
I check from the backend and notice that the VM status is not in synchronize state. I’ve requested the VM engineer xxxxx resync the VM from the backend. Please try to reenable the VM backup again in the Azure portal recovery service Vault page. If you encounter the same issue, please try to configure the VM backup in the Azure Virtual Machine Panel page and let me know the results. Thanks!
After this when I attempted to enable the backup it worked.
So for anyone who faces this problem, it looks like the only option is to get in touch with MS Support.

Backup Windows server Azure VM new Azure Recovery Service Vault error code BMSUserErrorContainerObjectNotFound

I have a new VM running Windows (Windows Server 2016 Datacenter).
When I try to enable backup and select a new Recovery Services vault, I get a deployment error:
Deployment to resource group test failed.
Additional details from the underlying API that might be helpful: At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.
Resource
vault242/Azure/iaasvmcontainer;iaasvmcontainerv2;test;web01/vm;iaasvmcontainerv2;test;web01
Type
Microsoft.RecoveryServices/vaults/backupFabrics/protectionContainers/protectedItems
Status
Conflict
Status message
{
"status": "Failed",
"error": {
"code": "BMSUserErrorContainerObjectNotFound",
"message": "Item not found"
}
}
I can't find any information about the code BMSUserErrorContainerObjectNotFound, or about why a protected item is not created automatically.
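The Resource value in the error packs several names into one string. As a sketch, the function below (parse_protected_item is a hypothetical name) splits it into its parts. The interpretation of the segments (vault, fabric, container, item, with the resource group and VM name after the semicolons) is an assumption read off the value shown above, not a documented contract.

```python
def parse_protected_item(resource: str) -> dict:
    """Split 'vault/fabric/container/item' and extract resource group and VM name."""
    vault, fabric, container, item = resource.split("/")
    # Both the container and the item end in ';<resource group>;<vm name>'.
    _, _, container_rg, container_vm = container.split(";")
    _, _, item_rg, item_vm = item.split(";")
    if (container_rg, container_vm) != (item_rg, item_vm):
        raise ValueError("container and item refer to different VMs")
    return {
        "vault": vault,
        "fabric": fabric,
        "resource_group": container_rg,
        "vm_name": container_vm,
    }


if __name__ == "__main__":
    resource = "vault242/Azure/iaasvmcontainer;iaasvmcontainerv2;test;web01/vm;iaasvmcontainerv2;test;web01"
    print(parse_protected_item(resource))
```

Checking that the resource group and VM name recovered this way match the VM you selected is a quick sanity check before retrying.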
My apologies for the delay in the response.
Were you able to resolve the issue?
If not, let's review it.
As I understood, you are enabling the Azure VM Back Up by following the next steps:
There could be multiple reasons why you are getting this failure.
Did you perform these steps manually in the Azure portal, through a template deployment, or with a script? I suspect you are most likely using a template deployment or some kind of scripting, in which case this could be a syntax issue.
Second, it may have been a transient issue caused by request load on the Azure end. In that case, you need to retry the operation.
Some additional questions: do you get the failure on one specific machine or on all machines? In a specific region?
Do you get the same failure when you use an existing vault?
If you can still provide the information above, it will help narrow down the root cause.
I ran into this error as well today, and I think it is an Azure portal bug when enabling backup from the VM blade.
Instead, you can initiate a Backup from the "Recovery Services vaults" blade and add the VM to it.

Deploying Model to Kubernetes

I am trying to deploy a model to Kubernetes in Azure Machine Learning Studio. It was working for a while, but now it fails during deployment with the following error message:
Deploy: Failed on step WaitServiceCreating. Details: AzureML service API error.
Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance: pipeline-created-on-07-28-2020-r.
From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
You can also try to run image viennaglobal.azurecr.io/azureml/azureml_6ae744633f749472feb283065055dc2c:latest locally.
Please refer to http://aka.ms/debugimage#service-launch-fails for more information.
{
"code": "KubernetesDeploymentFailed",
"statusCode": 400,
"message": "Kubernetes Deployment failed",
"details": [
{
"code": "CrashLoopBackOff",
"message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.
Please check the logs for your container instance: pipeline-created-on-07-28-2020-r. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image viennaglobal.azurecr.io/azureml/azureml_6ae744633f749472feb283065055dc2c:latest locally. Please refer to http://aka.ms/debugimage#service-launch-fails for more information."
}
]
}
I know this question is about a different error, but the debugging steps should still be the same:
AML - Web service TimeoutError
It seems it was a bug; it corrected itself today. Closing this question now.
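The error message's suggestion to call service.get_logs() can be wrapped so that only the end of the log, where an init() failure usually shows up, is printed. This is a minimal sketch: tail_service_logs is a hypothetical helper, and service is assumed to be an AML SDK web service object (or anything with a get_logs() method returning a string). The stand-in class exists only so the example runs without Azure.

```python
def tail_service_logs(service, lines: int = 50) -> str:
    """Return the last `lines` lines of the service's container logs."""
    if lines <= 0:
        return ""
    logs = service.get_logs() or ""
    return "\n".join(logs.splitlines()[-lines:])


class FakeService:
    """Stand-in for a deployed web service, used only for this demonstration."""

    def get_logs(self) -> str:
        return "starting container\nloading model\nTraceback (most recent call last):\nKeyError: 'model'"


if __name__ == "__main__":
    print(tail_service_logs(FakeService(), lines=2))
```

With a real deployment, the service object would come from the AML SDK (for example by looking the service up by name in your workspace) instead of FakeService.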

Azure pipeline 'WinRMCustomScriptExtension' underlying connection was closed in non-public VM

In an Azure pipeline, when creating a VM through a deployment template, there is an option to 'Configure with WinRM agent'.
This acts as a custom extension behind the scenes, but the download of this custom extension can be blocked by an internal vnet in Azure. This is the error we are getting:
<datetime> Adding extension 'WinRMCustomScriptExtension' on virtual machine <vmname>
<datetime> Failed to add the extension to the vm: <vmname>. Error: "VM has reported a failure when processing extension 'WinRMCustomScriptExtension'. Error message: \"Failed to download all specified files. Exiting. Error Message: The underlying connection was closed: An unexpected error occurred on a send.\"\r\n\r\nMore information on troubleshooting is available at https://aka.ms/VMExtensionCSEWindowsTroubleshoot "
Since the files cannot be downloaded, I am thinking of a couple of solutions:
1. How can I find out which PowerShell files Azure uses to set up WinRM? The files could then be stored in a storage account (same vnet as the VM).
2. Perhaps not use WinRM at all and use the custom script extension for everything (with all files served from a storage account). I hope an error from the extension stops the pipeline if one occurs.
Is there a better way to resolve this? To me it looks like bad design by Azure, as it does not cover non-public VMs.
EDIT:
Found the answer to #1: https://aka.ms/vstsconfigurewinrm. This was shown in the raw logs of the pipeline when diagnostics were enabled.
Even if you know, how does it help you? It won't be able to download them anyway, and you can't really tell it to use local files.
If you enable service endpoints and allow your subnet to talk to the storage account, it should work.
There is a way to configure WinRM when you create the VM (Key Vault example).
You could use the script extension as you wanted to, but the script extension has to download files to the VM as well (example).
