Azure Service Fabric VMExtensionProvisioningError - azure

I am trying to create a secure Service Fabric cluster from the Azure Portal, with the primary certificate uploaded to Key Vault.
All required resources are created. The failure occurs on the virtual machine scale set, where the operation Write VirtualMachineScaleSets reports:
"properties": {
"statusCode": "Conflict",
"statusMessage": "{\"status\":\"Failed\",\"error\":{\"code\":\"ResourceDeploymentFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",\"details\":[{\"code\":\"VMExtensionProvisioningError\",\"message\":\"VM has reported a failure when processing extension 'VMDiagnosticsVmExt_vmNodeType0Name'. Error message: \\\"Monitoring Agent not reporting success after launch\\\".\"}]}}"
}
As a result, the internal operation Write Deployments also fails, and there is no more detailed message.
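The statusMessage in the response above is a JSON document serialized as a string inside the outer JSON, which makes the actual extension error hard to read. A minimal Python sketch for unpacking it, assuming the operation payload is available as a complete JSON object with a "properties" key (the function name extension_errors and the shortened sample payload are made up for illustration):

```python
import json


def extension_errors(operation_json: str) -> list:
    """Return (code, message) pairs from the nested statusMessage."""
    operation = json.loads(operation_json)
    # statusMessage is itself a JSON string, so it needs a second parse.
    status = json.loads(operation["properties"]["statusMessage"])
    details = status.get("error", {}).get("details", [])
    return [(d.get("code", ""), d.get("message", "")) for d in details]


# Shortened stand-in for the payload shown in the question.
sample = json.dumps({
    "properties": {
        "statusCode": "Conflict",
        "statusMessage": json.dumps({
            "status": "Failed",
            "error": {
                "code": "ResourceDeploymentFailure",
                "message": "The resource operation completed with terminal provisioning state 'Failed'.",
                "details": [{
                    "code": "VMExtensionProvisioningError",
                    "message": "Monitoring Agent not reporting success after launch",
                }],
            },
        }),
    }
})

for code, message in extension_errors(sample):
    print(f"{code}: {message}")
```

This only makes the failing extension and its message easier to spot; it does not add any detail beyond what the portal already returned.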

Related

Azure Kubernetes deployment fails with Error: Received bad response from Model Management Service: Response Code: 400

The Azure Kubernetes deployment fails with:
WebserviceException:
Message: Service deployment polling reached non-successful terminal state, current service state: Failed
Operation ID: cf9db31f-0466-41dd-b70f-fe5a9
More information can be found using '.get_logs()'
Error:
{
"code": "KubernetesDeploymentFailed",
"statusCode": 400,
"message": "Kubernetes Deployment failed",
"details": [
{
"code": "CrashLoopBackOff",
"message": "Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: aks-service-fa2. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image c377cabf339b45c71.azurecr.io/azureml/azureml_bd83accc12:latest locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information."
},
{
"code": "DeploymentFailed",
"message": "Your container endpoint is not available. Please follow the steps to debug:\n1. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs.
As the error clearly suggests,
Your container application crashed. This may be caused by errors in your scoring file's init() function.\nPlease check the logs for your container instance: aks-service-fa2. From the AML SDK, you can run print(service.get_logs()) if you have service object to fetch the logs. \nYou can also try to run image c377cabf339b45c71.azurecr.io/azureml/azureml_bd83accc12:latest locally. Please refer to https://aka.ms/debugimage#service-launch-fails for more information.
You need to inspect the application log (the Docker container log) to find the actual error and fix it.
# if you already have the service object handy
print(service.get_logs())
# if you only know the name of the service (note there might be multiple services with the same name but different version number)
print(ws.webservices['mysvc'].get_logs())
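Since the error points at the scoring file's init() function, it can help to have init() log the full traceback before failing, so that the cause should show up in the output of get_logs(). A rough sketch, assuming a scoring file with the usual init()/run() entry points; load_model is a hypothetical placeholder for whatever model-loading code your scoring file actually uses, and here it fails on purpose:

```python
import logging
import sys

# Send log records to stdout so they end up in the container log.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logger = logging.getLogger("score")

model = None


def load_model():
    """Hypothetical placeholder: replace with the real model-loading code."""
    raise FileNotFoundError("model file not found")


def init():
    """Entry point called once when the container starts."""
    global model
    try:
        model = load_model()
    except Exception:
        # Log the full traceback, then re-raise so the failure stays visible.
        logger.exception("init() failed while loading the model")
        raise


def run(raw_data):
    """Entry point called per request; simply echoes the input in this sketch."""
    return {"received": raw_data}
```

Calling init() in a local Python session before deploying is a quick way to reproduce this kind of failure without waiting for the CrashLoopBackOff.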

Azure Change Tracking/Inventory (Configuration Management) cannot be enabled

We have several VMs connected to a Log Analytics workspace, and the Automation account is linked to it. Update Management is enabled on all VMs and is working properly.
When we try to enable either Change Tracking or Inventory under Configuration Management, the status shows "Cannot enable". As far as I know, both Update Management and Configuration Management use the same agent, so that shouldn't be the problem.
Did I miss something here? If you have any idea what the reason is, please share it with me.
Here is the deployment error:
OPERATION ID *****
TRACKING ID *****
STATUS BadRequest
STATUS MESSAGE {
"error": {
"code": "BadRequest",
"message": ""
}
}
PROVISIONING STATE Failed
TIMESTAMP 11.6.2019, 14:11:42
DURATION 1 second
TYPE Microsoft.OperationalInsights/workspaces/configurationScopes
RESOURCE ID *******/MicrosoftDefaultScopeConfig-ChangeTracking
RESOURCE som-workspace/MicrosoftDefaultScopeConfig-ChangeTrac

How to update queue/topic of Azure Service Bus via ARM?

I have an ARM (Azure Resource Manager) script that creates a Service Bus namespace with a topic and a subscriber inside. It worked perfectly for some time, but then I decided to enable sessions on the topic and disable partitioning. I changed the script, and the deployment now fails with:
Template deployment returned the following errors:
07:56:00 - Resource Microsoft.ServiceBus/namespaces/topics 'ops-ServiceBus/default-topic' failed with message '{
"error": {
"message": "SubCode=40000. Partitioning cannot be changed for Topic. . TrackingId:<some_guid>_M11CH3_M11CH3_G1, SystemTracker:ops-servicebus.servicebus.windows.net:default-topic, Timestamp:2019-03-28T04:55:56 CorrelationId: <some_guid>",
"code": "BadRequest"
}
}'
07:56:21 - Template output evaluation skipped: at least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details.
Is it possible to perform an update operation on a queue/topic using ARM?
We have configured queues and topics with ARM templates, but according to the error some parameters are immutable, so in this case you would have to delete and recreate the topic.
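Before deploying a changed template, a quick comparison against the currently deployed properties can flag this kind of change early. A minimal Python sketch, assuming the topic properties are available as plain dictionaries; the set of immutable properties contains only enablePartitioning, taken from the error above, and the function name and sample values are made up for illustration:

```python
# Properties assumed immutable after creation. Only enablePartitioning is
# confirmed by the error message above; extend the set as needed.
IMMUTABLE_TOPIC_PROPERTIES = {"enablePartitioning"}


def immutable_changes(deployed: dict, desired: dict) -> dict:
    """Return {property: (old, new)} for immutable properties that differ."""
    changes = {}
    for name in IMMUTABLE_TOPIC_PROPERTIES:
        # A key missing from the deployed properties is treated as None.
        old, new = deployed.get(name), desired.get(name)
        if name in desired and old != new:
            changes[name] = (old, new)
    return changes


deployed = {"enablePartitioning": True, "maxSizeInMegabytes": 1024}
desired = {"enablePartitioning": False, "maxSizeInMegabytes": 1024}

changes = immutable_changes(deployed, desired)
if changes:
    print("Topic must be recreated; immutable properties changed:", changes)
```

If the check reports a change, the topic has to be deleted and recreated rather than updated in place, which also discards any messages still in it.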

Azure Service Fabric Resource Deployment Failure Exit Code -532462766

After creating a vanilla service fabric cluster through the Azure portal, I am getting this error on the VM Scale Set. It has happened using a range of different cluster names, sizes and VM types.
Full error:
{
"status": "Failed",
"error": {
"code": "ResourceDeploymentFailure",
"message": "The resource operation completed with terminal provisioning state 'Failed'.",
"details": [
{
"code": "VMExtensionHandlerNonTransientError",
"message": "Handler 'Microsoft.Azure.ServiceFabric.ServiceFabricNode' has reported failure for VM Extension 'Test_ServiceFabricNode' with terminal error code '1009' and error message: 'Enable failed for plugin (name: Microsoft.Azure.ServiceFabric.ServiceFabricNode, version 1.0.0.33) with exception Command C:\\Packages\\Plugins\\Microsoft.Azure.ServiceFabric.ServiceFabricNode\\1.0.0.33 \\ServiceFabricExtensionHandler.exe of Microsoft.Azure.ServiceFabric.ServiceFabricNode has exited with Exit code: -532462766'"
}
]
}
}
This happens before I try to publish any user code to the cluster, so I am not sure what could be causing it. I am using default settings for most things: initial capacity 3 (test cluster), Bronze durability/reliability.
EDIT:
I see this error when connecting to one of the VMs over Remote Desktop:
Application: ServiceFabricExtensionHandler.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.Azure.ServiceFabric.Extension.Core.AgentException
at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadClientCertificate(System.String, System.String)
at Microsoft.Azure.ServiceFabric.Extension.Core.CertificateUtility.LoadCertificateWrapper(Microsoft.Azure.ServiceFabric.Extension.Core.Models.CertificateSettings)
at Microsoft.Azure.ServiceFabric.Extension.Core.Models.HandlerSettings.AllowAccessToCerts()
at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.ValidateDeployment(Microsoft.Azure.ServiceFabric.Extension.Core.Models.HandlerSettings)
at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallService()
at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.InstallServiceWithRetry()
at Microsoft.Azure.ServiceFabric.Extension.Core.VMExtensionHandler.Enable()
at Microsoft.Azure.ServiceFabric.Extension.Handler.Program.Main(System.String[])
The issue is that Service Fabric can't find the certificate you have configured. There are a couple of common reasons for this:
The wrong certificate thumbprint is provided. Assuming you are using Key Vault, make sure you aren't using the ID portion of the Key Vault URL as your certificate thumbprint.
There is a hidden Unicode character at the beginning of your certificate thumbprint. This is common if you got the thumbprint from the Windows certificate viewer dialog, because copying from that dialog inserts a hidden character at the beginning. The resolution is to paste it into Notepad (or any other ASCII editor) first.
There are two ways you can troubleshoot this if you RDP to one of the VMs:
Check the Azure guest agent log at C:\WindowsAzure\Logs\WaAppAgent.log. Look for entries relating to installing the Microsoft.Azure.ServiceFabric.Extension, followed by something like "Cannot find certificate with thumbprint "xxxx" ...". Make sure the thumbprint is what you expect, and that it doesn't start with a question mark ("?xxx"), which indicates the hidden Unicode character.
Open the Service Fabric Admin event logs (eventvwr -> Applications and Services Logs -> Microsoft-ServiceFabric). You should see error entries related to failing to load a certificate.
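The hidden-character problem can also be checked programmatically before the thumbprint goes into the cluster configuration. A small Python sketch, assuming the thumbprint is a SHA-1 hash written as 40 hex characters and that the hidden character is an invisible Unicode format character such as U+200E; the function name and the sample thumbprint are made up for illustration:

```python
import re
import unicodedata


def clean_thumbprint(raw: str) -> str:
    """Strip invisible characters and whitespace, then validate as 40 hex chars."""
    # Drop control/format characters (Unicode category C*), e.g. U+200E,
    # as well as any spaces copied along with the hex pairs.
    visible = "".join(
        ch for ch in raw
        if not unicodedata.category(ch).startswith("C") and not ch.isspace()
    )
    thumbprint = visible.upper()
    if not re.fullmatch(r"[0-9A-F]{40}", thumbprint):
        raise ValueError(f"not a 40-character hex thumbprint: {thumbprint!r}")
    return thumbprint


# Made-up thumbprint with a hidden left-to-right mark pasted in front.
pasted = "\u200ea9 09 50 2d d8 2a e4 14 33 e6 f8 38 86 b0 0d 42 77 a3 2a 7b"
print(clean_thumbprint(pasted))
```

A value that still fails validation after cleaning (for example one that begins with a literal "?") is worth re-copying from the certificate rather than editing by hand.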

Azure VM Resource Deployment Failed: "The system is not authoritative for the specified account"

I have been using an Azure VM (Windows 10, Visual Studio Developer VM) for several weeks, but have been unable to log in for several hours.
The machine is reported as running, and RDP finds the machine and presents the login box, but the login fails ("Your credentials did not work").
The VM can be restarted, but the same error occurs.
Boot diagnostics shows the Windows 10 'beach cave' image
Attempts to reset the password give errors in the event log:
Failed to reset password. At least one resource deployment operation
failed. Please list deployment operations for details. see
https://aka.ms/arm-debug for usage details.
The deployment operations then show this error:
Deployment failed Deployment to resource group 'MY_AZURE_GROUP'
failed. Additional details from the underlying API that may be
helpful. At least one deployment operation failed. Please list
deployment operations for details.
Then this error expands to:
Status: Conflict
Provisioning State: Failed
Type: Microsoft.Compute/virtualMachines/extensions
StatusMessage:
{
"status": "Failed",
"error": {
"code": "ResourceDeploymentFailure",
"message": "The resource operation completed with terminal provisioning state 'Failed'.",
"details": [
{
"code": "VMExtensionProvisioningError",
"message": "VM has reported a failure when processing extension 'enablevmaccess'. Error message: \"Cannot update Remote Desktop Connection settings for built-in Administrator account. Error: The system is not authoritative for the specified account and therefore cannot complete the operation. Please retry the operation using the provider associated with this account. If this is an online provider please use the provider's online site.\r\n\"."
}
]
}
}
So I then tried redeploying the VM, which gave this error:
Failed to redeploy the virtual machine 'MY_AZURE_VM'. Error: VM has reported a failure when processing extension 'enablevmaccess'. Error message: "Cannot update Remote Desktop Connection settings for built-in Administrator account. Error: The system is not authoritative for the specified account and therefore cannot complete the operation. Please retry the operation using the provider associated with this account. If this is an online provider please use the provider's online site.
The message "The system is not authoritative for the specified account" hints at some permissions failure somewhere.
What does this mean - and how can I fix it?
Turns out the answer was not obvious and is still a little perplexing.
On first use, Cortana had asked for a Microsoft account, so I had supplied details of one I rarely use (let's call it rarely.used@domain.com). In the background, Windows had changed MY_AZURE_VM\MyLogin (my only login, and the admin user on that VM) to the Microsoft account rarely.used@domain.com!
So now I log in with that Microsoft account, and all is well.
If I look under Users in Computer Management, MyLogin still exists as the only user on the system, but if I try adding it to a group, Check Names converts it to rarely.used@domain.com.
