Versioning the infrastructure Continuous Deployment - azure

We are trying to establish continuous deployment, and we want to do network swaps when deploying new resources.
Sometimes the Azure infrastructure is unresponsive while provisioning resources, and a failed deployment can leave behind resources that conflict with the next attempt.
Question: Should we give our resources unique names on every deployment, for example by including the build number in each resource name?
PS: I have noticed that the Azure team already does this, since they have to host a multi-tenant architecture.
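To make the naming idea concrete, here is a minimal Python sketch of a build-stamped resource name. The function name `versioned_name`, the `b` prefix before the build number, and the 24-character default are all assumptions for illustration. The default mirrors the storage account limit (3 to 24 lowercase letters and digits), which is one of the stricter Azure naming rules; other resource types allow more.

```python
import re


def versioned_name(base: str, build_number: str, max_len: int = 24) -> str:
    """Return a lowercase alphanumeric name that embeds the build number.

    Hypothetical helper: the 'b' separator and the 24-character default
    are illustrative choices, not an Azure convention.
    """
    # Keep only lowercase letters and digits so the result is valid even
    # for resource types with strict naming rules.
    clean_base = re.sub(r"[^a-z0-9]", "", base.lower())
    clean_build = re.sub(r"[^a-z0-9]", "", build_number.lower())
    if not clean_base or not clean_build:
        raise ValueError("base and build_number must contain letters or digits")

    suffix = "b" + clean_build
    # Truncate the base, never the build suffix, so names stay unique per build.
    room = max_len - len(suffix)
    if room < 1:
        raise ValueError("build number is too long for max_len")
    return clean_base[:room] + suffix


print(versioned_name("Orders-API", "20240131.4"))  # ordersapib202401314
```

Because the build number is part of the name, a half-provisioned resource from a failed run never collides with the next run; the trade-off is that something has to clean up the leftovers from old builds.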

ARM is designed to be idempotent, which means you should be able to redeploy any template at any time. When doing infrastructure as code, you probably provision all resources in a resource group at the same time. If you run the deployments in complete mode, resources that exist in the resource group but are no longer defined in the template are removed. Here is an article on versioning ARM templates that can help you:
versioning-arm-template-deployments

Related

Is there a maximum limit on Workspaces in a Terraform Cloud Organization?

I am building a small serverless application on AWS. It is a SaaS for business purposes, so I am looking at ways to cater for multi-tenancy.
So far my proofs of concept have been single-tenant and deployed via Terraform.
I am thinking of using the Terraform Cloud Workspace API to create a workspace for each tenant on sign-up. The workspaces would be configured to auto-apply from my production GitHub branch.
I'm concerned that this isn't the intended usage of Terraform Cloud and that I may run into issues as the application scales.
Does anyone have any insight into the upper limits of Terraform Cloud? I have read through some of HashiCorp's documentation but I can't find anything specific to this.
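For reference, the per-tenant workspace creation described above could look roughly like the Python sketch below. It only builds the HTTP request and does not send it. The endpoint path and the `auto-apply` and `vcs-repo` attributes follow my reading of the Terraform Cloud Workspaces API, so verify them against the current documentation. The function name, the `tenant-<id>` naming scheme, and every argument value are hypothetical.

```python
import json
import urllib.request

TFC_API = "https://app.terraform.io/api/v2"


def build_workspace_request(org, tenant_id, token, repo, oauth_token_id, branch):
    """Build (but do not send) a request that creates one workspace per tenant.

    Hypothetical helper: the 'tenant-<id>' naming scheme is an assumption.
    """
    payload = {
        "data": {
            "type": "workspaces",
            "attributes": {
                "name": f"tenant-{tenant_id}",
                # Apply automatically after a successful plan.
                "auto-apply": True,
                # Link the workspace to a branch of a VCS repository.
                "vcs-repo": {
                    "identifier": repo,
                    "oauth-token-id": oauth_token_id,
                    "branch": branch,
                },
            },
        }
    }
    return urllib.request.Request(
        url=f"{TFC_API}/organizations/{org}/workspaces",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.api+json",
        },
        method="POST",
    )


# All values below are placeholders.
req = build_workspace_request(
    "my-org", "acme", "TOKEN", "my-org/my-saas", "ot-PLACEHOLDER", "production"
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would actually create the workspace. At scale, the sign-up path would also need to handle rate limiting and name collisions, which this sketch ignores.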

What features in Azure services cannot be scripted in Terraform or require embedding ARM in Terraform?

When working with Terraform, what features of Azure services are there that cannot be scripted in Terraform or require embedding ARM?
Currently, there is no resource for creating a Data Sync Group in Azure using Terraform.
An ARM template configures the Azure PaaS resources to send their diagnostic data to Log Analytics. There is no functionality for this in Terraform when used with Azure.
Azure supports zone-to-zone disaster recovery for VMs, but Terraform only provides single instance and target availability set options in Azure Site Recovery.
Almost all the new features added in Azure cannot be created using Terraform.
The landscape for both Azure and Terraform is constantly changing, so it would not make much sense to list what is supported/not supported in a Stack Overflow context.
I have been working with Terraform in Azure for more than 5 years, and the AzureRM provider is updated almost biweekly. In general, it is very much up to date: new resources and data sources are added constantly, existing components get functional updates, and the provider follows changes in the Azure API. This provider rocks!
Take a look at the changelog here to get an overview of the intense activity on the AzureRM provider: https://github.com/hashicorp/terraform-provider-azurerm/blob/main/CHANGELOG.md
Instead of asking what is not supported, I suggest you take a look at the landscape you want to create and see if the components exist in the documentation, which is very good IMO. I think that the latest AzureRM provider (2.91.0) has more than 950 resources and data sources.
Documentation: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs
Usually, when something does not exist there will be an issue in the GitHub repo. E.g. to follow the example that @RamaroAdapa-MT mentions, you can find the issue here (by a simple Google search):
https://github.com/hashicorp/terraform-provider-azurerm/issues/6425
Looking through that, you can see that the reason is actually not a Terraform AzureRM provider issue, but a lack of support in the Azure API:
https://github.com/Azure/azure-rest-api-specs/issues/11061

Complete mode deployment in Azure DevOps for ADF

I tried to deploy an ARM template for Azure Data Factory as part of a DevOps implementation.
The deployment mode was set to Complete in order to clean up the existing ADF instance and deploy only the pipelines that are available in the ARM template.
When I ran the deployment, it failed with this error:
##[error]The deployment failed because some resources could not be deleted. This might be due to not having permissions to delete resources in the targeted scope. Please see https://aka.ms/arm-debug for usage details.
2020-11-02T05:33:34.5795133Z ##[error]Check out the troubleshooting guide to see if your issue is addressed: https://learn.microsoft.com/en-us/azure/devops/pipelines/tasks/deploy/azure-resource-group-deployment?view=azure-devops#troubleshooting
When I debugged this issue, I understood that the deployment scope is the resource group. The deployment task tried to delete all the resources in this resource group and failed because, due to access restrictions, it could not delete any resources other than the ADF instance.
Since I do not have access to the other resources, they were not deleted. Otherwise I could have messed up everything by deleting all the other resources like ADLS, Databricks, SQL....
Since I am deploying an ADF ARM template, is there any way to restrict the deployment scope to the ADF instance only, so that other resources are not affected?
Any leads appreciated!
I have a query about the What-If feature provided by azure for ARM template deployment. Can we use this in our Release pipeline as a powershell task?
You cannot restrict the deployment scope to anything narrower than the resource group, as you noticed.
One way to do this would be to put the ADF in a separate resource group, but I assume that is not possible.
A second way would be to delete the ADF through the portal or PowerShell and then do an incremental deployment of your ARM template that has only the ADF definition in it.
Microsoft rolled out a new feature for ARM deployments called What-If. This is a super nice feature for checking which changes will happen to which resources when you deploy your template. Note, it only works with PowerShell Core at the moment. If you work with ARM templates, this could help you catch resource deletion before you deploy anything.
When deploying a data factory, do not select Complete as your deployment mode. Complete mode deletes every resource in the resource group that is not defined in the template, including non-ADF entities.
Selecting Incremental will deploy only the resources located in the ARM template and leave everything else in the resource group untouched. If the template is generated from the adf_publish branch, it will contain all of the resources in your factory.
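The difference between the two modes is easy to see in a toy model. The Python sketch below is not ARM and does not call Azure: `plan_deployment` is a hypothetical function that treats a resource group and a template as sets of resource names and reports what each mode would do. It ignores details of the real service, such as resource types that complete mode does not delete, locks, and permissions.

```python
def plan_deployment(existing, template, mode="incremental"):
    """Toy model of ARM deployment modes over sets of resource names.

    Hypothetical illustration only; real ARM behavior has more cases.
    """
    if mode not in ("incremental", "complete"):
        raise ValueError("mode must be 'incremental' or 'complete'")
    existing, template = set(existing), set(template)
    not_in_template = existing - template
    return {
        # In the template but not yet in the resource group.
        "create": sorted(template - existing),
        # In both: redeployed with the template's definition.
        "redeploy": sorted(template & existing),
        # Complete mode removes whatever the template does not define.
        "delete": sorted(not_in_template) if mode == "complete" else [],
        # Incremental mode leaves those resources alone.
        "keep": sorted(not_in_template) if mode == "incremental" else [],
    }


group = ["adf-main", "adls-raw", "databricks-ws", "sql-dw"]
print(plan_deployment(group, ["adf-main"], mode="complete")["delete"])
# ['adls-raw', 'databricks-ws', 'sql-dw']
print(plan_deployment(group, ["adf-main"], mode="incremental")["delete"])
# []
```

With an ADF-only template, the complete-mode plan lists every non-ADF resource under "delete", which is exactly what the failed pipeline run in the question was attempting.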

Extra Resources Created In Azure For VM

When I create a VM in Azure, it is creating an accompanying Cloud Service and Network Resource. I found that the Cloud Service is there as a deployment layer. I have not found why the Network Interface is there.
Since this particular circumstance is not going to have a deployment associated with it as it is used as an Elasticsearch server, I technically will not be needing the Cloud Service. However, when I delete the service, it takes the VM with it even though I do not expressly select it for delete.
My two specific questions:
1st - Why is a Cloud Service created, and why can it not be deleted without repercussions, when no deployment is necessary?
2nd - Why is the Network Interface created and not able to be deleted without repercussions?
Both questions are with the understanding that this is an Elasticsearch VM.
A cloud service is a required artefact of an ASM/classic deployment of a VM. It is not needed in an Azure Resource Manager deployment, which is what you should use for new deployments. However, the two types of deployment are orthogonal, so you may need to keep using ASM if you already have VMs deployed that way. If so, you should consider migrating them to ARM.

Azure Cloud Service Deployment

I am having an issue deploying to the Staging environment of my Windows Azure Cloud Service.
This is something I do frequently without issue before doing a swap to Production (once I have validated everything is OK in Staging). Today for some reason I am getting this error when trying to deploy:
Allocation failed; unable to satisfy constraints in request. The requested new service deployment is bound to an Affinity Group, or it targets a Virtual Network, or there is an existing deployment under this hosted service. Any of these conditions constrains the new deployment to specific Azure resources. Please retry later or try reducing the VM size or number of role instances. Alternatively, if possible, remove the aforementioned constraints or try deploying to a different region. The long running operation tracking ID was: da5cc14aaba6228683cb4e8888b835e1.
Seeing as my deployment package has not changed since the last time I successfully updated my Staging environment (apart from one line of code for a bug fix), I can't see this being an issue with my package. I am hoping this is a transient Azure environment issue. Does anyone have any ideas as to what this may be?
There is a fragmentation issue in the cluster you are trying to deploy to. The ops team is engaged and working to resolve and you should be able to deploy again later tonight or tomorrow.
Some additional information:
Once you create a deployment (in either the prod or the staging slot) in a cloud service, your entire cloud service (both prod and staging slots) is pinned to a cluster of machines (there are some Mark Russinovich fabric videos with more details if you are interested). So if there is a problem in a cluster, or you try to deploy a VM size that is not available in the cluster, such as the new D series machines, then the deployment may fail because that specific cluster can't allocate the request. To resolve this you can deploy to a brand new cloud service, which allows the fabric to check all clusters in that datacenter/region to satisfy the allocation request.
Consider a different upgrade strategy for scenarios like this. A lot of services will upgrade by creating a new deployment in a new cloud service, thus getting a new URL and IP address, and then modify the CNAME or A Record in order to transition clients to the new service.
If you see this issue again you can usually get a fast resolution by opening a support incident - http://azure.microsoft.com/en-us/support/options/
Update: We have a new blog post that describes this scenario and the common causes - http://azure.microsoft.com/blog/2015/03/19/allocation-failure-and-remediation/.
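The upgrade strategy described above (deploy into a fresh cloud service, then repoint DNS) can be sketched in Python as follows. Everything here is hypothetical: `deploy` and `update_cname` stand in for whatever tooling you actually use, and `AllocationFailed` is an illustrative exception, not a type from any Azure SDK.

```python
class AllocationFailed(Exception):
    """Illustrative error raised when the pinned cluster cannot allocate."""


def upgrade(deploy, update_cname, current_service, new_service):
    """Deploy in place if possible; otherwise use a new service and swap DNS.

    deploy(name) is assumed to return the endpoint of the deployed service
    or raise AllocationFailed. update_cname(endpoint) is assumed to repoint
    the public DNS record. Both are hypothetical callables.
    """
    try:
        # Normal path: the existing service is pinned to a cluster that
        # can still satisfy the request.
        deploy(current_service)
        return current_service
    except AllocationFailed:
        # A brand new service is not pinned yet, so any cluster in the
        # region may be chosen to satisfy the allocation.
        endpoint = deploy(new_service)
        # Clients follow the DNS record to the new URL and IP address.
        update_cname(endpoint)
        return new_service
```

The caller picks `new_service`, for example by appending a build number to the service name. The old service still has to be deleted once DNS caches have expired, which this sketch leaves out.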
We had a similar allocation failed issue recently deploying our Azure Cloud Service:
Azure Cloud Service Deployment Error
Allocation failed; unable to satisfy constraints in request. The requested new service deployment is bound to an Affinity Group, or it targets a Virtual Network, or there is an existing deployment under this hosted service. Any of these conditions constrains the new deployment to specific Azure resources. Please retry later or try reducing the VM size or number of role instances. Alternatively, if possible, remove the aforementioned constraints or try deploying to a different region.
Allocation Failed - Resolution
1. Delete the existing cloud service.
2. Create a new cloud service that targets a different data center or resource group (the SSL certificates must be uploaded again).
3. Redeploy the cloud service package.
4. Relink the VSO team projects.
I suspect the issue has something to do with a corrupt resource group or recent Azure upgrades that were not backwards compatible with older resource groups.
