Layered deployments with Terraform

I am new to Terraform, so I'm not even sure something like this is possible. As an example, let's say I have a template that deploys an Azure resource group and a key vault in it. And then let's say I have another template that deploys a virtual machine into the same resource group. Is it possible to run a destroy with the virtual machine template without destroying the key vault and resource group? We are trying to compartmentalize the parts of a large solution without having to put it all in a single template, and we want to be able to manage each piece separately without affecting the other pieces.
On a related note: we are storing state files in an Azure storage account. If we break up our deployment into multiple compartmentalized deployments, should each deployment have its own state file, or should they all use the same state file?

For larger systems it is common to split infrastructure across multiple separate configurations and apply each of them separately. This is a separate idea from (and complementary to) using shared modules: modules allow a number of different configurations to have their own separate "copy" of a particular set of infrastructure, while the patterns described below allow an object managed by one configuration to be passed by reference to another.
If some configurations will depend on the results of other configurations, it's necessary to store these results in some data store that can be written to by its producer and read from by its consumer. In an environment where the Terraform state is stored remotely and readable broadly, the terraform_remote_state data source is a common way to get started:
data "terraform_remote_state" "resource_group" {
# The settings here should match the "backend" settings in the
# configuration that manages the network resources.
backend = "s3"
config {
bucket = "mycompany-terraform-states"
region = "us-east-1"
key = "azure-resource-group/terraform.tfstate"
}
}
resource "azurerm_virtual_machine" "example" {
resource_group_name = "${data.terraform_remote_state.resource_group.resource_group_name}"
# ... etc ...
}
The resource_group_name attribute exported by the terraform_remote_state data source in this example assumes that the configuration managing the resource group exposed a value of that name using an output.
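For example, that producing configuration might declare something like the following (the resource name azurerm_resource_group.example is an assumption here):

output "resource_group_name" {
  value = "${azurerm_resource_group.example.name}"
}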
This decouples the two configurations so that they have entirely separate lifecycles. You first terraform apply in the configuration that creates the resource group, and then terraform apply in the configuration that contains the terraform_remote_state data resource shown above. You can then apply (or destroy) that latter configuration as many times as you like without any risk to the shared resource group or key vault.
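Since the question mentions storing state in an Azure storage account, the same data source could instead point at the azurerm backend. This is only a rough sketch: the storage account and container names are placeholders, and the authentication settings (access_key, sas_token, or service principal credentials) depend on your setup.

data "terraform_remote_state" "resource_group" {
  backend = "azurerm"

  config {
    storage_account_name = "mycompanytfstates" # placeholder
    container_name       = "tfstate"           # placeholder
    key                  = "azure-resource-group/terraform.tfstate"
    access_key           = "${var.state_access_key}" # or sas_token, etc.
  }
}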
While the terraform_remote_state data source is quick to get started with for any organization already using remote state (which is recommended), some organizations prefer to decouple configurations further by introducing an intermediate data store like Consul, which then allows data to be passed between configurations more explicitly.
To do this, the "producing" configuration (the one that manages your resource group) publishes the necessary information about what it created into Consul at a well-known location, using the consul_key_prefix resource:
resource "consul_key_prefix" "example" {
path_prefix = "shared/resource_group/"
subkeys = {
name = "${azurerm_resource_group.example.name}"
id = "${azurerm_resource_group.example.id}"
}
resource "consul_key_prefix" "example" {
path_prefix = "shared/key_vault/"
subkeys = {
name = "${azurerm_key_vault.example.name}"
id = "${azurerm_key_vault.example.id}"
uri = "${azurerm_key_vault.example.uri}"
}
}
The separate configuration(s) that use the centrally-managed resource group and key vault would then read these values using the consul_keys data source:
data "consul_keys" "example" {
key {
name = "resource_group_name"
path = "shared/resource_group/name"
}
key {
name = "key_vault_name"
path = "shared/key_vault/name"
}
key {
name = "key_vault_uri"
path = "shared/key_vault/uri"
}
}
resource "azurerm_virtual_machine" "example" {
resource_group_name = "${data.consul_keys.example.var.resource_group_name}"
# ... etc ...
}
In return for the additional complexity of running another service to store these intermediate values, the two configurations now know nothing about each other apart from the agreed-upon naming scheme for keys within Consul. That gives you flexibility if, for example, you later decide to refactor these Terraform configurations so that the key vault has its own separate configuration too. Using a generic data store like Consul also potentially makes this data available to the applications themselves, e.g. via consul-template.
Consul is just one example of a data store that happens to already be well-supported in Terraform. It's also possible to achieve similar results using any other data store that Terraform can both read and write. For example, you could even store values in TXT records in a DNS zone and use the DNS provider to read them back, as an "outside the box" solution that avoids running an additional service.
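As a minimal sketch of the reading side of that DNS idea (the record name below is hypothetical), the dns provider's TXT record data source could be used:

data "dns_txt_record_set" "resource_group_name" {
  host = "resource-group-name.terraform.example.com" # hypothetical TXT record
}

# data.dns_txt_record_set.resource_group_name.record then exposes the first TXT value.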
As usual, there is a tradeoff to be made here between simplicity (with "everything in one configuration" being the simplest possible) and flexibility (with a separate configuration store), so you'll need to evaluate which of these approaches is the best fit for your situation.
As some additional context: I've documented a pattern I used successfully for a moderate-complexity system. In that case we used a mixture of Consul and DNS to create an "environment" abstraction that allowed us to deploy the same applications separately for a staging environment, production, etc. The exact technologies used are less important than the pattern, though. That approach won't apply exactly to all other situations, but hopefully there are some ideas in there to help others think about how to best make use of Terraform in their environment.

You can destroy specific resources using terraform destroy -target=RESOURCE_ADDRESS (see the docs for the -target option).
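For example, to destroy just the virtual machine from the earlier example (resource address assumed):

terraform destroy -target=azurerm_virtual_machine.example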
Different parts of a large solution can be split up into modules; these modules do not even have to be part of the same codebase and can be referenced remotely. Depending on your solution, you may want to break up your deployments into modules and reference them from a "master" configuration whose state file covers everything.
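A small sketch of referencing a module remotely; the repository URL and the module input are hypothetical, the git:: source syntax with a // subdirectory and ?ref= version pin is standard Terraform:

module "key_vault" {
  source = "git::https://example.com/terraform-modules.git//key_vault?ref=v1.0.0"

  resource_group_name = "${azurerm_resource_group.example.name}"
}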

Related

Terraform AzureRM Continually Modifying API Management with Proxy Configuration for Default Endpoint

We are terraforming our Azure API Management instance.
...
resource "azurerm_api_management" "apim" {
name = "the-apim"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
...
hostname_configuration {
proxy {
host_name = "the-apim.azure-api.net"
negotiate_client_certificate = true
}
}
}
...
We need to include the hostname_configuration block so that we can switch negotiate_client_certificate to true for the default endpoint.
This does the job; however, every time Terraform runs, it plans to modify the APIM instance by adding the hostname_configuration block again:
  + hostname_configuration {
      + proxy {
          + host_name                    = "the-apim.azure-api.net"
          + negotiate_client_certificate = true
        }
    }
Is there a way to prevent this from happening? In the portal I can see this value is set to true.
I suggest you try pairing this with lifecycle > ignore_changes.
The ignore_changes feature is intended to be used when a resource is created with references to data that may change in the future, but should not affect said resource after its creation. In some rare cases, settings of a remote object are modified by processes outside of Terraform, which Terraform would then attempt to "fix" on the next run. In order to make Terraform share management responsibilities of a single object with a separate process, the ignore_changes meta-argument specifies resource attributes that Terraform should ignore when planning updates to the associated remote object.
In your case, hostname_configuration is considered a "nested block" or "attribute as block" in Terraform, so the usage of ignore_changes is not so straightforward (you can't just add the property name, as you would if you wanted to ignore changes to resource_group_name, for example, which is directly a property). From a GitHub issue back in 2018, it seems you can use the TypeSet hash of the nested block in the ignore_changes section.
Even though I can't test this, my suggestion is:
deploy your azurerm_api_management resource normally with the hostname_configuration block
check the state file for your resource and get the TypeSet hash of the hostname_configuration part; it should look similar to hostname_configuration.XXXXXX
add an ignore_changes section passing the above
resource "azurerm_api_management" "apim" {
# ...
lifecycle {
ignore_changes = [
"hostname_configuration.XXXXXX",
]
}
}
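On Terraform 0.12 or later it may also be possible to skip the TypeSet hash and ignore the whole nested block by name; a sketch, not verified against this particular provider version:

resource "azurerm_api_management" "apim" {
  # ...

  lifecycle {
    ignore_changes = [
      hostname_configuration, # ignore the entire nested block
    ]
  }
}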
Sometimes such issues occur due to bugs in the provider: it may not be storing this block's configuration in the state file, or it may not be retrieving the stored state for this block. Try upgrading to the latest available provider version and see if that sorts the issue.
If that does not solve it, you can try defining this configuration as a separate resource. As per the terraform documentation: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/api_management
It's possible to define Custom Domains both within the azurerm_api_management resource via the hostname_configurations block and by using the azurerm_api_management_custom_domain resource. However it's not possible to use both methods to manage Custom Domains within an API Management Service, since there'll be conflicts.
So please try removing the hostname_configuration block and adding it as a separate resource as per this documentation: https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/api_management_custom_domain
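A rough sketch of that separate resource, assuming the provider version in use still names the block proxy (newer provider versions may name it differently, so check the linked docs for the exact schema):

resource "azurerm_api_management_custom_domain" "example" {
  api_management_id = azurerm_api_management.apim.id

  proxy {
    host_name                    = "the-apim.azure-api.net"
    negotiate_client_certificate = true
  }
}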
This will most likely fix the issue.

Terraform Import - Is the Resource Label critical?

Our Terraform remote state file in Azure has been completely destroyed and we're now faced with the challenge of recreating the state file from scratch. My plan is to use the terraform import command, using the following simple syntax:
terraform import <Terraform Resource Name>.<Resource Label> <Azure Resource ID>
To import the existing resource group for example, I will create the following configuration in a main.tf file.
provider "azurerm" {
version="1.39.0"
}
# create resource group
resource "azurerm_resource_group" "rg"{
name = "rg-terraform"
location = "uksouth"
}
Now, the problem I have is as follows:
When the existing Azure resources were originally created, they were assigned names that used an extremely complex naming convention, with some characters even being randomly generated. To further compound matters, they were all unique and there are hundreds of them. All would have been rosy if they were assigned a simplistic name like "main", as is used commonly in a lot of Terraform examples, but unfortunately, that's not the case.
My question therefore, is this:
When putting together my main.tf configuration file to be used for the Import, is it an absolute requirement that my "Resource Label" (given in my Import command) has to match the original "Resource Label" name from when the resource was created?
If it is a mandatory requirement, is there any way I could retrieve the original "Resource Label" from Azure in the same way that I can for instance obtain the "Azure Resource ID" from the Azure Portal or even an Az CLI query?
How can I ensure any child resources such as Subnets are included in the Import, without having to trawl manually through the Azure Portal to identify each one of them?
No, absolutely not. Choose whatever you want.
No, Azure generally does not know about this label; it is internal to Terraform.
Unfortunately you need to import each and every resource manually and separately.
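For child resources such as subnets, each one needs its own import command. A hedged example, with placeholder segments in the Azure resource ID and a resource label chosen freely (per point 1):

terraform import azurerm_subnet.example /subscriptions/SUBSCRIPTION_ID/resourceGroups/RG_NAME/providers/Microsoft.Network/virtualNetworks/VNET_NAME/subnets/SUBNET_NAME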
Have you made absolutely sure the current state file is lost? Was the storage location not versioned? Does no developer still have a local copy of the state file lying around?

Terraform resource with the ID already exists

When the Terraform run task executes in an Azure DevOps release pipeline, I get the error "A resource with the ID already exists".
The resource exists in Azure, so why is it complaining about the resource if it already exists? It should just ignore this part. Please help me figure out what I need to add to my code to fix this error!
Am I just using this buggy Terraform tool wrong for deploying Azure resources? The Terraform help is terrible!!!
resource "azurerm_resource_group" "test_project" {
name = "${var.project_name}-${var.environment}-rg"
location = "${var.location}"
tags = {
application = "${var.project_name}"
}
}
Terraform is designed to allow you to manage only a subset of your infrastructure with a particular Terraform configuration, in case either some objects are managed by another tool or in case you've decomposed your infrastructure to be managed by many separate configurations that cooperate to produce the desired result.
As part of that design, Terraform makes a distinction between an object existing in the remote system and that object being managed by the current Terraform configuration. Where technical constraints of an underlying API allow it, Terraform providers will avoid implicitly taking ownership of something that was not created by that specific Terraform configuration. The error message you saw here is the Azure provider's implementation of that, where it pre-checks to make sure the name you give it is unique so that it won't overwrite (and thus take implicit ownership of) an object created elsewhere.
To proceed here you have two main options, depending on your intended goal:
If this object was formerly managed by some other system and you now want to manage it exclusively with this Terraform configuration, you can tell Terraform to associate the existing object with the resource block you've written and thus behave as if that object were originally created by that resource block:
terraform import azurerm_resource_group.test_project /subscriptions/YOUR-SUBSCRIPTION-ID/resourceGroups/PROJECTNAME-ENVIRONMENTNAME-rg
After you run terraform import you must ensure that whatever was previously managing that object will no longer associate with it. This object is now owned by this Terraform configuration and must not be changed by any other system.
If this object is managed by some other system and you wish to continue managing it that way then you can instead use a data block to retrieve information about that existing object to use elsewhere in your configuration without Terraform taking ownership:
data "azurerm_resource_group" "example" {
name = "${var.project_name}-${var.environment}-rg"
}
If you needed the resource group's location name elsewhere in your module, for example, you could use data.azurerm_resource_group.example.location to access it. If you wanted to make any later changes to this resource group, you would continue to do that using whichever other system is considered the owner of it in your environment.
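As a hedged illustration of that usage (the storage account here is just an assumed example resource, not something from the original configuration):

resource "azurerm_storage_account" "example" {
  name                     = "examplestorageacct" # hypothetical name
  resource_group_name      = "${data.azurerm_resource_group.example.name}"
  location                 = "${data.azurerm_resource_group.example.location}"
  account_tier             = "Standard"
  account_replication_type = "LRS"
}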
The main difference between these two approaches is how Terraform will record the object in state snapshots. terraform import causes Terraform to create a binding between the resource configuration you wrote and the remote object whose id you gave on the command line, which is henceforth indistinguishable to Terraform from it having created that object and recorded the binding itself in the first place. For a data resource, Terraform just reads the data about the existing object and saves a cache of it in the state so it can determine if the value has changed on a future run; it will never plan to make any modifications to an object used with a data block.
Try to delete the .terraform local folder to clean the cache, then run terraform init again and retry running the pipeline.
For my future self:
Today I stumbled across this same problem, because I renamed some resources, and terraform could not track them. I found out about terraform state mv ... which gives you the ability to rename resources in your state file, so that it can track remote resources. Really useful.
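For example (resource addresses assumed), renaming a tracked resource in state looks like this:

terraform state mv azurerm_resource_group.old_name azurerm_resource_group.new_name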

How terraform know which resource should it run first to spin up infrastructure?

I'm using Terraform to spin up AWS DMS. To spin up DMS, we need subnet groups, a DMS replication task, DMS endpoints, and a DMS replication instance. I've configured everything using the Terraform documentation. My question is: how will Terraform know which task has to be completed first in order to spin up the dependent resources?
Do we need to declare it somewhere in Terraform, or is Terraform intelligent enough to run things in the right order?
Terraform uses references in the configuration to infer ordering.
Consider the following example:
resource "aws_s3_bucket" "example" {
bucket = "terraform-dependencies-example"
acl = "private"
}
resource "aws_s3_bucket_object" "example" {
bucket = aws_s3_bucket.example.bucket # reference to aws_s3_bucket.example
key = "example"
content = "example"
}
In the above example, the aws_s3_bucket_object.example resource contains an expression that refers to aws_s3_bucket.example.bucket, and so Terraform can infer that aws_s3_bucket.example must be created before aws_s3_bucket_object.example.
These implicit dependencies created by references are the primary way to create ordering in Terraform. In some rare circumstances we need to represent dependencies that cannot be inferred by expressions, and so for those exceptional circumstances only we can add additional explicit dependencies using the depends_on meta argument.
One situation where that can occur is with AWS IAM policies, where the dependency graph created naturally by references leaves out an ordering we actually need.
Due to AWS IAM's data model, we must first create a role and then assign a policy to it as a separate step, but the objects assuming that role (in this case, an AWS Lambda function just for example) only take a reference to the role itself, not to the policy. With the dependencies created implicitly by references then, the Lambda function could potentially be created before its role has the access it needs, causing errors if the function tries to take any actions before the policy is assigned.
To address this, we can use depends_on in the aws_lambda_function resource block to force that extra dependency and thus create the correct execution order:
resource "aws_iam_role" "example" {
# ...
}
resource "aws_iam_role_policy" "example" {
# ...
}
resource "aws_lambda_function" "exmaple" {
depends_on = [aws_iam_role_policy.example]
}
For more information on resource dependencies in Terraform, see Resource Dependencies in the Terraform documentation.
Terraform will automatically create the resources in an order that all dependencies can be fulfilled.
E.g.: if you set a security group id in your DMS definition as "${aws_security_group.my_sg.id}", Terraform recognizes this dependency and creates the security group prior to the DMS resource.
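A hedged sketch of that implicit dependency for DMS; the resource names and instance class are assumptions, and this is not a complete replication setup:

resource "aws_security_group" "dms" {
  name = "dms-example"
  # ...
}

resource "aws_dms_replication_instance" "example" {
  replication_instance_id    = "example-dms"
  replication_instance_class = "dms.t3.micro"

  # Referencing the security group creates the implicit dependency,
  # so Terraform creates the security group before this instance.
  vpc_security_group_ids = ["${aws_security_group.dms.id}"]
}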

Why do some resources have a name and a "name" attribute?

I am new to Terraform and trying to create some resources on Azure. To me it looks like there is some unnecessary duplication between the resource name and the attribute name in the definitions.
resource "azurerm_resource_group" "group_name" {
name = "group_name" # <-- repeated!
location = "${local.location}"
}
Is there a difference? Can I somehow set them to be the same in the spirit of this:
resource "azurerm_resource_group" "group_name" {
name = "${name}"
location = "${local.location}"
}
The two names here serve different purposes and have different scopes.
The name that appears in the block header is a local name used within a single Terraform module. It is useful when interpolating results from one resource into another, like ${azurerm_resource_group.group_name.name}. The remote API never sees this name; it is used only for internal references.
The name within the block is an attribute specific to the resource type itself -- azurerm_resource_group in this case. This name will be sent to the remote API and will be how the object is described within the AzureRM system itself.
In simple configurations within small organizations it is indeed possible that both of these names could match. In practice, the difference in scope between these names causes them to often vary. For example:
If there are multiple separate teams or applications sharing an AzureRM account, the name used with the API may need to be prefixed to avoid collisions with names created by other teams or applications, while the local name needs to be unique only within the module where it's defined.
In more complex usage with child modules, it's common to instantiate the same child module multiple times. In this case, the local name will be the same across all of the instances (because it's significant only within that instance), but the name used with the API will need to be adjusted for each instance so that they don't collide, as sketched below.
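A small sketch of that second case, assuming a hypothetical ./modules/app child module that takes a resource_group_name input:

module "app_staging" {
  source              = "./modules/app"
  resource_group_name = "myapp-staging-rg"
}

module "app_production" {
  source              = "./modules/app"
  resource_group_name = "myapp-production-rg"
}

# Inside the module, the local name stays the same for both instances:
# resource "azurerm_resource_group" "main" {
#   name     = "${var.resource_group_name}"
#   location = "${var.location}"
# }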
The resource name is the name you use to refer to the resource in the Terraform context. The name parameter is the name given to the resource inside your provider's context. Resources don't have to have a name parameter; for example, the AWS Elastic IP resource doesn't have a name because AWS doesn't allow you to name them. Some resources, like the AWS security group rule, don't even translate one-to-one to resources you can name.
