Terraform: when would you manually change the state?

According to the docs:
As your Terraform usage becomes more advanced, there are some cases where you may need to modify the Terraform state.
Under what circumstances would you want to directly change terraform's state?
It seems like a very dangerous practice compared to changing the Terraform code itself.

You are correct that modifying the state can be dangerous: you could corrupt the state file, or cause Terraform to do things you don't want as the state drifts away from the actual state of the provider it is operating against.
However, there are times when you may want to modify the state, such as adding resources that were created outside of this state file (either created outside of Terraform entirely or just managed in a different state file) using the terraform import command, or renaming resources in your Terraform configuration using the terraform state commands.
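For example, bringing an EC2 instance that was created by hand under Terraform's control might look like this (the resource address and instance ID are placeholders for illustration):
terraform import aws_instance.web i-0123456789abcdef0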
For the rename case, if you start off by defining a resource directly with something like:
variable "ami_name" {
default = "ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"
}
variable "ami_owner" {
default = "099720109477" # Canonical
}
data "aws_ami" "ubuntu" {
most_recent = true
filter {
name = "name"
values = [var.ami_name]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
owners = [var.ami_owner]
}
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t2.micro"
tags {
Name = "HelloWorld"
}
}
And then you later decide to refactor this to a module so that others can call it with something like:
module "instance" {
ami_name = "my-image-name-*"
ami_owner = "123456789"
}
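The refactored module itself contains the same variable, data, and resource blocks shown above, moved into their own directory. A possible layout (the paths are assumptions matching the source argument above):
.
├── main.tf
└── modules
    └── instance
        └── main.tf   # the variable, data "aws_ami", and aws_instance blocks from above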
When you run a plan after this refactoring, Terraform will tell you that it wants to remove the aws_instance.web resource and create a new resource with the same parameters at the address module.instance.aws_instance.web.
If you want to do this without an outage while Terraform destroys the old resource and creates the new one, you can move the resource to its new address in the state with:
terraform state mv aws_instance.web module.instance.aws_instance.web
If you then run a plan it will show no changes, completing your refactoring without any impact on your deployed instance.

Related

How to change the name of a Terraform resource in the tf file without Terraform registering a change in the plan?

Today I imported my cloud instance into Terraform:
resource "linode_domain" "example_domain" {
domain = var.primary_domain_name
soa_email = var.domain_soa_email
type = "master"
}
After I imported the instance using the terraform import command, I realized I was supposed to name example_domain as primary_domain.
Now if I change example_domain to primary_domain directly in the tf file, terraform plan registers a change, which I do not want. How can I rename this resource?
Probably the easiest way to rename a resource is to make a copy of the resource block with the new name and then run a terraform state mv command.
So, in your case, we duplicate the resource with a new name:
resource "linode_domain" "example_domain" {
domain = var.primary_domain_name
soa_email = var.domain_soa_email
type = "master"
}
resource "linode_domain" "primary_domain" {
domain = var.primary_domain_name
soa_email = var.domain_soa_email
type = "master"
}
We run state mv:
terraform state mv linode_domain.example_domain linode_domain.primary_domain
We remove the old (example_domain) resource block from the code.
The procedure above is also documented in the official Terraform docs for state mv.
If you are using Terraform v1.1 or later then you can use the new config-based refactoring features, which specifically means a moved block in this case:
resource "linode_domain" "primary_domain" {
domain = var.primary_domain_name
soa_email = var.domain_soa_email
type = "master"
}
moved {
from = linode_domain.example_domain
to = linode_domain.primary_domain
}
The above configuration tells Terraform that if it finds an object in the prior state bound to the address linode_domain.example_domain, it should pretend that it was instead bound to linode_domain.primary_domain before making any other decisions.
Terraform will include an extra annotation on the planned changes for that resource to say that its address changed, but should not propose any changes to the remote object itself unless you've also changed some other settings in the resource block.

Terraform execution order with locals block

We have a requirement in one of our Terraform configurations to execute a Python script, generate an output file, and then read that file. We are trying to achieve this with the method below:
resource "null_resource" "get_data_plane_ip" {
provisioner "local-exec" {
command = "python myscript.py > output.json"
}
triggers = {
always_run = "${timestamp()}"
}
}
locals {
var1 = jsondecode(file("output.json"))
}
The problem with the above method is that the locals block gets evaluated before the Python script is executed through the local-exec provisioner, so terraform apply fails. We also can't use depends_on in a locals block to specify the order.
Any suggestion on how we can make sure locals is evaluated only after the local-exec resource?
You could potentially use another null_resource resource in this situation.
For example, take the following configuration:
resource "null_resource" "get_data_plane_ip" {
provisioner "local-exec" {
command = "python myscript.py > output.json"
}
triggers = {
always_run = timestamp()
}
}
resource "null_resource" "dependent" {
triggers = {
contents = file("output.json")
}
depends_on = [null_resource.get_data_plane_ip]
}
locals {
var1 = jsondecode(null_resource.b.triggers.contents)
}
output "var1" {
value = local.var1
}
The null_resource.dependent resource has an explicit dependency on the null_resource.get_data_plane_ip resource. Therefore, it will wait for the null_resource.get_data_plane_ip resource to be "created".
Since the triggers argument is of type map(string), you can use the file function to read the contents of the output.json file, which returns a string.
You can then create a local variable to invoke jsondecode on the triggers attribute of the null_resource.dependent resource.
The documentation for the file function says the following:
This function can be used only with files that already exist on disk at the beginning of a Terraform run. Functions do not participate in the dependency graph, so this function cannot be used with files that are generated dynamically during a Terraform operation. We do not recommend using dynamic local files in Terraform configurations, but in rare situations where this is necessary you can use the local_file data source to read files while respecting resource dependencies.
There are a few different things to note in this paragraph. The first is that the documentation recommends against doing what you are doing except as a last resort; I don't know if there's another way to get the result you were hoping for, so I'll leave you to think about that part, and focus on the other part...
The local_file data source is a data resource type that just reads a file from local disk. Because it appears as a resource rather than as a language operator/function, it'll be a node in Terraform's dependency graph just like your null_resource.get_data_plane_ip, and so it can depend on that resource.
The following shows one way to write that:
resource "null_resource" "get_data_plane_ip" {
triggers = {
temp_filename = "${path.module}/output.json"
}
provisioner "local-exec" {
command = "python myscript.py >'${self.triggers.temp_filename}'"
}
}
data "local_file" "example" {
filename = null_resource.get_data_plane_ip.triggers.temp_filename
}
locals {
var1 = jsondecode(data.local_file.example.content)
}
Note that this sort of design will make your Terraform configuration non-converging, which is to say that you can never reach a point where terraform apply will report that there are no changes to apply. That's often undesirable, because a key advantage of the declarative approach is that you can know you've reached a desired state and thus your remote system matches your configuration. If possible, I'd suggest trying to find an alternative design that can converge after selecting a particular IP address, though that'd typically mean representing that "data plane IP" as a resource itself, which may require writing a custom provider if you're interacting with a bespoke system.
I've not used it myself so I can't recommend it, but I notice that there's a community provider in the registry which offers a shell_script_resource resource type, which might be useful as a compromise between running commands in provisioners and writing a whole new provider. It seems like it allows you to write a script for create where the result would be retained as part of the resource state, and thus you could refer to it from other resources.

Create resource via terraform but do not recreate if manually deleted?

I want to initially create a resource using Terraform, but if the resource is later deleted outside of Terraform - e.g. manually by a user - I do not want Terraform to re-create it. Is this possible?
In my case the resource is a blob on an Azure Blob storage. I tried using ignore_changes = all but that didn't help. Every time I ran terraform apply, it would recreate the blob.
resource "azurerm_storage_blob" "test" {
name = "myfile.txt"
storage_account_name = azurerm_storage_account.deployment.name
storage_container_name = azurerm_storage_container.deployment.name
type = "Block"
source_content = "test"
lifecycle {
ignore_changes = all
}
}
The requirement you've stated is not supported by Terraform directly. To achieve it you will need to either implement something completely outside of Terraform or use Terraform as part of some custom scripting written by you to perform a few separate Terraform steps.
If you want to implement it by wrapping Terraform then I will describe one possible way to do it, although there are various other variants of this that would get a similar effect.
My idea for implementing it would be to implement a sort of "bootstrapping mode" which your custom script can enable only for initial creation, but then for subsequent work you would not use the bootstrapping mode. Bootstrapping mode would be a combination of an input variable to activate it and an extra step after using it.
variable "bootstrap" {
type = bool
default = false
description = "Do not use this directly. Only for use by the bootstrap script."
}
resource "azurerm_storage_blob" "test" {
count = var.bootstrap ? 1 : 0
name = "myfile.txt"
storage_account_name = azurerm_storage_account.deployment.name
storage_container_name = azurerm_storage_container.deployment.name
type = "Block"
source_content = "test"
}
This alone would not be sufficient because normally if you were to run Terraform once with -var="bootstrap=true" and then again without it Terraform would plan to destroy the blob, after noticing it's no longer present in the configuration.
So to make this work we need a special bootstrap script which wraps Terraform like this:
terraform apply -var="bootstrap=true"
terraform state rm azurerm_storage_blob.test
That second terraform state rm command above tells Terraform to forget about the object it currently has bound to azurerm_storage_blob.test. That means that the object will continue to exist but Terraform will have no record of it, and so will behave as if it doesn't exist.
If you run the bootstrap script, you will then have the blob existing but with Terraform unaware of it. You can therefore run terraform apply as normal (without setting the bootstrap variable), and Terraform will both ignore the previously created object and not plan to create a new one, because the resource now has count = 0.
This is not a typical use-case for Terraform, so I would recommend considering other possible solutions to meet your use-case, but I hope the above is useful as part of that design work.
If you have a resource defined in the Terraform configuration then Terraform will always try to create it. I don't know the details of your setup, but maybe you want to move the blob creation into a CLI script and run Terraform and the script in the desired order, as sketched below.
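As a rough sketch of that approach (the storage account and container names below are placeholders, and it assumes the Azure CLI is installed and authenticated), the blob is created once by a bootstrap script and never appears in the Terraform configuration at all:
# One-time bootstrap step, run outside of Terraform
echo "test" > myfile.txt
az storage blob upload \
  --account-name mydeploymentaccount \
  --container-name deployment \
  --name myfile.txt \
  --file myfile.txt
# Normal Terraform runs, with no azurerm_storage_blob resource defined
terraform apply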

Terraform: Is it possible to depend on only 1 resource in a list of resources?

Let's say that I am creating a set of AWS instances:
resource "aws_instance" "provision" {
count = var.aws_azs
...
}
Then, in a separate null_resource to decouple the configuration step from the provisioning step:
resource "null_resource" "configure" {
count = var.aws_azs
depends_on = [aws_instance.provision[count.index]]
...
}
That dependency is illegal because depends_on requires a static reference. However, if I instead change it to depends_on = ["aws_instance.provision"], then all of the configuration resources will be tainted if any of the instances are tainted. Is there a way to depend on only 1 instance in a list of resources?
The depends_on meta-argument does not interact with the tainting mechanism at all. It is used only to help Terraform select a suitable order to perform actions in.
Given a configuration like this:
resource "null_resource" "configure" {
count = var.aws_azs
depends_on = [aws_instance.provision]
}
This just means that if a particular plan contains actions for both aws_instance.provision and null_resource.configure instances then all of the aws_instance.provision actions will be completed before starting any null_resource.configure actions. It will never cause any additional changes to be planned.
If your goal is to have the null_resource.configure instances be recreated when their corresponding aws_instance.provision instances are replaced, you can achieve that using triggers like this:
resource "null_resource" "configure" {
count = length(aws_instance.provision)
triggers = {
instance_id = aws_instance.provision[count.index].id
}
}
In this case, it's the value of aws_instance.provision[count.index].id that is the decider: as long as that id doesn't change the null_resource won't be recreated. Because the id of an aws_instance changes only when it is recreated, null_resource.configure therefore won't re-run unless a specific corresponding index is recreated.
Having a reference to aws_instance.provision in the triggers expression also creates an implicit dependency, so Terraform will also still ensure that all of the aws_instance.provision work is complete before starting any null_resource.configure work, without the need to additionally specify depends_on.
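To make the per-index pairing concrete, here is a sketch of what the configuration step might look like; the connection settings and command are placeholders rather than anything from the original question:
resource "null_resource" "configure" {
  count = length(aws_instance.provision)
  triggers = {
    instance_id = aws_instance.provision[count.index].id
  }
  # Placeholder configuration step: the connection details and command are
  # assumptions for illustration only.
  provisioner "remote-exec" {
    connection {
      type = "ssh"
      host = aws_instance.provision[count.index].public_ip
      user = "ubuntu"
    }
    inline = [
      "echo configuring ${self.triggers.instance_id}",
    ]
  }
}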

Will moving resource definitions to a different module make 'terraform apply' delete and recreate these resources?

I have created some VMs with a main.tf, and terraform generates a cluster.tfstate file.
Now because of refactoring, I move the VM resource definitions into a module, and refer to this module in main.tf. When I run terraform apply --state=./cluster.tfstate, will terraform destroy and recreate these VMs?
I would expect it will not. Is my understanding correct?
Let's try this using the example given in the aws_instance documentation:
# Create a new instance of the latest Ubuntu 14.04 on a
# t2.micro node with an AWS Tag naming it "HelloWorld"
provider "aws" {
  region = "us-west-2"
}
data "aws_ami" "ubuntu" {
  most_recent = true
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
  owners = ["099720109477"] # Canonical
}
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  tags = {
    Name = "HelloWorld"
  }
}
If we terraform apply this, we get an instance that is referenced within Terraform as aws_instance.web:
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
If we move this definition to a module ubuntu_instance, the directory structure might look like this with the above code in instance.tf:
.
├── main.tf
└── ubuntu_instance
└── instance.tf
You intend to create the same instance as before, but internally Terraform now names this resource module.ubuntu_instance.aws_instance.web.
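For illustration, main.tf would then call the module along these lines (the source path follows the directory layout above):
module "ubuntu_instance" {
  source = "./ubuntu_instance"
}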
If you attempt to apply this, you would get the following:
Plan: 1 to add, 0 to change, 1 to destroy.
The reason this happens is that Terraform has no idea that the old and new code refer to the same instance. When you refactor into a module, from Terraform's point of view you are removing one resource and adding another, so it plans to delete the resource it no longer sees in the configuration.
Terraform maps your code to real resources in the state file. When you create an instance, you only know that it maps to your aws_instance because of the state file. So the proper way (as mentioned by Jun) is to refactor your code, then tell Terraform to move the mapping for the real instance from aws_instance.web to module.ubuntu_instance.aws_instance.web. Then when you apply, Terraform will leave the instance alone because it matches what your code says. The article Jun linked to is a good discussion of this.
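Concretely, for this example the move is a single command (using the module name from the layout above):
terraform state mv aws_instance.web module.ubuntu_instance.aws_instance.web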
