Terraform: How to destroy the oldest instance when lowering aws_instance count

Given a pair of AWS instances deployed with:
provider "aws" {
region = "us-east-1"
}
resource "aws_instance" "example" {
count = 2
ami = "ami-2757f631"
instance_type = "t2.micro"
tags = {
Name = "Test${count.index}"
}
}
Lowering count to 1 will destroy the last instance deployed:
Terraform will perform the following actions:
- aws_instance.example[1]
Is it possible to get Terraform to destroy the first instance instead? i.e.:
Terraform will perform the following actions:
- aws_instance.example[0]

Terraform tracks which instance is which via its state. When you reduce count on the aws_instance resource, Terraform simply removes the instances with the highest indexes. This shouldn't normally be much of an issue: I would only recommend deploying groups of homogeneous instances that can tolerate being interrupted (and that sit behind some form of load balancer). If you really need to, though, you can reorder the instances in the state before reducing the count.
The state file is serialised as JSON, so you can edit it directly (making sure the edited file is uploaded to your remote state backend if you use one). Better yet, use the first-class tooling the Terraform CLI provides for editing state: terraform state mv.
As an example you can do this:
# Example from question has been applied already
# `count` is edited from 2 to 1
$ terraform plan
...
aws_instance.example[1]: Refreshing state... (ID: i-0c227dfbfc72fb0cd)
aws_instance.example: Refreshing state... (ID: i-095fd3fdf86ce8254)
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- aws_instance.example[1]
Plan: 0 to add, 0 to change, 1 to destroy.
...
$
$
$
$ terraform state list
aws_instance.example[0]
aws_instance.example[1]
$
$
$
$ terraform state mv aws_instance.example[1] aws_instance.example[2]
Moved aws_instance.example[1] to aws_instance.example[2]
$ terraform state mv aws_instance.example[0] aws_instance.example[1]
Moved aws_instance.example[0] to aws_instance.example[1]
$ terraform state mv aws_instance.example[2] aws_instance.example[0]
Moved aws_instance.example[2] to aws_instance.example[0]
$
$
$
$ terraform plan
...
aws_instance.example[1]: Refreshing state... (ID: i-095fd3fdf86ce8254)
aws_instance.example: Refreshing state... (ID: i-0c227dfbfc72fb0cd)
------------------------------------------------------------------------
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
- destroy
Terraform will perform the following actions:
~ aws_instance.example
tags.Name: "Test1" => "Test0"
- aws_instance.example[1]
Plan: 0 to add, 1 to change, 1 to destroy.
...
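The three-step swap above can be generated for any pair of indexes. The following is a minimal Python sketch, not part of the original answer: the function name swap_commands is hypothetical, and it assumes you can name a spare index (tmp) that is not currently present in state. It only builds the command strings; it does not run Terraform.

```python
def swap_commands(resource, a, b, tmp):
    """Return the `terraform state mv` commands that swap indexes a and b.

    `tmp` must be an index that does not currently exist in state
    (assumption: the caller has checked this with `terraform state list`).
    """
    if len({a, b, tmp}) != 3:
        raise ValueError("a, b and tmp must be three distinct indexes")

    def addr(i):
        # Single quotes stop the shell from interpreting the square brackets.
        return f"'{resource}[{i}]'"

    return [
        f"terraform state mv {addr(b)} {addr(tmp)}",  # park b in the spare slot
        f"terraform state mv {addr(a)} {addr(b)}",    # move a into b's slot
        f"terraform state mv {addr(tmp)} {addr(a)}",  # move the parked one into a's slot
    ]


if __name__ == "__main__":
    for cmd in swap_commands("aws_instance.example", 0, 1, 2):
        print(cmd)
```

Running terraform plan after the generated commands, as in the transcript above, is a reasonable way to confirm that the instance you expect is the one marked for destruction.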

Related

How do Terraform state and its refresh work

I'm new to Terraform and want to understand how Terraform state and refresh work together. As I understand it, you write the desired resources in a .tf file and run 'terraform apply', and the resources are created if they do not already exist. The terraform.tfstate file is also created, containing all the metadata about the resources from Terraform's viewpoint. This way, Terraform can compare the desired state (in the .tf file) to the current state (in the terraform.tfstate file). So far so good.
But what happens if the 'real world' resource changes?
Shouldn't Terraform recognize this? That should be the job of Terraform's 'refresh', right?
However, I've experimented a bit and created this simple resource:
resource "local_file" "mypet" {
filename = "/tmp/mypet.txt"
content = "I love pets"
file_permission = "0755"
}
Running 'terraform init && terraform apply' created the file /tmp/mypet.txt with the content "I love pets". Cool.
Then I changed the file permissions:
chmod 0444 /tmp/mypet.txt
ls -l /tmp/mypet.txt
-r--r--r--. 1 devel devel 11 Nov 25 09:28 /tmp/mypet.txt
And another 'terraform apply':
local_file.mypet: Refreshing state... [id=4246687863ee17bf4daf5fc376ab01222e989aca]
No changes. Your infrastructure matches the configuration.
Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Terraform does not recognize this real-world change, even though the resource has drifted from the state known to Terraform!
Another interesting example: I opened the file with vi /tmp/mypet.txt and did not change the content at all, but exited vi with :wq. This changes the checksum of the file:
sha256sum /tmp/mypet.txt
595e30750adb4244cfcfc31de9b37d1807c20e7ac2aef049755b6c0e5f580170 /tmp/mypet.txt
[devel@fedora test]$ !vi
vi /tmp/mypet.txt
[devel@fedora test]$ sha256sum /tmp/mypet.txt
4fcc76cae3c8950e9633ca8220e36f4313a97d0f436ab050fd1d586b40d682f0 /tmp/mypet.txt
This is recognized by terraform and thus it wants to recreate the file:
terraform apply
local_file.mypet: Refreshing state... [id=4246687863ee17bf4daf5fc376ab01222e989aca]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the
following symbols:
+ create
Terraform will perform the following actions:
# local_file.mypet will be created
+ resource "local_file" "mypet" {
+ content = "I love pets"
+ directory_permission = "0777"
+ file_permission = "0755"
+ filename = "/tmp/mypet.txt"
+ id = (known after apply)
}
Plan: 1 to add, 0 to change, 0 to destroy.
So this time Terraform did recognize the real-world change.
When I call 'terraform apply' without refreshing, it does not see the change that happened in the real world, as expected:
terraform apply --refresh=false
No changes. Your infrastructure matches the configuration.
Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
It seems that only some attributes of the local provider's local_file resource type are checked for drift from the Terraform state.
Is that true? Is it documented anywhere whether a given attribute of a resource type is "drift-aware"?
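One possible explanation, offered as an assumption rather than documented behaviour: the local_file resource may detect drift by comparing a checksum of the file's content with the value recorded in state, without re-reading the permissions. Saving with :wq in vi normally appends a trailing newline, so the content changes even though the visible text does not. The Python sketch below only illustrates that hashing effect. It does not reproduce the provider's logic, and the variable and function names are illustrative.

```python
import hashlib

# 11 bytes, matching the size shown by `ls -l` above.
original = b"I love pets"
# Assumption: vi added a final newline when the file was saved with :wq.
after_vi = original + b"\n"


def sha256_hex(data):
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    print(len(original), sha256_hex(original))
    print(len(after_vi), sha256_hex(after_vi))
```

If that assumption holds, the chmod case goes unnoticed because the content, and therefore its checksum, is unchanged.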

Terraform: Cannot find modules moved to new state

I am moving s3 buckets between states. Specifically I am moving 2 buckets from state_a to state_b. After copying the Terraform code to state_b - on the root folder - I am running the following commands:
terraform state mv -state=terraform.tfstate -state-out=../state_b/terraform.tfstate module.bucket_a module.bucket_a
terraform state mv -state=terraform.tfstate -state-out=../state_b/terraform.tfstate module.bucket_b module.bucket_b
I am running these commands under the state_a folder. The output of both is
Move "module.bucket_a" to "module.bucket_a"
Successfully moved 1 object(s).
and the same for bucket b
Then I remove the code for these buckets from state_a and run a terraform plan under state_a. Everything looks good:
Plan: 0 to add, 0 to change, 0 to destroy.
Then I run terraform plan under state_b. I would expect to see the same result - 0 to add, 0 to change and 0 to destroy. Unfortunately, I am seeing the following result:
Plan: 2 to add, 0 to change, 0 to destroy.
# module.bucket_a.aws_s3_bucket.retention_bucket[0] will be created
+ resource "aws_s3_bucket" "retention_bucket" {
....
# module.bucket_b.aws_s3_bucket.retention_bucket[0] will be created
+ resource "aws_s3_bucket" "retention_bucket" {
....
So it seems the move was not successful. I ran terraform state list under state_b, hoping to find the modules under a different address, but I cannot find anything named bucket_a or bucket_b.
What did I do wrong?

"Invalid legacy provider address" error on Terraform

I'm trying to run a Bitbucket pipeline that uses Terraform v0.14.3 to create resources in Google Cloud. After running the terraform command, the pipeline fails with this error:
Error: Invalid legacy provider address
This configuration or its associated state refers to the unqualified provider
"google".
You must complete the Terraform 0.13 upgrade process before upgrading to later
versions.
We updated our local version of Terraform to v0.13.0 and then ran terraform 0.13upgrade, as described in this guide: https://www.terraform.io/upgrade-guides/0-13.html. A versions.tf file was generated requiring Terraform version >=0.13, and our required_providers block now looks like this:
terraform {
backend "gcs" {
bucket = "some-bucket"
prefix = "terraform/state"
credentials = "key.json" #this is just a bitbucket pipeline variable
}
required_providers {
google = {
source = "hashicorp/google"
version = "~> 2.20.0"
}
}
}
provider "google" {
project = var.project_ID
credentials = "key.json"
region = var.project_region
}
We still get the same error when initiating the bitbucket pipeline. Does anyone know how to get past this error? Thanks in advance.
Solution
If you are using a newer version of Terraform, such as v0.14.x, you should:
Use the replace-provider subcommand:
terraform state replace-provider \
-auto-approve \
"registry.terraform.io/-/google" \
"hashicorp/google"
#=>
Terraform will perform the following actions:
~ Updating provider:
- registry.terraform.io/-/google
+ registry.terraform.io/hashicorp/google
Changing x resources:
. . .
Successfully replaced provider for x resources.
Then initialize Terraform again:
terraform init
#=>
Initializing the backend...
Initializing provider plugins...
- Reusing previous version of hashicorp/google from the dependency lock file
- Using previously-installed hashicorp/google vx.xx.x
Terraform has been successfully initialized!
You may now begin working with Terraform. Try . . .
This should take care of installing the provider.
Explanation
Terraform only supports upgrading across one major feature release at a time. Your older state file was, more than likely, created using a version earlier than v0.13.x.
If you did not run the apply command with v0.13.x before upgrading to v0.14.x, you can expect this error: the upgrade from v0.13.x was never completed.
In our case, we were on AWS and had a similar error:
...
Error: Invalid legacy provider address
This configuration or its associated state refers to the unqualified provider
"aws".
The steps to resolve it were:
Ensure the syntax was upgraded by running terraform init again.
Check the warnings and resolve them.
Finally, update the state file as follows:
# update provider in state file
terraform state replace-provider -- -/aws hashicorp/aws
# reinit
terraform init
Specific to the OP's problem: if the issue still occurs, verify access to the bucket location both locally and from the pipeline. Also verify the version of Terraform running in the pipeline. Depending on the configuration, the remote state file may not be, or cannot be, updated.
Same issue for me. I ran:
terraform providers
That gave me:
Providers required by configuration:
registry.terraform.io/hashicorp/google
Providers required by state:
registry.terraform.io/-/google
So I ran:
terraform state replace-provider registry.terraform.io/-/google registry.terraform.io/hashicorp/google
That did the trick.
To add on: I had installed Terraform 0.14.6, but the state seemed to be stuck on 0.12. In my case three references were off; this issue helped me pinpoint which ones (all the entries under "Providers required by state" that had a - in the address): https://github.com/hashicorp/terraform/issues/27615
I corrected it by running the replace-provider command for each entry that was off, then running terraform init. After doing this, a git diff showed that the tfstate had been updated and now uses Terraform 0.14.x instead of my previous 0.12.x, i.e.:
terraform providers
terraform state replace-provider registry.terraform.io/-/azurerm registry.terraform.io/hashicorp/azurerm
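When several providers are affected, the per-entry commands can be derived from the output of terraform providers. The following Python sketch is an illustration only: the function name replace_provider_commands is hypothetical, and it assumes every legacy provider belongs in the hashicorp namespace unless you pass a different one. That is not always true; the openstack provider discussed in another answer here lives under terraform-provider-openstack. The sketch only builds command strings and does not run Terraform.

```python
import re

# Legacy (unqualified) provider addresses look like registry.terraform.io/-/<name>.
LEGACY = re.compile(r"registry\.terraform\.io/-/([A-Za-z0-9_-]+)")


def replace_provider_commands(providers_output, namespace="hashicorp"):
    """Build one `terraform state replace-provider` command per legacy provider.

    `providers_output` is the text printed by `terraform providers`.
    `namespace` is an assumption; check each provider's real source address.
    """
    names = sorted(set(LEGACY.findall(providers_output)))
    return [
        "terraform state replace-provider "
        f"registry.terraform.io/-/{name} "
        f"registry.terraform.io/{namespace}/{name}"
        for name in names
    ]


if __name__ == "__main__":
    sample = (
        "Providers required by configuration:\n"
        "registry.terraform.io/hashicorp/google\n"
        "Providers required by state:\n"
        "registry.terraform.io/-/google\n"
    )
    for cmd in replace_provider_commands(sample):
        print(cmd)
```

Review the generated commands before running them, since each one writes a new state snapshot.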
Explanation: Your Terraform project contains a tfstate file that is outdated and refers to an old provider address. The error message looks like this:
Error: Invalid legacy provider address
This configuration or its associated state refers to the unqualified provider
<some-provider>.
You must complete the Terraform <some-version> upgrade process before upgrading to later
versions.
Solution: To solve this issue, change the references in the state to point to the newer required providers, update the state, and initialize the project again. The steps are:
Create or edit the required_providers block with the relevant provider source and version. I prefer to do this in a versions.tf file.
Example:
terraform {
required_version = ">= 0.14"
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 3.35.0"
}
}
}
Run the terraform providers command to compare the providers required by the configuration against the providers recorded in the state.
Example:
Providers required by configuration:
.
├── provider[registry.terraform.io/hashicorp/aws] >= 3.35.0
Providers required by state:
provider[registry.terraform.io/-/aws]
Reassign the provider source address in the Terraform state (using the terraform state replace-provider command) to tell Terraform how to interpret the legacy provider.
The terraform state replace-provider subcommand allows re-assigning
provider source addresses recorded in the Terraform state, and so we
can use this command to tell Terraform how to reinterpret the "legacy"
provider addresses as properly-namespaced providers that match with
the provider source addresses in the configuration.
Warning: The terraform state replace-provider subcommand, like all of
the terraform state subcommands, will create a new state snapshot and
write it to the configured backend. After the command succeeds the
latest state snapshot will use syntax that Terraform v0.12 cannot
understand, so you should perform this step only when you are ready to
permanently upgrade to Terraform v0.13.
example:
terraform state replace-provider registry.terraform.io/-/aws registry.terraform.io/hashicorp/aws
output:
~ Updating provider:
- registry.terraform.io/-/aws
+ registry.terraform.io/hashicorp/aws
Run terraform init to update references.
While you were on Terraform 0.13, did you run apply at least once for the project?
According to the Terraform docs: https://www.terraform.io/upgrade-guides/0-14.html
There is no separate automatic upgrade command in 0.14 (like there was in 0.13). The only way to upgrade is to apply the project at least once under 0.13 before moving to 0.14.
You can also try terraform init in the project directory.
My case was like this:
Error: Invalid legacy provider address
This configuration or its associated state refers to the unqualified provider
"openstack".
You must complete the Terraform 0.13 upgrade process before upgrading to later
versions.
To resolve the issue:
Remove the .terraform folder.
Then execute the following command:
terraform state replace-provider -- -/openstack terraform-provider-openstack/openstack
After this command you will see the prompt below; enter yes:
Terraform will perform the following actions:
~ Updating provider:
- registry.terraform.io/-/openstack
+ registry.terraform.io/terraform-provider-openstack/openstack
Changing 11 resources:
openstack_compute_servergroup_v2.kubernetes_master
openstack_networking_network_v2.kube_router
openstack_compute_instance_v2.kubernetes_worker
openstack_networking_subnet_v2.internal
openstack_networking_subnet_v2.kube_router
data.openstack_networking_network_v2.external_network
openstack_compute_instance_v2.kubernetes_etcd
openstack_networking_router_interface_v2.internal
openstack_networking_router_v2.internal
openstack_compute_instance_v2.kubernetes_master
openstack_networking_network_v2.internal
Do you want to make these changes?
Only 'yes' will be accepted to continue.
Enter a value: yes
Successfully replaced provider for 11 resources.
I recently ran into this using Terraform Cloud for the remote backend. We had some older AWS-related workspaces set to version 0.12.4 (in the cloud) that errored out with "Invalid legacy provider address" and refused to run with the latest Terraform client 1.1.8.
I am adding my answer because it is much simpler than the other answers. We did not do any of the following:
terraform providers
terraform 0.13upgrade
remove the .terraform folder
terraform state replace-provider
Instead we simply:
In a clean folder (no local state, using local terraform.exe version 0.13.7) ran 'terraform init'
Made a small insignificant change (to ensure apply would write state) to a .tf file in the workspace
In Terraform Cloud set the workspace version to 0.13.7
Using local 0.13.7 terraform.exe ran apply - that saved new state.
Now we can use cloud and local terraform.exe version 1.1.8 and no more problems.
Note that we did also need to update a few AWS S3-related resources to the newer AWS provider syntax to get all our workspaces working with the latest provider.
We encountered a similar problem in our operational environments today. We successfully completed the terraform 0.13upgrade command. This indeed introduced a versions.tf file.
However, performing a terraform init with this setup was still not possible, and the following error popped up:
Error: Invalid legacy provider address
Further investigation in the state file revealed that, for some resources, the provider block was not updated. We hence had to run the following command to finalize the upgrade process.
terraform state replace-provider "registry.terraform.io/-/google" "hashicorp/google"
EDIT Deployment to the next environment revealed that this was caused by conditional resources. To easily enable/disable some resources we leverage the count attribute and use either 0 or 1. For the resources with count = 0, that were unaltered with Terraform 0.13, the provider was not updated.
I was using Terragrunt with remote S3 state and DynamoDB, and sadly the above did not work for me, so I am posting this here in case it helps someone else.
This is a long way round, needed because terragrunt state replace-provider did not work for me:
Download the state file from S3:
aws s3 cp s3://bucket-name/path/terraform.tfstate terraform.tfstate --profile profile
Replace the provider using terraform:
terraform state replace-provider "registry.terraform.io/-/random" "hashicorp/random"
terraform state replace-provider "registry.terraform.io/-/aws" "hashicorp/aws"
Upload the state file back to S3, as even terragrunt state push terraform.tfstate did not work for me:
aws s3 cp terraform.tfstate s3://bucket-name/path/terraform.tfstate --profile profile
terragrunt apply
The command will fail with an error that includes a digest value. Update the Digest value in the DynamoDB table to the one reported in the error:
Initializing the backend...
Error refreshing state: state data in S3 does not have the expected content.
This may be caused by unusually long delays in S3 processing a previous state
update. Please wait for a minute or two and try again. If this problem
persists, and neither S3 nor DynamoDB are experiencing an outage, you may need
to manually verify the remote state and update the Digest value stored in the
DynamoDB table to the following value: fe2840edf8064d9225eea6c3ef2e5d1d
Finally, run terragrunt apply again.
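The Digest value in the error above is 32 hexadecimal characters, which is consistent with an MD5 checksum of the state file. Assuming that is how the S3 backend derives it (an assumption here, not something the post confirms), the value can be cross-checked locally before editing the DynamoDB item. The function names in this Python sketch are illustrative.

```python
import hashlib


def digest_of_bytes(data):
    """Hex MD5 of raw state bytes (assumed to match the backend's Digest)."""
    return hashlib.md5(data).hexdigest()


def digest_of_file(path):
    """Hex MD5 of a local state file, read in binary mode."""
    with open(path, "rb") as fh:
        return digest_of_bytes(fh.read())


# Usage, assuming the state file was downloaded as terraform.tfstate:
#   print(digest_of_file("terraform.tfstate"))
```

If the locally computed value differs from the one in the error message, the uploaded file is probably not byte-for-byte the one you edited, which is worth checking before changing the table.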
Another way this can go wrong is if you are using Terraform workspaces, especially with remote state files.
With Terraform workspaces, the order of operations is important:
terraform init - connects to the default workspace
terraform workspace select <env> - even though you specify the workspace here, the init has already happened against the default workspace.
This is an assumption that Terraform makes, sometimes erroneously.
To fix this, you can run your init using:
TF_WORKSPACE=<your_env> terraform init
Or remove the default workspace.

Terraform long term lock

Using Terraform 0.12 with the remote state in an S3 bucket with DynamoDB locking.
It seems that a common pattern for Terraforming automation goes more or less like this:
terraform plan -out=plan
[review plan]
terraform apply plan
But then (maybe I'm overlooking something obvious) there's no guarantee that other terraform apply invocations haven't updated the infrastructure between steps 1 and 3 above.
I know locking will prevent a concurrent run of terraform apply while another one is running (and locking is enabled), but can I programmatically grab a "long term lock" so the effective workflow looks like this?
[something to the effect of...] "terraform lock"
terraform plan -out=plan
[review plan]
terraform apply plan
[something to the effect of...] "terraform release lock"
Are there any other means to "protect" infrastructure from concurrent/interdependent updates that I'm overlooking?
You don't need this as long as you are only worried about the state file changing.
If you provide a plan output file to apply and the state has changed since the plan was created, Terraform will error before making any changes, complaining that the saved plan is stale.
As an example:
$ cat <<'EOF' >> main.tf
> resource "random_pet" "bucket_suffix" {}
>
> resource "aws_s3_bucket" "example" {
> bucket = "example-${random_pet.bucket_suffix.id}"
> acl = "private"
>
> tags = {
> ThingToChange = "foo"
> }
> }
> EOF
$ terraform init
# ...
$ terraform apply
# ...
$ sed -i 's/foo/bar/' main.tf
$ terraform plan -out=plan
# ...
$ sed -i 's/bar/baz/' main.tf
$ terraform apply
# ...
$ terraform apply plan
Error: Saved plan is stale
The given plan file can no longer be applied because the state was changed by
another operation after the plan was created.
What it won't do is fail if something outside of Terraform has changed anything. If, instead of applying Terraform again with baz as the tag for the bucket, I had changed the tag via the AWS CLI or the AWS console, Terraform would have happily changed it back to bar when applying the saved plan.
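The behaviour described above can be pictured with a toy model. This Python sketch is not Terraform's implementation: it assumes the saved plan remembers a version marker of the state it was built from (called serial here) and that apply refuses to proceed if the state has since been rewritten. All class and function names are hypothetical.

```python
class StalePlanError(Exception):
    """Raised when the state changed after the plan was created."""


class State:
    def __init__(self):
        self.serial = 0  # bumped on every write made through this model
        self.tags = {}

    def write(self, tags):
        self.tags = dict(tags)
        self.serial += 1


class Plan:
    def __init__(self, state, desired_tags):
        self.base_serial = state.serial  # remember which state we planned against
        self.desired_tags = dict(desired_tags)


def apply_plan(state, plan):
    # A change made through the model bumps serial and is caught here.
    # A change made outside the model (console, CLI) is invisible to this check.
    if state.serial != plan.base_serial:
        raise StalePlanError("Saved plan is stale")
    state.write(plan.desired_tags)


if __name__ == "__main__":
    state = State()
    state.write({"ThingToChange": "foo"})
    plan = Plan(state, {"ThingToChange": "bar"})
    state.write({"ThingToChange": "baz"})  # another apply sneaks in
    try:
        apply_plan(state, plan)
    except StalePlanError as err:
        print("Error:", err)
```

The model also shows the limitation from the answer: a tag edited directly in AWS never touches the recorded state, so the check passes and the saved plan overwrites the manual change.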

Cannot taint null_resource

I have Terraform 0.11.11.
The graph shows that the resource in question is in the root module:
$ terraform graph
digraph {
compound = "true"
newrank = "true"
subgraph "root" {
"[root] data.template_file.default" [label = "data.template_file.default", shape = "box"]
"[root] data.template_file.etcd" [label =
...
"[root] null_resource.service_resolv_conf" [label = "null_resource.service_resolv_conf", shape = "box"]
...
But trying to taint it says it is not:
$ terraform taint null_resource.service_resolv_conf
The resource null_resource.service_resolv_conf couldn't be found in the module root.
Update:
$ terraform state list|grep resolv_conf
null_resource.service_resolv_conf[0]
null_resource.service_resolv_conf[1]
Then I tried:
$ terraform taint null_resource.service_resolv_conf[0]
The resource null_resource.service_resolv_conf[0] couldn't be found in the module root.
and
$ terraform taint null_resource.service_resolv_conf
The resource null_resource.service_resolv_conf couldn't be found in the module root.
terraform graph gives you the whole picture of the resources and their relationships.
But it is not a good command for troubleshooting or for understanding how resources are named in the Terraform *.tfstate file.
I would recommend running terraform state list; then you can easily see how to address one of the resources in the list when tainting it.
terraform state list
terraform taint <copy resource directly from above list>
For anyone who comes to this thread looking for terraform taint/untaint null_resource, where Terraform errors out with The resource […] couldn't be found in the module root, here is the correct and working answer posted by @victor-m in Cannot taint null_resource:
terraform taint -module=name null_resource.name
Same for untaint command.
In the end I found the solution.
It turns out that when a resource has multiple instances created from a list (using 'count'):
resource "null_resource" "provision_docker_registry" {
count = "${length(local.service_vms_names)}"
depends_on = ["null_resource.service_resolv_conf"]
connection {
user = "cloud-user"
private_key = "${file("${path.module}/../ssh/${var.os_keypair}.pem")}"
host = "${element(local.service_fip, count.index)}"
}
you taint the resource by specifying the index after a dot, i.e.
$ terraform taint null_resource.provision_docker_registry.0
The resource null_resource.provision_docker_registry.0 in the module root has been marked as tainted!
Voila!
I could not find that in the documentation. Hope this helps someone.
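The mismatch in this thread is that terraform state list prints indexed resources as name[0], while the taint command in this Terraform version (0.11, per the question) accepted name.0. A small Python sketch of that conversion follows; the function name to_legacy_taint_address is hypothetical, and the dot form is based only on the behaviour reported in this answer.

```python
import re

# Matches addresses ending in a numeric index, e.g. null_resource.foo[3].
_INDEXED = re.compile(r"^(?P<base>.+)\[(?P<index>\d+)\]$")


def to_legacy_taint_address(state_address):
    """Convert `type.name[N]` (state list format) to `type.name.N`.

    Addresses without a trailing numeric index are returned unchanged.
    """
    match = _INDEXED.match(state_address)
    if match is None:
        return state_address
    return f"{match.group('base')}.{match.group('index')}"


if __name__ == "__main__":
    for address in (
        "null_resource.service_resolv_conf[0]",
        "null_resource.service_resolv_conf[1]",
    ):
        print("terraform taint", to_legacy_taint_address(address))
```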
