How to upgrade/update the minor version of Composer via Terraform?

I recently created a Composer environment instance in my Terraform base module (resource google_composer_environment) with the following attribute: image_version = "composer-2.0.30-airflow-2.3.3".
When you upgrade via the console, the cluster is not destroyed (I think?), but when I try to do it via Terraform with image_version = "composer-2.0.32-airflow-2.3.4", the environment is apparently scrapped and recreated:
# module.dev-omni-orchestrator-instance.google_composer_environment.omni-orchestrator-instance must be replaced
...
Plan: 1 to add, 0 to change, 1 to destroy.
Is there a way to achieve this version upgrade within Terraform without destroying the existing environment? Perhaps upgrade via the console or gcloud and then update+import somehow?

I had the same issue previously; to solve it:
Upgrade the version from the Cloud Composer console; in this case the cluster will not be destroyed
Upgrade the version in your Terraform code
When you next plan your Terraform Composer resource, Terraform will compare it against the real infrastructure and will plan no change and nothing to update (because the version in the real infrastructure is now the same as in the Terraform resource)
I think it's an issue with the resource in the Google provider, but with this technique it was not blocking for me.
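For illustration, once the console upgrade has run, the Terraform code only needs its image_version bumped to match. A minimal sketch, reusing the resource name from the plan output above; the name and region values are hypothetical placeholders:

resource "google_composer_environment" "omni-orchestrator-instance" {
  name   = "omni-orchestrator-instance"  # hypothetical
  region = "europe-west1"                # hypothetical

  config {
    software_config {
      # Match exactly the version the console upgrade produced, so the next
      # terraform plan detects no drift and proposes no replacement.
      image_version = "composer-2.0.32-airflow-2.3.4"
    }
  }
}

After that, terraform plan should report no changes for the environment.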

Related

How to handle resource changes after provider upgrade in terraform?

I am trying to upgrade the azurerm Terraform provider from 2.30.0 to 3.13.0. There are of course several changes in some resources (e.g. resource name changes, renamed attributes, removed attributes, etc.). I checked the Azure Resource Manager Upgrade Guide and found the changes by which our configuration is affected.
For example, in version 3.0.0 the attribute availability_zones is replaced by zones for the azurerm_kubernetes_cluster_node_pool resource. Therefore, when I run terraform plan I get an error that the attribute availability_zones doesn't exist.
I found a migration guide for deprecated resources. I understood the idea of removing the resource from the state and importing it again by its resource ID, but there are also other resources, like azurerm_subnet, azurerm_kubernetes_cluster, and azurerm_storage_account, that have resource changes, which is why the terraform import -var-file='./my.tfvars' [..] command fails.
I am not sure whether it fails (only) because of dependencies on variables that are needed to declare the resource properly, or whether it would also fail because Terraform reads the .tfvars file and compares those variables with the state.
What I actually need is a "best practice" guide on how to handle resource changes after a provider update. I don't know where to start and where to end. I tried to visualize the dependencies with terraform graph and created an SVG to try to figure out the order in which I have to migrate the resource changes, but it's impossible to understand the relations of the whole configuration. I could also just remove attributes that no longer exist from the state file, or rename attributes manually.
So, how do you handle resource changes after a provider upgrade in Terraform?
General
I was able to update the provider properly, or at least I hope so. I would like to share my experience; maybe it will help other beginners. This is not a professional guide, just the experience I want to share.
First of all, you have to remove ALL resources affected by the provider upgrade from the state and then re-import them. What does that mean?
The new provider will contain diverse changes to different resources, for example (a configuration sketch of an attribute rename follows at the end of these notes):
Removed deprecated attributes (the attribute is removed completely)
Superseded attributes (the attribute is replaced by another)
Renamed attributes
Superseded resources (here the resource can be deprecated or removed in the upgraded version)
Note
The migration guide describes how you can migrate away from deprecated resources, but as I understood it, the workflow for attribute changes is the same. This is the only guide I found.
terraform plan will show you one or several errors for the affected resources.
If your Terraform configuration is complex and huge, you shouldn't try to remove and re-import everything at once. Go step by step and fix one affected resource at a time.
terraform plan can show changes even though it shouldn't.
Check the attribute that forces replacement carefully and understand why Terraform detects changes. It may seem obvious, but it doesn't have to be.
There can be a type change, e.g. int -> string.
If the detected change is a kind of missing secret, you can try to add the secret manually as the value of the related attribute in the state file and run terraform plan again.
There can also be a bug in the provider. So if you can't understand the detected change, search the provider's issues, usually on GitHub. Don't be confused if you can't find any related issue; maybe you have found a bug. In that case, just create a new one.
You will also face other errors or bugs related to Terraform itself. You have to search for a workaround patiently, so that you can continue applying the resource changes.
Use resource targeting to work through resource changes one at a time, or to ignore, for the moment, an error that occurs in another module.
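As a concrete illustration of a superseded attribute, the availability_zones -> zones rename from the question looks roughly like this in the configuration; the resource name and values here are hypothetical:

resource "azurerm_kubernetes_cluster_node_pool" "workers" {
  name                  = "workers"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.main.id
  vm_size               = "Standard_D2s_v3"
  node_count            = 1

  # azurerm 2.x attribute, removed in 3.0:
  # availability_zones = ["1", "2", "3"]

  # azurerm 3.0 replacement:
  zones = ["1", "2", "3"]
}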
How To
---> !! BACK UP YOUR STATE FILE !! <---: You have to back up your state file before you start manipulating it. You will be able to restore the backed-up state if something goes wrong. You can also use the backed-up state file to find the IDs you need when you have to import a resource.
Get Affected Resources:
How can you find all affected resources? After the upgrade, the provider will not be able to parse the state file if a resource contains changes, as described in the question above. You will get an error for each affected resource. You can then look up the changes for that resource in the provider's upgrade guide, which can be found in the provider registry, e.g. azurerm.
Terraform Configuration: Now you have to apply the changes for the affected resources in the Terraform configuration modules before you can import them, as described in the migration guide.
Remove Outdated Resource: As described in the migration guide, you have to remove the outdated resource from the state file because the state contains the old format of the resource. The new provider is not able to handle these resources in the state file; they must be re-imported with the new provider.
Import Removed Resource: After you have removed the resource, you have to re-import it, as also described in the migration guide. Check the terraform import documentation for a better understanding and usage; a rough example follows below.
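A rough sketch of the remove/re-import cycle for the node pool from the question; the resource address and the Azure resource ID are placeholders that must match your own configuration:

terraform state rm azurerm_kubernetes_cluster_node_pool.workers

terraform import -var-file='./my.tfvars' \
  azurerm_kubernetes_cluster_node_pool.workers \
  "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.ContainerService/managedClusters/<cluster>/agentPools/<pool>"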
So, how do you handle resource changes after a provider upgrade in Terraform?
I don't think deleting the state file, importing the resources again, and changing resource attributes every time you need to upgrade the azurerm version is a feasible solution.
The Terraform Registry already provides update notes for every resource when changes are made in an upgraded version, as in the example below.
We use azurerm_app_service with version ~2.x, but for versions ~3.0 and ~4.0 we use the azurerm_linux_web_app and azurerm_windows_web_app resources instead.
I would suggest checking the Terraform Registry for updates to the attributes of the particular resources in the specific provider version, and adjusting your configuration accordingly.
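For example, a 2.x azurerm_app_service roughly maps to a 3.x azurerm_linux_web_app like this; a minimal sketch with hypothetical names, showing only the renamed arguments (the referenced resource group and service plan are assumed to exist elsewhere in the configuration):

# azurerm ~> 2.x
resource "azurerm_app_service" "web" {
  name                = "example-app"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  app_service_plan_id = azurerm_app_service_plan.main.id
}

# azurerm ~> 3.0 and later
resource "azurerm_linux_web_app" "web" {
  name                = "example-app"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  service_plan_id     = azurerm_service_plan.main.id

  site_config {}
}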

state snapshot was created by Terraform v0.12.29, which is newer than

I'm using Terraform with S3 as the backend. Everything worked great before, but just recently I got the following error message when running terraform plan or apply:
Error: state snapshot was created by Terraform v0.14.8, which is newer than current v0.12.29; upgrade to Terraform v0.14.8 or greater to work with this state
The strange thing is I already forced the Terraform version:
terraform {
  required_version = ">= 0.12"
}
And when I pulled the latest state from S3, the version is still 0.12.29:
terraform state pull | grep version
"terraform_version": "0.12.29",
....
I really have no idea where the version 0.14.8 comes from.
This happened to me: my deployment (CI/CD) failed and left a lock on the TF state.
So I just went and manually removed the lock from my local machine:
terraform init -backend-config="key=prod/app1.tfstate"
terraform force-unlock -force xxxxx-8df6-a7e8-46a8-xxxxxxxxxxxx
Then I tried to redeploy from CI/CD and got that error, because my local Terraform was a higher version than the Terraform running in CI/CD.
In the end, I did this to restore the previous state:
S3: find the state file and restore the old version (versioning is enabled in this bucket)
Run this again: terraform init -backend-config="key=prod/app1.tfstate" -reconfigure
To get the response below with the previous digest value:
Successfully configured the backend "s3"! Terraform will automatically
use this backend unless the backend configuration changes.
Error refreshing state: state data in S3 does not have the expected content.
This may be caused by unusually long delays in S3 processing a previous state
update. Please wait for a minute or two and try again. If this problem
persists, and neither S3 nor DynamoDB are experiencing an outage, you may need
to manually verify the remote state and update the Digest value stored in the
DynamoDB table to the following value: ebf597a8a25619b959baaa34a7b9d905
Update the DynamoDB item with the digest above (a command sketch follows after these steps)
Run deployment again
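Assuming the standard S3 backend locking setup, the digest item can be updated with the AWS CLI roughly as follows; the table and bucket names are placeholders, and the item's LockID is the state path suffixed with -md5:

aws dynamodb update-item \
  --table-name terraform-locks \
  --key '{"LockID": {"S": "my-bucket/prod/app1.tfstate-md5"}}' \
  --update-expression "SET Digest = :d" \
  --expression-attribute-values '{":d": {"S": "ebf597a8a25619b959baaa34a7b9d905"}}'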
Are you the only developer working on terraform?
Are you running terraform locally or also via some pipeline?
There is a strong possibility that one of your team members upgraded their terraform binary to v0.14.8 and applied locally, and now you would need to upgrade to that version as well.
It's not just the version of the Terraform state that you are accessing/running plan against; Terraform cross-references a lot of Terraform states internally. So go into the remote state bucket and try to find the specific remote state with the different TF version (see the sketch below).
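A quick way to sweep a bucket for that state, assuming the AWS CLI and jq are available; the bucket name is a placeholder:

# Print the Terraform version recorded inside every state file in the bucket.
for key in $(aws s3api list-objects-v2 --bucket my-tf-states --query 'Contents[].Key' --output text); do
  version=$(aws s3 cp "s3://my-tf-states/$key" - 2>/dev/null | jq -r '.terraform_version')
  echo "$key -> $version"
done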
Make sure versioning is enabled for the AWS S3 bucket that maintains your tfstate files.
By enabling the version view ("Show versions") inside the bucket, I found the tfstate file by name.
I deleted the latest version, which caused the mismatch (in my case it was the Terraform version). This adds a delete marker for that version, meaning the object is effectively backed up after deletion; you can easily restore the original file by deleting the added delete marker.
Then I looked through the old versions of the tfstate files to find one to restore, checking the deployment history, and downloaded the required one (after downloading you can see its details; in my case I checked that the Terraform version matched).
Then I uploaded that old tfstate file to the same location from which I had deleted the conflicting tfstate file.
On resuming the deployment, I was getting an error like the one below:
Error refreshing state: state data in S3 does not have the expected content.
This may be caused by unusually long delays in S3 processing a previous state
update. Please wait for a minute or two and try again. If this problem
persists, and neither S3 nor DynamoDB are experiencing an outage, you may need
to manually verify the remote state and update the Digest value stored in the
DynamoDB table to the following value: b55*****************************
This means a digest value is already present for the previous tfstate lock entry and it needs to be updated with this new value; it can be found under DynamoDB > table > view table details.
On resuming the deployment in Spinnaker, I was able to complete it. (Exceptional case: in my situation the latest pipeline included changes that destroyed an unused resource which had been created using a different provider, so I first had to revert the provider; then, on resume, I was able to successfully deploy the changes.)

Force Terraform to install providers from local disk only, disabling Terraform Registry

Since 1995, we have used an update mechanism which
cleanly updates and removes software
centrally stores all software meta-data internally to manage needs and artifacts from a single source of truth
NEVER triggers itself arbitrarily.
While we understand terraform has begun reaching out to a registry in a brave reinvention of that wheel without any of those features, we wish to disable it completely. Our current kit includes only one plugin:
terraform-0.13.0-1.el7.harbottle.x86_64
golang-github-terraform-provider-vsphere-1.13.0-0.1.x86_64
The goal is
never check the registry
return an error if the given module is not installed
and I'd be very grateful for good suggestions toward that end. Is there a setting I've overlooked, or can we fake it by telling it to look somewhere empty? Is there a -stay-in-your-lane switch?
Clarification:
The add-on package is a go-build package which delivers a single artifact, /usr/bin/terraform-provider-vsphere, and nothing else. This has worked wonderfully for all previous incarnations and may only have begun to act up in v0.13.
Update: These things failed:
terraform init -plugin-dir=/dev/shm
terraform init -get-plugins=false
terraform init -get=false
setting terraform::required_providers::vsphere::source=""
echo "disable_checkpoint = true" > ~/.terraformrc
$ terraform init -get-plugins=false
Initializing the backend...
Initializing provider plugins...
- Finding latest version of -/vsphere...
- Finding latest version of hashicorp/vsphere...
Update: I'm still a bit off:
rpm -qlp golang-github-terraform-provider-vsphere
/usr/share/terraform/plugins/registry.terraform.io/hashicorp/vsphere/1.14.0/linux_amd64/terraform-provider-vsphere
I feel I'm really close. /usr/share/ is in the XDG default search path, and it DOES seem to find the location, but it seems to check the registry first/at-all, which is unexpected.
Initializing provider plugins...
- Finding latest version of hashicorp/vsphere...
- Finding latest version of -/vsphere...
- Installing hashicorp/vsphere v1.14.0...
- Installed hashicorp/vsphere v1.14.0 (unauthenticated)
Error: Failed to query available provider packages
Are we sure it stops checking if it has something local, and that it does that by default? Did I read that right?
What you are describing here sounds like the intention of the Provider Installation settings in Terraform's CLI configuration file.
Specifically, you can put your provider files in a local filesystem directory of your choice -- for the sake of this example, I'm going to arbitrarily choose /usr/local/lib/terraform, and then write the following in the CLI configuration file:
provider_installation {
  filesystem_mirror {
    path = "/usr/local/lib/terraform"
  }
}
If you don't already have a CLI configuration file, you can put this in the file ~/.terraformrc.
With the above configuration, your golang-github-terraform-provider-vsphere-1.13.0-0.1.x86_64 package would need to place the provider's executable at the following path (assuming that you're working with a Linux system):
/usr/local/lib/terraform/registry.terraform.io/hashicorp/vsphere/1.13.0/linux_amd64/terraform-provider-vsphere_v1.13.0_x4
(The filename above is the one in the official vSphere provider release, but if you're building this yourself from source then it doesn't matter what exactly it's called as long as it starts with terraform-provider-vsphere.)
It looks like you are in the process of completing an upgrade from Terraform v0.12, and so Terraform is also trying to install the legacy (un-namespaced) version of this provider, -/vsphere. Since you won't have that in your local directory the installation of that would fail, but with the knowledge that this provider is now published at hashicorp/vsphere we can avoid that by manually migrating it in the state, thus avoiding the need for Terraform to infer this automatically on the next terraform apply:
terraform state replace-provider 'registry.terraform.io/-/vsphere' 'registry.terraform.io/hashicorp/vsphere'
After you run this command your latest state snapshot will not be compatible with Terraform 0.12 anymore, so if you elect to abort your upgrade and return to 0.12 you will need to restore the previous version from a backup. If your state is not stored in a location that naturally retains historical versions, one way to get such a backup is to run terraform state pull with a Terraform 0.12 executable and save the result to a file. (By default, Terraform defers taking this action until terraform apply to avoid upgrading the state format until it would've been making other changes anyway.)
The provider_installation configuration above is the answer if you want to make this true for all future use of Terraform, which seems to be your goal here, but for completeness I also want to note that the following command should behave in a way equivalent to the above configuration if you want to force a local directory only for one particular invocation of terraform init:
terraform init -plugin-dir=/usr/local/lib/terraform
Since you seem to be upgrading from Terraform 0.12, it might also interest you to know that Terraform 0.13's default installation behavior (without any special configuration) is the same as Terraform 0.12 with the exception of now expecting a different local directory structure than before, to represent the hierarchical provider namespace. (That is, to distinguish hashicorp/vsphere from a hypothetical othernamespace/vsphere.)
Specifically, Terraform 0.13 (as with Terraform 0.12) will skip contacting the remote registry for any provider for which it can discover at least one version available in the local filesystem.
It sounds like your package representing the provider was previously placing a terraform-provider-vsphere executable somewhere that Terraform 0.12 could find and use it. You can adapt that strategy to Terraform 0.13 by placing the executable at the following location:
/usr/local/share/terraform/plugins/registry.terraform.io/hashicorp/vsphere/1.13.0/linux_amd64/terraform-provider-vsphere_v1.13.0_x4
(Again, the exact filename here isn't important as long as it starts with terraform-provider-vsphere.)
/usr/local/share here is assuming one of the default data directories from the XDG Base Directory specification, but if you have XDG_DATA_HOME/XDG_DATA_DIRS overridden on your system then Terraform should respect that and look in the other locations you've listed.
The presence of such a file, assuming you haven't overridden the default behavior with an explicit provider_installation block, will cause Terraform to behave as if you had written the following in the CLI configuration:
provider_installation {
  filesystem_mirror {
    path    = "/usr/local/share/terraform/plugins"
    include = ["hashicorp/vsphere"]
  }
  direct {
    exclude = ["hashicorp/vsphere"]
  }
}
This form of the configuration forces local installation only for the hashicorp/vsphere provider, thus mimicking what Terraform 0.12 would've done with a local plugin file terraform-provider-vsphere. You can get the more thorough behavior of never contacting remote registries with a configuration like the one I opened this answer with, which doesn't include a direct {} block at all.

How do you fix Terraform unsupported attribute "ses_smtp_password" after upgrading to 0.13?

After upgrading I was getting messages like the following when running a terraform plan:
Error: Invalid resource instance data in state
on iam_server_backup.tf line 4:
4: resource "aws_iam_access_key" "backup" {
Instance aws_iam_access_key.backup data could not be decoded from
the state: unsupported attribute "ses_smtp_password".
The way I fixed it was by removing the resource from the state (terraform state rm aws_iam_access_key.backup). However, that created new access keys when I ran terraform apply, which was time-consuming because I had to change all my access keys in all my apps. Is there a better way to fix this issue?
I encountered this same unsupported attribute "ses_smtp_password" issue when I updated my AWS Terraform provider, and was able to fix it by manually downloading and modifying the state (a scripted version of the same edit follows after these steps):
terraform state pull > state.json
now edit state.json and
remove any line with ses_smtp_password
increment the serial attribute (e.g. "serial": 21, -> "serial": 22,)
save
terraform state push state.json
terraform plan # should work again
optional, but makes it so you can't accidentally commit your state file
rm state.json
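The same edit can be scripted. This is only a rough sketch, assuming jq is installed and that the offending attribute is literally named ses_smtp_password on every affected instance:

terraform state pull > state.json

# Drop the removed attribute everywhere and bump the serial so the push is accepted.
jq 'del(.resources[].instances[].attributes.ses_smtp_password) | .serial += 1' \
  state.json > state.fixed.json

terraform state push state.fixed.json
terraform plan

rm state.json state.fixed.json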
I too just faced this issue and wanted to provide a solution that does not require mucking with states.
The solution to this is to upgrade the AWS provider to ~> 3.0 before upgrading to Terraform 0.13. This caused the ses_smtp_password field to be removed from the state, which then made upgrading to Terraform 0.13 possible without issue.
Unfortunately I do not understand how this worked; I am guessing there is a change in TF 0.13 that raises an exception over the removed deprecated property, since TF 0.12 was using the same version of the provider that 0.13 was using.
This error is not related to your upgrade to Terraform 0.13, it is actually due to upgrading from version 2.x of the AWS Terraform provider to version 3.x. As stated in the AWS Terraform provider 3.0 upgrade guide you need to switch from using ses_smtp_password to ses_smtp_password_v4.
The reason for this change is that SES will stop accepting the older type of password in October 2020, so you have to upgrade to the password that uses version 4 signatures before then.
As you've seen, you need to delete the old passwords from your Terraform state, and let Terraform generate new ses_smtp_password_v4 passwords.
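In the configuration itself, that means referencing the new attribute wherever the old one was used. A minimal sketch, assuming a hypothetical aws_iam_user.backup and that the password is exposed as an output:

resource "aws_iam_access_key" "backup" {
  user = aws_iam_user.backup.name
}

output "ses_smtp_password" {
  # SigV4-based SMTP password; the pre-3.0 ses_smtp_password attribute no longer exists.
  value     = aws_iam_access_key.backup.ses_smtp_password_v4
  sensitive = true
}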
I had the same problem with the azurerm provider. It was pinned to version ~> 1.44, and suddenly, after updating to Terraform 0.13, unsupported attribute errors started to appear.
What solved it for me (without the need to upgrade the provider):
Run terraform 0.13upgrade; this will create versions.tf with provider version constraints in the new format, i.e.:
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 1.44"
    }
  }
  required_version = ">= 0.13"
}
Now, change source = "hashicorp/azurerm" to source = "-/azurerm". This should make terraform use the old provider version.
For the AWS provider, you should have something very similar, with a different provider name and version.

Terraform CLI version lockdown

As some of you might already know, when you execute the Terraform CLI, it is really finicky about which Terraform version was used to execute the last plan.
For example, I am on Terraform v0.11.8 and haven't updated my Terraform in a while. A new employee comes onboard and installs the latest Terraform without knowing that our infrastructure is deployed using v0.11.8. The new hire mistakenly executes Terraform with their new version, and now I won't be able to execute with my v0.11.8.
Is there a feature in the Terraform CLI that reads in the state file, sees that it was executed with v0.11.8, and tells the caller that they're not on the correct version?
You can use the required_version parameter in the terraform configuration block:
terraform {
  required_version = "=0.11.8"
}
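If you want the team to be able to pick up patch releases while still refusing anything outside the 0.11 line, a pessimistic constraint also works:

terraform {
  # Accepts 0.11.8, 0.11.9, ... but rejects 0.12 and anything older than 0.11.8.
  required_version = "~> 0.11.8"
}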
