What happens when a resource is removed from a Terraform configuration file?

I am trying to find my way in Terraform. I am going through the documentation and am confused about what exactly happens if a resource is deleted manually from the configuration file and we then run the apply command on the modified configuration.
My understanding is that the state file will still contain the deleted resource, and the resource will still be running on the cloud platform, so terraform apply will not perform any action, but I am not sure.
I would appreciate help clearing up my understanding.
Another related point: what if a resource was changed manually, for example from the cloud console, and we then try to perform an action on that resource from Terraform? What will happen?
Thanks a lot.

First, some background from the documentation at https://www.terraform.io/intro/index.html
Terraform generates an execution plan describing what it will do to reach the desired state, and then executes it to build the described infrastructure. As the configuration changes, Terraform is able to determine what changed and create incremental execution plans which can be applied.
The state mentioned there is saved whenever resources are modified. When you add a resource, it gets created and the state is updated to reflect this. The same goes for removing resources: when you remove a resource from the Terraform configuration by deleting its code or template file, the next terraform apply destroys the real resource and updates the state to reflect the removal. So apply does take action in this case: the resource is still recorded in the state file but is no longer in the configuration, so Terraform plans to destroy it.
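To make the removal case concrete, here is a minimal sketch. It assumes the AWS provider and a hypothetical bucket named aws_s3_bucket.logs that has already been applied once; neither comes from the question. It also shows the removed block, which (in Terraform v1.7 and later) lets you stop managing an object without destroying it.

```hcl
# Hypothetical resource that has been applied once, so it is recorded in state.
# Deleting this block from main.tf and running `terraform plan` makes Terraform
# propose destroying the real bucket, because it is still in state but no
# longer in the configuration.
#
# resource "aws_s3_bucket" "logs" {
#   bucket = "example-logs-bucket" # hypothetical name
# }

# If the goal is to stop managing the bucket WITHOUT deleting it, Terraform
# v1.7 and later accept a `removed` block in place of the resource block:
removed {
  from = aws_s3_bucket.logs

  lifecycle {
    destroy = false # forget the object in state, leave the real bucket alone
  }
}
```

On older versions, running terraform state rm aws_s3_bucket.logs before the next plan should have a similar effect: the object is dropped from state and left running.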
The second question, about resources that have drifted from the state, is a little more involved. When you create a plan against a resource that may have changed, the provider will usually refresh the resources in state, compare them with the configuration, and show you the changes that would be made (i.e. Terraform tries to change the resource back to the state declared in the code).

Related

What does terraform apply/plan refresh-only do?

So I'm a bit confused about what terraform plan -refresh-only is giving me. With just terraform plan, it said it detected changes made outside of Terraform (that was me) and it tried to "correct" them; sadly, correcting these changes requires recreating the resource. However, if I add -refresh-only to the plan, the recreation goes away and it now says it will update the tfstate to match the changes I made manually.
Is my understanding of this correct or are there things I'm missing?
A "normal" terraform plan includes two main behaviors:
Updating the state from the previous run to reflect any changes made outside of Terraform. That's called "refreshing" in Terraform terminology.
Comparing that updated state with the desired state described by the configuration, and, in case of any differences, generating a proposed set of actions to change the real remote objects to match the desired state.
When you create a "refresh-only" plan, you're disabling the second of those, but still performing the first. Terraform will update the state to match changes made outside of Terraform, and then ask you if you want to commit that result as a new state snapshot to use on future runs. Typically the desired result of a refresh-only plan is for Terraform to report that there were no changes outside of Terraform, although Terraform does allow you to commit the result as a new state snapshot if you wish, for example if the changes cascaded from an updated object used as a data resource and you want to save those new results.
A refresh-only plan prevents Terraform from proposing any actions that would change the real infrastructure for that particular plan, but it does not avoid the need to deal with any differences in future plans. If the changes that Terraform is proposing are not acceptable then to move forward you will either need to change the configuration to match your actual desired state (for example, to match the current state of the object you don't want to replace) or change the real infrastructure (outside of Terraform) so it will match your configuration.
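The first option, changing the configuration to match the real object, might look like the sketch below. The resource, AMI ID and instance sizes are hypothetical and only stand in for whatever attribute was edited by hand. The ignore_changes line is an extra alternative not mentioned above, for attributes you intend to keep managing outside Terraform.

```hcl
resource "aws_instance" "web" {
  ami = "ami-0123456789abcdef0" # hypothetical AMI ID

  # Was "t3.micro" in the configuration; edited here to match the value that
  # was set manually in the console, so the next plan proposes no change.
  instance_type = "t3.small"

  lifecycle {
    # Optional: stop proposing changes for attributes that are deliberately
    # managed outside Terraform (tags are only an example).
    ignore_changes = [tags]
  }
}
```

After an edit like this, a normal terraform plan should still report that it detected a change outside of Terraform, but propose no actions.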

Terraform Refresh after manual change

So here's what I'm trying to do
Given I changed a configuration in the load balancer
And I added that to my terraform declaration
When I run a plan there are zero changes which is expected
Do I need to refresh at this point to match my hardware state before applying?
Or when I run an apply this would just update the state?
If you've changed the settings outside of Terraform and you've updated the Terraform configuration to match then indeed there's no extra step to run here: terraform plan should report that it detected the value changed outside of Terraform (assuming you're using Terraform v1.0.0 or later) but then report that it doesn't need to make any changes to match with the configuration.
Note also that in recent Terraform the terraform refresh command is still available but no longer recommended. Instead, you can use terraform apply -refresh-only to get a similar effect but with the opportunity to review the detected changes before creating a new state snapshot. In the situation you've described, a refresh-only apply like this will also allow you to commit the detected change as a new state snapshot so that future terraform plan won't re-report that it detected a change made outside of Terraform, which might avoid your coworkers being confused by this message when they make a later change.

Resolving broken deleted state in terraform

When Terraform tries to deploy something and then times out while the resource is in a state like pending or deleting, the resource eventually reaches successful or deleted in AWS, but this never gets updated in the tf state. When I try to run something again, it errors because the state doesn't match.
Error: error waiting for EC2 Transit Gateway VPC Attachment (tgw-attach-xxxxxxxxx) deletion: unexpected state 'failed', wanted target 'deleted'. last error: %!s(<nil>)
What is the correct way to handle this? Can I do something within terraform to get it to recognise the latest state in AWS? Is it a bug on tf's part?
tl;dr
It's probably less of a bug and more of a design choice.
You should investigate and, if appropriate (e.g. the resource was created or deleted successfully but the state was not updated to match), you could either:
run terraform refresh (or, on recent versions, terraform apply -refresh-only), which will cause Terraform to refresh its state file against what actually exists with the cloud provider
manually reconcile the situation by manipulating the Terraform state with the terraform state command, removing deleted resources or adding created resources
Detail
Unlike CloudFormation, Terraform's approach to 'failures' is to just drop everything and error out, leaving the operator to investigate the issue and attempt to resolve it themselves. As a result, operations which time out are classed as failures, and so the relevant resources are often not updated in Terraform's state.
Terraform does give us some recourse to handle this however. For one, we can manually manipulate Terraform's state file. We can add resources or remove resources from the state file as we like, though this should be done with caution.
We can also ask Terraform to 'refresh' its state, basically comparing the state file to reality. Implicitly this should remove resources which no longer exist, but it will not adopt resources into the state file which were provisioned outside of a successful Terraform run.
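For the case refresh does not cover, a resource that exists in AWS but is missing from state, Terraform v1.5 and later can adopt it with an import block. This is only a sketch: the resource name and every ID below are hypothetical placeholders, and the resource arguments have to match the real attachment.

```hcl
# Adopt an attachment that was created by a run that timed out before the
# state was written. The import ID is the attachment's own identifier.
import {
  to = aws_ec2_transit_gateway_vpc_attachment.example
  id = "tgw-attach-0123456789abcdef0" # hypothetical attachment ID
}

resource "aws_ec2_transit_gateway_vpc_attachment" "example" {
  transit_gateway_id = "tgw-0123456789abcdef0"      # hypothetical
  vpc_id             = "vpc-0123456789abcdef0"      # hypothetical
  subnet_ids         = ["subnet-0123456789abcdef0"] # hypothetical
}
```

The next terraform plan should then list the attachment as "will be imported" rather than "will be created". On older versions the terraform import command does the same job from the CLI.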
As an aside, timeouts relating to the interaction with any service provider are a feature of the relevant Terraform Provider, in this case the AWS Provider. Only the Providers can expose configurable timeouts. For example, the AzureRM Provider lets you configure timeouts, and the AWS Provider does as well for many resources through a timeouts block, though not every resource exposes one, so check the documentation for the resource in question.
Efforts are presumably made to incorporate sensible timeout values, but it's not unusual to see trivial operations take an age to complete properly.
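Where a resource's documentation does list a timeouts block, raising the limits looks roughly like this. The sketch uses aws_instance, which documents create, update and delete timeouts; the AMI ID and the durations are hypothetical, and whether the Transit Gateway attachment resource accepts the same block is something to verify in the provider docs.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-0123456789abcdef0" # hypothetical AMI ID
  instance_type = "t3.micro"

  # Give slow operations more time before Terraform treats them as failed.
  timeouts {
    create = "20m"
    delete = "30m"
  }
}
```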

Is there a way to reuse a terraform script and make changes to it?

I'm new to the Terraform world and I've been assigned the task of creating many Azure configurations with it.
I'm developing a main.tf script (which creates some resources, like a resource group, vnets, a Kubernetes cluster, app services, etc.), and while coding it and running terraform apply, it seems to apply only what changed, in effect doing updates.
Then we deleted the resource group the script had created, and a colleague of mine had to run the same script, with Terraform creating a resource group under another name, since I didn't have a required permission. After that, if I run terraform apply it fails with errors saying that the resource cannot be created because it already exists.
After reading some documentation I found that it might be because of the state:
https://www.terraform.io/docs/state/index.html
Is updating from a script something that only works within a single Terraform session?
Even running terraform refresh doesn't seem to work.
Or maybe I'm just mistaken and there is no way to update some resources.
EDIT: for some reason the state file in the storage only contained a few things; the solution was to delete everything and create it again.
For new resources there is nothing more to it: Terraform creates the resources you declare in the script.
For existing resources, when you change a script that you already deployed with Terraform, it checks the state file to work out which resources need to be updated. If there is no state file (or you deleted it), Terraform treats every resource in the script as new, and if any of those resources already exist, the apply fails because of the existing resources. The terraform refresh command only updates the recorded state of resources that are already in the state file. If the deployment failed and the state file has no resources in it, refresh is not useful.
If someone else ran terraform apply for you because you didn't have access, and now you want to modify that terraform and run it yourself, you need to get the state file that was generated when that other person ran it. You absolutely have to maintain the Terraform state file somewhere, so that it can be accessed on subsequent runs. You should really configure a Terraform backend, instead of using local state files.
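Since the question is about Azure, a backend configuration could look like the sketch below. All names are hypothetical, and the storage account and container are assumed to exist already.

```hcl
terraform {
  # Keep state in an Azure Storage container so that you and your colleague
  # read and write the same state file on every run.
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"     # hypothetical
    storage_account_name = "tfstateexample001"      # hypothetical
    container_name       = "tfstate"                # hypothetical
    key                  = "main.terraform.tfstate" # blob name for this state
  }
}
```

After adding the block, terraform init sets up the backend, and terraform init -migrate-state can copy an existing local state file into it.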
You need to be aware that Terraform stores everything it does in the state file, and refers to that file before every run. A terraform refresh only tells Terraform to refresh the state of the things that are in the state file, it doesn't rebuild the state file from scratch. Understanding Terraform state files is so fundamental to the use of Terraform that you really need to understand this before using it.

Backing up of Terraform statefile

I usually run all my Terraform scripts from a bastion server, and all my code, including the tf statefile, resides on that same server. There was an incident where the machine accidentally went down (hard reboot) and the root filesystem somehow got corrupted. Now my statefile is gone, but my resources still exist and are running. I don't want to run terraform apply again and recreate the whole environment with downtime. What's the best way to recover from this mess, and what can be done so that it doesn't happen again in the future?
I have already taken a look at terraform refresh and terraform import. But are there any better ways to do this ?
"and all my code including the tf statefile resides on the same server."
As you don't have a .backup file, I'm not sure you can recover the statefile smoothly the Terraform way; do let me know if you find a way :). However, there are a few steps you can take to avoid ending up in a situation like this.
The best practice is to keep all your statefiles in remote storage such as S3 or Azure Blob Storage and to configure your backend accordingly, so that each time you create or destroy a stack, Terraform reads and writes the statefile remotely.
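A minimal S3 backend might look like this. The bucket name, key and region are hypothetical, and the bucket is assumed to exist already.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state" # hypothetical bucket name
    key     = "prod/terraform.tfstate"  # path of the state object in the bucket
    region  = "us-east-1"               # hypothetical region
    encrypt = true                      # encrypt the state object at rest
  }
}
```

Turning on versioning for that bucket means every earlier state snapshot stays recoverable, which covers the backup side of the question.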
On top of that, you can take advantage of Terraform workspaces to keep statefiles from getting mixed up across multiple environments. Also consider saving each plan to a file so that previous deployments can be backtracked and versioned:
terraform plan -var-file "" -out "" -target=module.<blue/green>
"what can be done so that this doesn't get repeated in future."
Terraform blue-green deployment is the answer to your question. We implemented this model quite a while ago and it's running smoothly. The whole idea is modularity and reusability: the same templates work for 5 different components with different architectures, without any downtime (the core template remains the same and only the variable files differ).
We take advantage of Terraform modules. We have two modules called blue and green (you can name them anything). At any given point in time, either blue or green is taking traffic. If we have changes to deploy, we bring up the alternate stack based on the state output (targeting the module based on the Terraform state), automatically validate it, then move the traffic to the new stack and destroy the old one.
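The two-module layout described above could be sketched as follows. The module path and the stack_name input are hypothetical; the real template and its variables are not shown in this answer.

```hcl
# Two instances of the same core template; only the inputs differ.
module "blue" {
  source = "./modules/stack" # hypothetical path to the shared template

  stack_name = "blue" # hypothetical input variable declared by the module
}

module "green" {
  source = "./modules/stack"

  stack_name = "green"
}
```

With a layout like this, a command such as terraform plan -target=module.green plans only the idle stack, leaving the one that is taking traffic untouched.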
Here is an article you can keep as a reference. It doesn't exactly reflect what we do, but it is a good place to start.
Please see this blog post, which, unfortunately, illustrates import being the only solution.
If you are still unable to recover the Terraform state, you can generate a blueprint of both the Terraform configuration and the state for specific AWS resources using terraforming, but it takes some manual effort to edit the state so that the resources are managed again. With that state file in place, run terraform plan and compare its output with your infrastructure. It is good to keep remote state, especially in an object store like AWS S3 or a key-value store like Consul. Such backends can support locking the state when multiple operations happen at the same time, and the backup process is also quite simple.

Resources