Update State to Make Converting Plan to module easier?

Can I modify state to convince terraform to only report actual changes and not scope changes?
Goal
Easily verify conversion of plan to module by reducing the delta.
Process
Create a module from a plan
Modify the plan to use the module, passing all of its required variables
Run terraform plan and expect NO changes
Spoiler: all items in the plan get destroyed and recreated because of the shift in scope.
- myplan.aws_autoscaling_group.foo
+ module.mymodule.aws_autoscaling_group.foo
Possible Solution
If I could update state as if I had run an apply without changing infrastructure, then I could run a plan and see just the difference between the actual infrastructure and my plan, with no scope changes.
for item in $(terraform state list); do
  terraform state mv "$item" "module.mymodule.$item"
done
This works, but my plan itself uses another module, so I'm moving a module into a module, and mv on a module doesn't seem to work.
terraform state mv module.ecs_cluster module.mymodule.ecs_cluster
Error moving state: Unexpected value for InstanceType field: "ecs_cluster"
Please ensure your addresses and state paths are valid. No
state was persisted. Your existing states are untouched.

Related

Difference between Data source and Output block in terraform

I am not able to figure out the difference between a data source block and an output block in terms of functionality, because both are used for getting information about a resource, like its id, public_ip, etc. Can anyone please help me understand this? I couldn't find a suitable resource for it.
I have tried searching online for this difference but couldn't find the actual answer.
data essentially represents a dependency on an object that isn't managed by the current Terraform configuration but the current Terraform configuration still needs to make use of it. Mechanically that typically means making a Get or Read request to a specific API endpoint and then exporting the data from that API response in the resulting attributes.
output represents one of the two ways that data can flow from one module into another. variable blocks represent data moving from the parent module into the child module, and output blocks represent data moving from the child module out to the parent.
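As a minimal sketch of that flow (the module path and names here are hypothetical):

# Inside a child module, e.g. ./modules/network/main.tf
variable "vpc_cidr" {
  type = string
}

output "vpc_cidr_block" {
  value = var.vpc_cidr
}

# In the parent (root) module: data flows in through the variable
# and back out through the child module's output
module "network" {
  source   = "./modules/network"
  vpc_cidr = "10.0.0.0/16"
}

output "network_cidr" {
  value = module.network.vpc_cidr_block
}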
There is no strong relationship between these two concepts but one way they sometimes connect is if you use the tfe_outputs data source belonging to the hashicorp/tfe provider, or if you use the terraform_remote_state data source from the terraform.io/builtin/terraform provider. Both of those data sources treat the output values from the root module of some other Terraform configuration as the external object to fetch, and so you can use these as one way to use the results from one configuration as part of another configuration, as long as the second configuration will be run in a context that has access to the state of the first.
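For instance, a terraform_remote_state data source might look like this (the backend settings are hypothetical); the other configuration's root-module output values then become available under its outputs attribute:

data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket name
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

# referenced elsewhere as, e.g., data.terraform_remote_state.network.outputs.vpc_id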

How should Terraform provider handle resource error when it consists of multiple entities?

NOTE: I'm using the v2 SDK.
In my provider my 'resource' isn't a single API call.
My resource is actually multiple 'things'.
For example...
resource "my_resource" "example" {
foo {
...
}
bar {
...
}
baz {
...
}
}
The resource and each of the nested blocks are all separate 'things' that each have their own API calls.
So when 'creating' this resource I need to actually make multiple API calls: one API call to create the resource itself, then an API call to create a 'foo', then another for 'bar', 'baz', etc. Finally, once those nested things are created, I need to call my API one last time to activate my main resource.
The problem I've found is that if there's an error in the creation of one of the nested blocks, I'm finding the state is getting messed up and reflecting the 'planned' diff even though I return an error from the API call as part of the Create step.
I'm interested to know how other people are handling errors in a provider that has a structure like this?
I've tried using Partial(). I've also tried to trigger another Read of each 'thing', and although the final state data looks correct (when printing it as part of a debug run with trace logs), because my 'Create' function has to return an error, the state data from that Read is dropped and the original planned diff is persisted. I've even stopped returning an error altogether and tried to return just the result of the Read, which is successful, and STILL the state reflects the planned diff rather than the modified state after a Read.
Since you mentioned Partial I'm assuming for this answer that you are using the older SDKv2 rather than the modern Terraform provider framework.
The programming model for SDKv2 is for the action functions like Create to receive a mutable value representing the planned values, encapsulated in a schema.ResourceData object, and then the provider will modify that value through that wrapping object to make it describe the object that was actually created (or updated).
Terraform Core itself expects a provider to respond to the "apply" request by returning the closest possible representation of what was actually created in the remote system. If the value is returned without an error then Terraform will require that the object conforms to the plan and will raise an error saying that there's a bug in the provider if not. If the provider returns a value and an error then Terraform Core will propagate that error to the user and save whatever value was returned, as long as it matches the schema of the resource type.
Unfortunately this mismatch in models between Terraform Core and the SDK makes the situation you've described quite tricky: if you don't call d.Set at all in your Create function then by default the SDK will just return whatever values were in the plan, even if parts of it weren't actually created yet. To make your provider behave in the way that Terraform is expecting you'd need to do something like this:
At the beginning of Create, decode all of the nested block data into some local variables of data types that are useful for making the API calls you intend to make. For example, you might at this step decode the data from the ResourceData object into whatever struct types the underlying SDK expects.
Before you take any other actions, use d.Set to remove all of the blocks of the types that will require separate requests each. This means you'll need to pass an empty value of whatever type is appropriate for the type you chose for that block's value.
In your loop where you're gradually creating the separate objects that each block represents, gradually append the results into a growing set of objects representing the blocks you've already successfully created. Each time you add a new item to that set, call d.Set again to reset the attribute representing the appropriate block type to now include the object that you created.
If you get to the end without any errors then your attributes should now again describe all of the objects requested in the configuration and you can return without an error. If you encounter an error partway through then you can return that error and the SDK will automatically also return the partially-updated value encapsulated inside the ResourceData object.
If you return an accurate description of which of the blocks were created and exclude the ones that weren't then on the next plan the SDK logic should notice that some of the blocks declared in the configuration aren't present in the prior state and so it should propose to update the object to include those additional blocks. Your provider's Update function can then follow a similar principle as above to gradually append only the nested objects it successfully created, so that it'll once again return a complete set if successful or a partial set in case of any errors.
SDKv2 was optimized for the more common case where a single resource block represents a single remote API call, and so its default behavior deals with either fully-successful or fully-failed responses. Dealing with partial failure requires more subtlety that is difficult to represent in that SDK's API.
The newer Terraform Plugin Framework has a different design for these operations which separates the request data from the response data, thereby making it less confusing to return only a partial result. The Resource interface has a Create method which has a request object containing the config and the plan and a response object containing a representation of the final state.
It pre-populates the response state with the planned values similarly to SDKv2 to still handle that common case of entirely-failing vs. entirely-succeeding, but it does also allow totally overwriting that default with a locally-constructed object representing a partial result, to better support situations like yours where one resource in Terraform is representing a number of different fallible calls to the underlying API.

How to avoid empty "Objects have changed outside of Terraform"?

Context: even though this question is related to How to avoid "Objects have changed outside of Terraform"? it's not exactly the same.
I can't share my exact TF configuration, but the idea is I'm getting an empty "Objects have changed outside of Terraform" warning message:
$ terraform plan
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply":
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to
these changes.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
No changes. Your infrastructure matches the configuration.
The warning doesn't display any potential changes.
When I copy my current state and then compare it against the new state after running terraform apply --auto-approve, there are no changes either:
diff terraform.tfstate old.tfstate
4c4
< "serial": 25,
---
> "serial": 24,
217d216
< "data.foo.test",
219c218,219
< "data.bar.test2"
---
> "data.bar.test2",
> "data.bar.test2"
It seems the only diff is the ordering of resources in the TF state. Is this a TF bug or something?
$ terraform version
Terraform v0.15.4
Also found related issues on GitHub: https://github.com/hashicorp/terraform/issues/28776
This sort of odd behavior can occur in response to quirks in the way particular providers handle the "refresh" step. For backward compatibility with Terraform providers written using the older SDK designed for Terraform v0.11 and earlier, the plan renderer will suppress certain differences that tend to arise due to limitations/flaws in that SDK, such as a value being null before refresh but "" (empty string) after refresh.
Unfortunately if that sort of change is the only change then it can confuse Terraform in this way, where Terraform can see that there is a difference but then when it tries to render the difference it gets suppressed as legacy SDK noise and so there ends up being no difference to report.
This behavior was refined in later versions of Terraform, and so upgrading to the latest v1.x.y release may avoid the problem, assuming it's one of the quirks that the Terraform team already addressed.
I think the reason why you don't see any change to the state here could be that, since there were no changes to make in response to differences in your configuration, Terraform skipped the "apply" step and thus didn't actually commit the refreshed state. You can force Terraform to treat the refresh changes as something to be committed by creating and applying a refresh-only plan:
terraform apply -refresh-only
Assuming that the provider that's misbehaving here is consistent in how it misbehaves (that is: once it's returned a marginally different value, it will keep returning that value), after applying the refresh-only plan you should no longer see this message, because future refreshes of the same object will not detect any such immaterial differences.
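If you'd like to review what that refresh-only apply will record before committing it, you can first run the plan in refresh-only mode and then apply it:

terraform plan -refresh-only
terraform apply -refresh-only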

Clean up dependencies after Destroy AND schedule

One can (or should) also target a module, and not the whole infrastructure, while using destroy. Something like this:
terraform destroy -target=module.<module_name>
or, better, see what you are going to destroy first:
terraform plan -destroy -target=module.<module_name>
I agree with the above comment.
How do I schedule a destroy after a certain condition?
How do I destroy (or reflect changes to) other dependent modules?

How Terraform incremental changes should be organized?

Could you help me to understand the following?
For example, I have incremental changes to infrastructure, like [A] -> [B] -> [C], where [A] separately can be one server named i, [B] separately can be a second server named j, and [C] separately can be a third server named k. In total there should be 3 servers. Every state can be described as [A] = x, x + [B] = y, y + [C] = z, where x, y, z are intermediate states.
My questions are:
How do I organize incremental infrastructure changes for multiple modules in Terraform without affecting previous modules?
Is it possible to roll back changes in the middle of the chain, e.g. [B], and get the x state, or should we follow the chain from the last module [C] back to the required one in the middle [B]?
At this time Terraform only considers two states[1]: the "current state" (the result of the previous Terraform run along with any "drift" in the mean time) and the "desired state" (described in configuration). Terraform's task is to identify the differences between these two states and determine which API actions are needed to move resources from their current state to the desired state.
This means that any multi-step transition cannot be orchestrated within Terraform alone. In your example, to add server j you would add it alone to the Terraform configuration, and run Terraform to create that server. You can then add server k to the configuration and run Terraform again. To automate this, an external process would need to generate these progressive configuration changes.
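As a concrete sketch of that progression (the resource type and names are hypothetical, chosen only to illustrate the idea):

# First change: the configuration contains only server j
resource "aws_instance" "j" {
  ami           = "ami-00000000"   # hypothetical AMI
  instance_type = "t3.micro"
}

# A later change adds server k alongside j, then terraform apply is run again
resource "aws_instance" "k" {
  ami           = "ami-00000000"
  instance_type = "t3.micro"
}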
An alternative approach -- though not recommended for everyday use, since it can cause confusion in a collaborative environment where others can't easily see how this state was reached -- is to use the -target argument to specify one or more specific resources to operate on. In principle this allows adding both servers j and k to configuration but then using -target with only the resource representing j.
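With the hypothetical resource names above, that would look something like:

terraform apply -target=aws_instance.j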
There is no formal support for rollback in Terraform. Terraform merely sees this as another state transition, "rolling forward". For example, if after creating server k you wish to revert to state [A], you would remove the configuration for server k (by reverting in version control, perhaps) and apply again. Terraform doesn't have any awareness of the fact that this is a "rollback", but it can see that the configuration no longer contains server k and thus know that it needs to be destroyed to reach the desired state.
One of your questions is about "affecting previous modules". In general, if no changes are made to a resource in your configuration (either the config changed, or the resource itself changed outside of Terraform's purview) then Terraform should not try to update it. If it did, that would be considered a bug. However, for larger systems it can be useful to split infrastructure across multiple top-level Terraform configurations that are each applied separately. If a remote backend is in use (recommended for collaborative environments) then the terraform_remote_state data source can be used to access the output values of one configuration from within another, thus allowing the creation of a tree or DAG of Terraform configurations. This adds complexity, so should be weighed carefully, but has the advantage of decreasing the "blast radius" of a particular change by taking unrelated resources out of consideration altogether.
[1] I am using "state" here in the general sense you used it, which is distinct from Terraform's physical idea of "state", a data file that contains a record of which resources existed at the last run.

Resources