How to avoid empty "Objects have changed outside of Terraform"? - terraform

Context: even though this question is related to How to avoid "Objects have changed outside of Terraform"? it's not exactly the same.
I can't share my exact TF configuration, but the idea is that I'm getting an empty "Objects have changed outside of Terraform" warning message:
$ terraform plan
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply":
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to
these changes.
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
No changes. Your infrastructure matches the configuration.
As you can see, the message doesn't display any potential changes.
When I copy my current state and then compare it against the new state after running terraform apply --auto-approve, there are no changes either:
diff terraform.tfstate old.tfstate
4c4
< "serial": 25,
---
> "serial": 24,
217d216
< "data.foo.test",
219c218,219
< "data.bar.test2"
---
> "data.bar.test2",
> "data.bar.test2"
It seems the only diff is the ordering of resources in the TF state. Is this a TF bug or something?
$ terraform version
Terraform v0.15.4
I also found a related issue on GitHub: https://github.com/hashicorp/terraform/issues/28776

This sort of odd behavior can occur in response to quirks in the way particular providers handle the "refresh" step. For backward compatibility with Terraform providers written using the older SDK designed for Terraform v0.11 and earlier, the plan renderer will suppress certain differences that tend to arise due to limitations/flaws in that SDK, such as a value being null before refresh but "" (empty string) after refresh.
Unfortunately, if that sort of change is the only change, it can confuse Terraform in this way: Terraform can see that there is a difference, but when it tries to render the difference it gets suppressed as legacy SDK noise, so there ends up being no difference to report.
This behavior was refined in later versions of Terraform, and so upgrading to the latest v1.x.y release may avoid the problem, assuming it's one of the quirks that the Terraform team already addressed.
I think the reason why you don't see any change to the state here could be that, since there were no changes to make in response to differences in your configuration, Terraform skipped the "apply" step and thus didn't actually commit the refreshed state. You can force Terraform to treat the refresh changes as something to be committed by creating and applying a refresh-only plan:
terraform apply -refresh-only
Assuming that the provider that's misbehaving here is consistent in how it misbehaves (that is, once it has returned a marginally different value it will keep returning that same value), after applying the refresh-only plan you should no longer see this message, because future refreshes of the same object will not detect any such immaterial differences.
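For example, you can preview what the refreshed state would record before committing it; both flags below are standard Terraform CLI options introduced in v0.15.4:
terraform plan -refresh-only     # show what Terraform detected during refresh
terraform apply -refresh-only    # record those values in the state without changing infrastructure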

Related

How should a Terraform provider handle a resource error when the resource consists of multiple entities?

NOTE: I'm using the v2 SDK.
In my provider my 'resource' isn't a single API call.
My resource is actually multiple 'things'.
For example...
resource "my_resource" "example" {
foo {
...
}
bar {
...
}
baz {
...
}
}
The resource and each of the nested blocks are all separate 'things' that each have their own API calls.
So when 'creating' this resource I need to actually make multiple API calls: one API call to create the resource itself, then an API call to create a 'foo', then more API calls for 'bar', 'baz', etc. Finally, once those nested things are created, I need to call my API one last time to activate my main resource.
The problem I've found is that if there's an error in the creation of one of the nested blocks, the state gets messed up and reflects the 'planned' diff, even though I return an error from the API call as part of the Create step.
I'm interested to know how other people handle errors in a provider that has a structure like this.
I've tried using Partial(). I've also tried triggering another Read of each 'thing', but although the final state data looks correct (when printing it as part of a debug run with trace logs), because my 'Create' function has to return an error, the state data that was read is dropped and the original planned diff is persisted. (I've even stopped returning an error altogether and tried to return just the result of the Read, which is successful, and STILL the state reflects the planned diff rather than the modified state after a Read.)
Since you mentioned Partial I'm assuming for this answer that you are using the older SDKv2 rather than the modern Terraform provider framework.
The programming model for SDKv2 is for the action functions like Create to receive a mutable value representing the planned values, encapsulated in a schema.ResourceData object, and then the provider will modify that value through that wrapping object to make it describe the object that was actually created (or updated).
Terraform Core itself expects a provider to respond to the "apply" request by returning the closest possible representation of what was actually created in the remote system. If the value is returned without an error then Terraform will require that the object conforms to the plan and will raise an error saying that there's a bug in the provider if not. If the provider returns a value and an error then Terraform Core will propagate that error to the user and save whatever value was returned, as long as it matches the schema of the resource type.
Unfortunately this mismatch in models between Terraform Core and the SDK makes the situation you've described quite tricky: if you don't call d.Set at all in your Create function then by default the SDK will just return whatever values were in the plan, even if parts of it weren't actually created yet. To make your provider behave in the way that Terraform is expecting you'd need to do something like this:
1. At the beginning of Create, decode all of the nested block data into some local variables of data types that are useful for making the API calls you intend to make. For example, you might at this step decode the data from the ResourceData object into whatever struct types the underlying SDK expects.
2. Before you take any other actions, use d.Set to remove all of the blocks of the types that will require separate requests each. This means you'll need to pass an empty value of whatever type is appropriate for the type you chose for that block's value.
3. In your loop where you're gradually creating the separate objects that each block represents, gradually append the results into a growing set of objects representing the blocks you've already successfully created. Each time you add a new item to that set, call d.Set again to reset the attribute representing the appropriate block type to now include the object that you created.
4. If you get to the end without any errors then your attributes should now again describe all of the objects requested in the configuration and you can return without an error. If you encounter an error partway through then you can return that error, and the SDK will automatically also return the partially-updated value encapsulated inside the ResourceData object, as sketched below.
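Here is a minimal sketch of those four steps for SDKv2, assuming "foo" is a TypeList block and a CreateContext-style function; apiClient, CreateMainResource, CreateFoo, expandFoo and flattenFoo are hypothetical placeholders for your own client and conversion helpers, and only the "foo" block is shown:
import (
    "context"

    "github.com/hashicorp/terraform-plugin-sdk/v2/diag"
    "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

func resourceMyResourceCreate(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
    client := meta.(*apiClient) // hypothetical API client

    // Step 1: decode the planned nested blocks into local values up front.
    plannedFoos := d.Get("foo").([]interface{})

    // Step 2: clear the nested attribute so the saved state starts out
    // describing "nothing created yet" rather than echoing the plan.
    if err := d.Set("foo", []interface{}{}); err != nil {
        return diag.FromErr(err)
    }

    // Create the main object first and record its ID; without an ID the SDK
    // would not persist any state for this resource at all.
    id, err := client.CreateMainResource(ctx, d.Get("name").(string))
    if err != nil {
        return diag.FromErr(err)
    }
    d.SetId(id)

    // Step 3: create the nested objects one at a time, re-setting the
    // attribute after each success so the state always matches reality.
    createdFoos := make([]interface{}, 0, len(plannedFoos))
    for _, raw := range plannedFoos {
        created, err := client.CreateFoo(ctx, id, expandFoo(raw))
        if err != nil {
            // Step 4: returning the error makes the SDK save the partially
            // updated ResourceData, which lists only the foos created so far.
            return diag.FromErr(err)
        }
        createdFoos = append(createdFoos, flattenFoo(created))
        if err := d.Set("foo", createdFoos); err != nil {
            return diag.FromErr(err)
        }
    }

    return nil
}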
If you return an accurate description of which of the blocks were created and exclude the ones that weren't then on the next plan the SDK logic should notice that some of the blocks declared in the configuration aren't present in the prior state and so it should propose to update the object to include those additional blocks. Your provider's Update function can then follow a similar principle as above to gradually append only the nested objects it successfully created, so that it'll once again return a complete set if successful or a partial set in case of any errors.
SDKv2 was optimized for the more common case where a single resource block represents a single remote API call, and so its default behavior deals with either fully-successful or fully-failed responses. Dealing with partial failure requires a level of subtlety that is difficult to represent in that SDK's API.
The newer Terraform Plugin Framework has a different design for these operations which separates the request data from the response data, thereby making it less confusing to return only a partial result. The Resource interface has a Create method which has a request object containing the config and the plan and a response object containing a representation of the final state.
It pre-populates the response state with the planned values similarly to SDKv2 to still handle that common case of entirely-failing vs. entirely-succeeding, but it does also allow totally overwriting that default with a locally-constructed object representing a partial result, to better support situations like yours where one resource in Terraform is representing a number of different fallible calls to the underlying API.
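For comparison, here is a rough sketch of the same idea with the plugin framework (github.com/hashicorp/terraform-plugin-framework/resource); myResource, myResourceModel, its Foos field, and the client's CreateFoo method are all hypothetical:
func (r *myResource) Create(ctx context.Context, req resource.CreateRequest, resp *resource.CreateResponse) {
    var plan myResourceModel
    resp.Diagnostics.Append(req.Plan.Get(ctx, &plan)...)
    if resp.Diagnostics.HasError() {
        return
    }

    // Start from the plan but with no nested objects recorded yet, then
    // append each one only after it has actually been created.
    state := plan
    state.Foos = nil

    for _, f := range plan.Foos {
        created, err := r.client.CreateFoo(ctx, f)
        if err != nil {
            resp.Diagnostics.AddError("error creating foo", err.Error())
            break // fall through and save the partial result below
        }
        state.Foos = append(state.Foos, created)
    }

    // Overwrite the default response state with what was actually created,
    // even if an error was recorded above.
    resp.Diagnostics.Append(resp.State.Set(ctx, &state)...)
}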

How to defer "file" function execution in puppet 3.7

This seems like a trivial question, but in fact it isn't. I'm using puppet 3.7 to deploy and configure artifacts from my project on to a variety of environments. Puppet 5.5 upgrade is on the roadmap, but without an ETA so far.
One of the things I'm trying to automate is the incremental changes to the underlying db. It's not SQL, so standard tools are out of the question. These changes will come in the form of shell scripts contained in a special module, which will also be deployed as an artifact. For each release we want to have a file whose content lists the shell scripts to execute in the scope of that release. For instance, if in version 1.2 we had implemented JIRA-123, JIRA-124 and JIRA-125, I'd like to execute the scripts JIRA-123.sh, JIRA-124.sh and JIRA-125.sh, but none of the other ones that will still be in that module from previous releases.
So my release "control" file would be called something like jiras.1.2.csv and have one line looking like this:
JIRA-123,JIRA-124,JIRA-125
The task for puppet here seems trivial - read the content of this file, split on "," character, and go on to build exec tasks for each of the jiras. The problem is that the puppet function that should help me do it
file("/somewhere/in/the/filesystem/jiras.1.2.csv")
gets executed at the time of building the puppet catalog, not at the time when the catalog is applied. However, since this file is a part of the payload of the release, it's not there yet. It will be downloaded from nexus in a tar.gz package of the release and extracted later. I have an anchor I can hold on to, which I use to synchronize the exec tasks, but can I attach the reading of the file content to that anchor?
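For reference, the compile-time approach described above would look roughly like this (same hypothetical path as before); both file() and split() are evaluated while the catalog is being compiled, which is exactly the problem:
# Evaluated during catalog compilation, not when the catalog is applied:
$jira_ids = split(file('/somewhere/in/the/filesystem/jiras.1.2.csv'), ',')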
Maybe I'm approaching the problem incorrectly? I was thinking the module with the pre-implementation and post-implementation tasks that constitute the incremental db upgrades could be structured so that for each release there's a subdirectory matching the version name, but then I need to list the contents of that subdirectory to build my exec tasks, and that too - at least to my limited puppet knowledge - can't be deferred until a particular anchor.
--- EDITED after one of the comments ---
The problem is that the upgrade to puppet 5.x is beyond my control - it's another team handling this stuff in a huge organisation, so I have no influence over that and I'm stuck on 3.7 for the foreseeable future.
As for what I'm trying to do - for a bunch of different software packages that we develop and release I want to create three new ones: pre-implementation, post-implementation and checks. The first will hold any tasks that are performed prior to releasing new code in our actual packages - typically things like backing up the db. Post-implementation will deal with issues that need to be addressed after we've deployed the new code - an example operation would be to go and modify older data because, for instance, we've changed the type of a column in a table. Checks are just validations performed to make sure the release is 100% correctly implemented - for instance, run a select query and assert on the type of data in the column whose type we've just changed. Today all of these are daunting manual operations performed by whoever is unlucky enough to be doing the release. Above all else, being manual, they are by definition error prone.
The approach taken is that for every JIRA ticket that is part of the release, the responsible developer will have to decide what steps (if any) are needed to release their work, and script them. Puppet is supposed to orchestrate the execution of all of this.

Update State to Make Converting Plan to module easier?

Can I modify state to convince terraform to only report actual changes and not scope changes?
Goal
Easily verify conversion of plan to module by reducing the delta.
Process
1. Create a module from a plan
2. Modify the plan to use the module, passing all of its required variables
3. Run terraform plan and expect NO changes
Spoiler: all items in the plan get destroyed and recreated because of the shift in scope.
- myplan.aws_autoscaling_group.foo
+ module.mymodule.aws_autoscaling_group.foo
Possible Solution
If I could update state as if I had run an apply without changing infrastructure, then I could run a plan and see just the difference between the actual infrastructure and my plan, without the scope changes.
terraform state list
for each item
terraform state mv <oldval> module.mymodule.<oldval>
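A rough shell sketch of that per-resource loop (assuming a Unix shell and that the listed addresses don't need any extra quoting):
terraform state list | while read -r addr; do
  terraform state mv "$addr" "module.mymodule.$addr"
done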
This works, but I'm moving from a plan that uses another module to a module that uses a module, and mv on a module doesn't seem to work.
terraform state mv module.ecs_cluster module.mymodule.ecs_cluster
Error moving state: Unexpected value for InstanceType field: "ecs_cluster"
Please ensure your addresses and state paths are valid. No
state was persisted. Your existing states are untouched.

Clean up dependencies after Destroy AND schedule

One can (or should) also target a module, rather than the whole infrastructure, when using destroy. Something like this:
terraform destroy -target=module.<module_name>
or, better, see what you are going to destroy first:
terraform plan -destroy -target=module.<module_name>
I agree with the above comment.
How do I schedule a destroy after a certain condition is met?
How do I destroy (or reflect changes to) other dependent modules?

How should incremental Terraform changes be organized?

Could you help me understand the following?
For example, I have incremental changes to infrastructure, like [A] -> [B] -> [C], where [A] on its own can be one server named i, [B] on its own can be a second server named j, and [C] on its own can be a third server named k. In total there should be 3 servers. Every state can be described as [A] = x, x + [B] = y, y + [C] = z, where x, y and z are the intermediate states.
My questions are:
How to organize incremental infrastructure changes for multiple modules in Terraform without affecting previous modules?
Is it possible to roll back changes in the middle of the chain, e.g. [B], and get back to state x, or should we follow the chain from the last module [C] back to the required one in the middle, [B]?
At this time Terraform only considers two states[1]: the "current state" (the result of the previous Terraform run along with any "drift" in the mean time) and the "desired state" (described in configuration). Terraform's task is to identify the differences between these two states and determine which API actions are needed to move resources from their current state to the desired state.
This means that any multi-step transition cannot be orchestrated within Terraform alone. In your example, to add server j you would add it alone to the Terraform configuration, and run Terraform to create that server. You can then add server k to the configuration and run Terraform again. To automate this, an external process would need to generate these progressive configuration changes.
An alternative approach -- though not recommended for everyday use, since it can cause confusion in a collaborative environment where others can't easily see how this state was reached -- is to use the -target argument to specify one or more specific resources to operate on. In principle this allows adding both servers j and k to configuration but then using -target with only the resource representing j.
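For example, if server j were declared as aws_instance.j (a hypothetical address), the targeted run would be:
terraform apply -target=aws_instance.j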
There is no formal support for rollback in Terraform. Terraform merely sees this as another state transition, "rolling forward". For example, if after creating server k you wish to revert to state [A], you would remove the configuration for server k (by reverting in version control, perhaps) and apply again. Terraform doesn't have any awareness of the fact that this is a "rollback", but it can see that the configuration no longer contains server k and thus know that it needs to be destroyed to reach the desired state.
One of your questions is about "affecting previous modules". In general, if no changes are made to a resource in your configuration (either the config changed, or the resource itself changed outside of Terraform's purview) then Terraform should not try to update it. If it did, that would be considered a bug. However, for larger systems it can be useful to split infrastructure across multiple top-level Terraform configurations that are each applied separately. If a remote backend is in use (recommended for collaborative environments) then the terraform_remote_state data source can be used to access the output values of one configuration from within another, thus allowing the creation of a tree or DAG of Terraform configurations. This adds complexity, so should be weighed carefully, but has the advantage of decreasing the "blast radius" of a particular change by taking unrelated resources out of consideration altogether.
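As an illustration of that pattern, a consuming configuration might read another configuration's outputs roughly like this (current 0.12+ syntax; the S3 backend, bucket, key and output names are all hypothetical):
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "j" {
  # ... other arguments for the hypothetical server ...
  subnet_id = data.terraform_remote_state.network.outputs.subnet_id
}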
[1] I am using "state" here in the general sense you used it, which is distinct from Terraform's physical idea of "state", a data file that contains a record of which resources existed at the last run.
