Difference between local values and null_data_source for intermediate values in Terraform

I have a situation where I need to store some intermediate values so I can reuse them in other parts of the root module. I know about local values and about null_data_source, but I do not know which one is the recommended option for holding reusable values. Both descriptions look somewhat similar to me:
local values (https://www.terraform.io/docs/configuration/locals.html)
Local values can be helpful to avoid repeating the same values or expressions multiple times in a configuration, but if overused they can also make a configuration hard to read by future maintainers by hiding the actual values used.
and null_data_source (https://www.terraform.io/docs/providers/null/data_source.html)
The primary use-case for the null data source is to gather together collections of intermediate values to re-use elsewhere in configuration:
So both appear to be a valid choice for this scenario.
Here is my example code:
locals {
  my_string_A = "This is string A"
}

data "null_data_source" "my_string_B" {
  inputs = {
    my_string_B = "This is string B"
  }
}

output "my_output_a" {
  value = "${local.my_string_A}"
}

output "my_output_b" {
  value = "${data.null_data_source.my_string_B.outputs["my_string_B"]}"
}
Could you suggest when to use one over the other for holding intermediate values, and what are the pros and cons of each approach?
Thank you

The null_data_source data source was introduced prior to the local values mechanism as an interim solution to meet that use-case before that capability became first-class in the language. It continues to be supported only for backward-compatibility with existing configurations using it.
All new configurations should use the Local Values mechanism instead. It's fully integrated into the Terraform language, supports values of any type (while null_data_source can support only strings), and has a much more concise/readable syntax.
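For example, a local value can hold a structured value directly; a minimal sketch (the names here are purely illustrative):
locals {
  common_tags = {
    Environment = "dev"
    Owner       = "team-a"
  }
  subnet_ids = ["subnet-1", "subnet-2"]
}

output "tags" {
  value = local.common_tags
}
With null_data_source the same data would have to be flattened into strings and re-parsed wherever it is used.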

Related

Where do old, new := d.GetChange() come from in CustomizeDiff and DiffSuppressFunc?

There are two methods:
CustomizeDiff
DiffSuppressFunc
The corresponding objects (schema.ResourceDiff and schema.ResourceData) support old, new := d.GetChange("foo"); however, I'm confused about where these values are coming from.
I've been thinking that
DiffSuppressFunc: func(k, old, new string, d *schema.ResourceData) bool
takes old from TF state and new from the result of running readResource(). What if there's no diff and then the user changes main.tf -- is that the old or the new value?
and for CustomizeDiff:
old, new := d.GetChange("foo")
it seems like new is from TF state / main.tf but old is from readResource().
Where can I read more about this? I always thought that the TF state is old and the response is new, based on the output when Terraform reports drift.
The DiffSuppressFunc abstraction in this old Terraform SDK is unfortunately one of the parts that still retains some outdated assumptions from older versions of Terraform, since it was those older versions that this SDK was originally designed to serve.
Specifically, in Terraform v0.11 and earlier the model of resource state and plan data was a flat map from strings to strings and the SDK internally translated between that and the hierarchical structures described in the schema. Under this model, a list in the provider schema serializes as a bunch of separate entries in the flat map, like example.# giving the number of elements, example.0 giving the first element, etc.
DiffSuppressFunc is one place where that internal implementation detail leaked up into the API, because "diff suppressing" is an operation done against the already-flattened data structure that's describing the changes, and so the schema type information has all been lost.
You shouldn't typically need to worry about exactly what old and new mean because the purpose of DiffSuppressFunc is only to determine whether the two values are functionally equivalent. The function only needs to compare the two and return true if they represent alternative serializations of the same information.
However, if you're curious about the implementation details then you can review the part of the SDK which calls this function.
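For illustration, a common pattern is a DiffSuppressFunc that only compares the two serializations, for example ignoring letter case; this is a sketch against SDKv2's helper/schema package and the attribute name is made up:
"fqdn": {
    Type:     schema.TypeString,
    Optional: true,
    // Suppress the diff when the only difference is letter case,
    // assuming the remote API treats the value case-insensitively.
    DiffSuppressFunc: func(k, old, new string, d *schema.ResourceData) bool {
        return strings.EqualFold(old, new)
    },
},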
CustomizeDiff's behavior is more specialized than DiffSuppressFunc, because it's used for one purpose and one purpose only: adding special behaviors to run during Terraform's "plan" step.
In this case then the old value is always the most recent known value for a particular argument, and the new value starts off being a value from the current configuration but you can override it using SetNew or SetNewComputed methods of ResourceDiff.
To emulate within CustomizeDiff what might normally be done by a DiffSuppressFunc, you'd write logic something like this:
old, new := d.GetChange("foo")
if functionallyEquivalent(old, new) {
    d.SetNew("foo", old)
}
The definition of functionallyEquivalent is for you to write based on your knowledge of the system which you are wrapping with this provider. If foo is a string attribute then you can use type assertions like old.(string) and new.(string) to get the actual string values to compare.
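For example, if the equivalence rule were only about surrounding whitespace, functionallyEquivalent might look roughly like this (a sketch; your real rule depends on the remote system you are wrapping):
func functionallyEquivalent(old, new interface{}) bool {
    // d.GetChange returns interface{} values, so assert the expected
    // string type before comparing.
    oldStr, oldOK := old.(string)
    newStr, newOK := new.(string)
    if !oldOK || !newOK {
        return false
    }
    return strings.TrimSpace(oldStr) == strings.TrimSpace(newStr)
}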
SDKv2 is essentially a legacy system at this point, designed around the behaviors of an obsolete version of Terraform. It's still available primarily to support existing providers which were themselves originally written for those obsolete versions of Terraform.
The new Terraform Plugin Framework is built for modern Terraform and so has fewer "gotchas" resulting from inconsistencies between how the SDK works and how Terraform itself works.
The modern equivalent of CustomizeDiff in the plugin framework is plan modification, and a plan modifier for a string attribute would be an implementation of planmodifier.String.
The new API makes it a bit more explicit where all of these values are coming from: the StringRequest type differentiates between the value from the configuration, the value from the prior state, and the value from the proposed new state, which is the framework's initial attempt to construct a plan prior to any custom modifications in the provider.
Therefore a plan modifier for normalizing a string attribute in a similar manner to DiffSuppressFunc in the old SDK would be:
func (m ExampleStringModifier) PlanModifyString(ctx context.Context, req planmodifier.StringRequest, resp *planmodifier.StringResponse) {
    if functionallyEquivalent(req.StateValue, req.PlanValue) {
        // Preserve the value from the prior state if the
        // new value is equivalent to it.
        resp.PlanValue = req.StateValue
    }
}
Again you'll need to define and implement the exact rule for what functionallyEquivalent means for this particular attribute.
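The modifier is then attached to the attribute in the resource schema, roughly like this (a sketch using the framework's resource/schema package; ExampleStringModifier is the hypothetical type from above and must also implement the Description methods of the planmodifier.String interface):
"foo": schema.StringAttribute{
    Optional: true,
    PlanModifiers: []planmodifier.String{
        // Hypothetical modifier from the example above.
        ExampleStringModifier{},
    },
},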

How can I implement it using DiffSuppressFunc()?

Context: I'm developing a TF Provider.
There's an attribute foo of type string in one of my resources. Different representations of a value of foo can map to the same normalized version, but only the backend can return the normalized version of a value of foo.
When implementing the resource, I was thinking I could store any user value for foo (i.e., not necessarily normalized). I could then leverage DiffSuppressFunc to detect any potential differences. For example, main.tf stores whatever the user wrote (by definition), and TF state could store either the normalized version returned from the backend or the user's version (it doesn't matter much). The biggest challenge is then to differentiate between a structural update (which requires an update) and a syntactic update (which doesn't, since it converts to the same normalized version).
In order to implement this I could use
"foo": {
...
DiffSuppressFunc: func(k, old, new string, d *schema.ResourceData) bool {
// Option #1
normalizedOld := network.GetNormalized(old)
normalizedNew := network.GetNormalized(new)
return normalizedOld == normalizedNew
// Option #2
// Backend also supports a check whether such a value exists already
// and returns such a object
if obj, ok := network.Exists(new); ok { return obj.Id == d.GetObjId(); }
}
}
However, it seems like I can't send network requests in DiffSuppressFunc since it doesn't accept the meta interface{} parameter seen in:
func resourceCreate(ctx context.Context, d *schema.ResourceData, meta interface{})
So I can't access my specific http client (even though I could send some generic network request).
Is there a smart way around this limitation so that meta interface{} can be passed to DiffSuppressFunc? For reference, the SDK describes meta as:
// The interface{} parameter is the result of the Provider type
// ConfigureFunc field execution. If the Provider does not define
// a ConfigureFunc, this will be nil. This parameter is conventionally
// used to store API clients and other provider instance specific data.
//
// The diagnostics return parameter, if not nil, can contain any
// combination and multiple of warning and/or error diagnostics.
ReadContext ReadContextFunc
The intention is that DiffSuppressFunc perform only syntactic normalization that doesn't rely on information from outside the provider. A DiffSuppressFunc should not typically interact with anything outside the provider, because the SDK can call it at various steps and expects it to return a consistent result each time, rather than varying based on the state of the remote system.
If you need to rely on information from the remote system then you'll need to implement the logic you're discussing in the CustomizeDiff function instead. That function is the lowest level of abstraction for diff customization in the SDK but in return for the low level of abstraction it also allows more flexibility than the higher-level built-in behaviors in the SDK.
In the CustomizeDiff function you will have access to meta and so you can make API requests if you need to.
Inside your CustomizeDiff function you can use d.GetChange to obtain both the previous value and the new value from the configuration to use in the same way as the old and new arguments to DiffSuppressFunc.
You can then use d.SetNew to change the planned value for a particular attribute based on what you learned. To approximate what DiffSuppressFunc would do you would call d.SetNew with the value from the prior state -- the "old" value.
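Putting that together, a CustomizeDiff along the lines you're describing might look roughly like this (a sketch; Client and GetNormalized are hypothetical stand-ins for your API client, not real SDK names):
CustomizeDiff: func(ctx context.Context, d *schema.ResourceDiff, meta interface{}) error {
    // meta is whatever your provider's ConfigureContextFunc returned,
    // e.g. an API client.
    client := meta.(*Client)

    old, new := d.GetChange("foo")
    normalizedOld, err := client.GetNormalized(ctx, old.(string))
    if err != nil {
        return err
    }
    normalizedNew, err := client.GetNormalized(ctx, new.(string))
    if err != nil {
        return err
    }

    // If the configured value is just another spelling of what is
    // already recorded, plan to keep the prior value.
    if normalizedOld == normalizedNew {
        return d.SetNew("foo", old)
    }
    return nil
},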
When implementing CustomizeDiff you must respect the consistency rules that apply to all Terraform providers, which include:
When planning initial creation of an object, if the module author provided a specific value in the configuration then you must preserve exactly that value, without normalization.
When planning an update to an existing object, if the module author has provided a specific value in the configuration then you must return either the exact value they wrote without normalization or return exactly the value from the prior state to indicate that the new configuration value is functionally equivalent to the previous value.
When implementing Read there is also a similar consistency rule:
If the value you read from the remote system is not equal to what was in the prior state but the new value is functionally equivalent to the prior state then you must return the value from the prior state to preserve the way the author originally wrote it, rather than the way the remote system normalized it.
All of these rules exist to help ensure that a particular Terraform configuration can converge, which is to say that after running terraform apply it should be possible to immediately run terraform plan and see it report "No changes". If you don't stick to these rules then Terraform may return an explicit error (for problems it's able to detect) or it may just behave strangely due to the provider producing confusing information that doesn't match the assumptions of the protocol.
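As an example of the Read rule, preserving the author's original spelling might look roughly like this (a sketch; Client, ReadFoo, and GetNormalized are hypothetical stand-ins):
func resourceExampleRead(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics {
    client := meta.(*Client)

    remoteFoo, err := client.ReadFoo(ctx, d.Id())
    if err != nil {
        return diag.FromErr(err)
    }

    // If the remote value is only a normalized form of what is already
    // in state, keep the prior value so the configuration still matches.
    prior := d.Get("foo").(string)
    if GetNormalized(remoteFoo) == GetNormalized(prior) {
        remoteFoo = prior
    }

    if err := d.Set("foo", remoteFoo); err != nil {
        return diag.FromErr(err)
    }
    return nil
}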

Shall I use a block or an attribute when designing a Terraform resource?

Context: I'm developing a terraform provider.
I can see that some providers (like AWS) use an attribute (e.g., connection_id) when referencing an ID:
resource "aws_dx_connection_confirmation" "confirmation" {
connection_id = "dxcon-ffabc123"
}
whereas others use blocks:
resource "aws_dx_connection_confirmation" "confirmation" {
connection {
id = "dxcon-ffabc123"
}
}
Is there a specific pattern around it? From what I can see:
Use a block if there are multiple enum-like values (bar, bar_2) and only one of them can be specified:
resource "aws_foo" "temp" {
bar {
id = "dxcon-ffabc123"
}
// bar_2 {
// id = "abcde"
//}
}
Use a block to group multiple related attributes:
resource "aws_devicefarm_test_grid_project" "example" {
name = "example"
vpc_config {
vpc_id = aws_vpc.example.id
subnet_ids = aws_subnet.example.*.id
security_group_ids = aws_security_group.example.*.id
}
}
Use a block when there's a plan to add more attributes to the object that the block represents:
resource "aws_dx_connection_confirmation" "confirmation" {
connection {
id = "dxcon-ffabc123"
// TODO: later on, `name` will be added as a second input option that could be used to identify connection instead of `id`
}
}
I found the Attributes as Blocks doc but it's a bit confusing.
In general, the direct comparison here is between an argument (attribute) with a Terraform map-type value (as distinguished from a Golang map, which can also be used to specify a Terraform block value), and a Terraform block. These are essentially equivalent in that they both allow passing key-value pairs, but there are some differences. Here is a bit of a binary decision tree for which to use:
Is the ideal Terraform value type a map or an object (i.e. can the keys be named almost anything, or should they follow a fixed schema)?
map: attribute
object: block
If the value is changed, does that force a Delete and Create, or is it possible to Update instead?
DC: usually block
U: usually attribute
Is there another resource in the provider that could replace the usage in the current resource (different API usage)? E.g. in your example above, would there be another resource exclusively devoted to assigning the connection?
yes: usually block
no: usually attribute
Is the value multi-level (multiple levels of key-value pairs), or single-level?
single-level: an attribute leads to better code because it is simpler and cleaner
multi-level: a block leads to better code because of the nested block structure
There may be other deciding factors I cannot recall, but these will hopefully guide in the right direction.
If you are developing an entirely greenfield provider, and so you don't need to remain compatible with any existing usage, I would suggest considering using the Terraform Plugin Framework (instead of "SDKv2") which is being designed around the type system and behaviors of modern Terraform, whereas the older SDK was designed for much older Terraform versions which had a much more restrictive configuration language.
In particular, the new framework encourages using the attribute-style syntax exclusively, by allowing you to declare certain attributes as having nested attributes, which then support most of the same internal structures that blocks would allow but using the syntax of assigning a value to a name, rather than the nested block syntax.
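For example, a group of related settings that would traditionally have been a nested block can be declared as a nested attribute (a sketch using the framework's resource/schema package; the attribute names are made up):
"vpc_config": schema.SingleNestedAttribute{
    Optional: true,
    Attributes: map[string]schema.Attribute{
        "vpc_id": schema.StringAttribute{
            Required: true,
        },
        "subnet_ids": schema.ListAttribute{
            ElementType: types.StringType,
            Optional:    true,
        },
    },
},
In configuration this is then written as an assignment, vpc_config = { vpc_id = ..., subnet_ids = [...] }, rather than as a nested vpc_config block.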
The original intent of nested blocks was to represent the sense of declaring a separate object that happened to "belong to" the containing object, rather than declaring an argument of that top-level object. That distinction was murky in practice, since underlying APIs often represent these nested objects as JSON arrays or maps inside the top-level object anyway, and so the additional abstraction of showing them as separate objects ends up hurting rather than helping, because it obscures the nature of the underlying data structure. Particularly if the physical representation of the concept in the API is as a nested data structure inside the containing object, I think it's most helpful to use a comparable data structure in Terraform.
Many uses of nested block types in existing providers are either concessions to backward compatibility or constraints caused by those providers still being written against SDKv2, and thus not having the ability to declare a structured attribute type -- such a concept did not exist in Terraform v0.11 and earlier, which is what SDKv2 was designed for.
The plugin framework does still support declaring blocks, and there are some situations where the nested item really is a separate object that just happens to be conceptually contained within another where the block syntax could still be a reasonable choice. However, my recommendation would be to default to using the attribute syntax in most cases.
At the time I'm writing this, the Plugin Framework is still relatively new and its design not entirely settled. Therefore when considering whether to use it, I suggest to consult Which SDK Should I Use? in order to make an informed decision.

How can a nested block have a label in a custom terraform provider?

I am writing a provider for terraform and I'm trying to work out how to get labels on nested blocks.
https://www.terraform.io/docs/language/syntax/configuration.html#blocks states "A particular block type may have any number of required labels, or it may require none as with the nested network_interface block type."
If I have something like this in my config
resource "myprovider_thing" "test" {
x = 1
settings "set1" {
y = 2
}
}
where settings in the schema has a Type of schema.TypeSet with an Elem of type &schema.Resource. When I run a plan I get told there is an extraneous label and that no labels are expected for the block.
I can't find anything explaining how to set that an element requires a label or how to access it.
Is it possible to have a nested block with a label or am I misunderstanding what is written on the configuration page?
The syntax documentation you referred to is making a general statement about the Terraform language grammar, but the mechanism of labeled blocks is generally reserved for constructs built in to the Terraform language, like resource blocks as we can see in your example.
The usual approach within resource-type-specific arguments is to create map-typed arguments and have the user assign map values to them, rather than using the block syntax. That approach also makes it easier for users to dynamically construct the map, for situations where statically-defined labels are not sufficient, because they can use arbitrary Terraform language expressions to generate the value.
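For example, instead of a labeled settings block, the same information can be modeled as a map argument that users assign like any other value (a sketch following the names in your example):
resource "myprovider_thing" "test" {
  x = 1
  settings = {
    set1 = "value for set1"
    set2 = "value for set2"
  }
}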
The current Terraform SDK is built around the capabilities of all historical versions of Terraform and so it only supports maps of primitive types as a result of design constraints in Terraform v0.11 and earlier. That means there isn't a way to specify a map of objects, which would be the closest analog to a nested block type with a label.
At the time I'm writing this answer there is a new library under active development called Terraform Plugin Framework, which ends support for Terraform versions prior to v0.12 but then in return gets to make use of Terraform features introduced in that release, including the possibility of declaring maps of objects in your provider schema. It remains experimental at the time I'm writing this because the team is still iterating on the best way to represent all of the different capabilities, but for a green-field provider it could be a reasonable foundation if you're willing to respond to any potential breaking changes that might occur on the way to it reaching its first stable release.
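With the framework, the closest analog to a labeled nested block is a map of objects, declared roughly like this (a sketch using the current resource/schema package, which may differ from the experimental API available when this answer was written; the names follow your example):
"settings": schema.MapNestedAttribute{
    Optional: true,
    NestedObject: schema.NestedAttributeObject{
        Attributes: map[string]schema.Attribute{
            "y": schema.Int64Attribute{
                Optional: true,
            },
        },
    },
},
In configuration that would be written as settings = { set1 = { y = 2 } }, with each entry keyed by a user-chosen name much like a block label.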

How to pass the entirety of Hiera data into a script from Puppet?

I would like to transition from using Puppet to plain old scripts. During this transition I would like the scripts to have access to the information in Hiera. Is there a way for Puppet to pass all the key-value pairs to a script as an argument through an exec? If I could get Puppet to pass a JSON blob of Hiera data into a script, that would be perfect.
From experimentation: when my Hiera file contains
{
  "a" : ["a, b"],
  "b" : "b",
  "c" : {
    "a" : {
      "b" : "c"
    }
  }
}
the lookups return:
hiera("a"): "ab"
hiera("b"): "b"
hiera("c"): ""
hiera(""): ""
Ideally I'd like to pass the entire JSON string from all Hiera data sources into my scripts from Puppet's exec. Can anyone confirm whether this is possible, or if there is some workaround?
Is there a way for Puppet to pass all the key-value pairs to a script as an argument through an exec?
Not in general, no, because Hiera resolutions can be context-sensitive. There is no guarantee that "all key value pairs" is well-defined on a whole-node basis. Therefore, before even talking about how the data can be exchanged, you need to deal with the problem of what the data are.
Even if you suppose that none of the Hiera facilities are in use that contextualize data more narrowly than on a node-by-node basis, and that you need be concerned only with priority lookups (not array- or hash-merge lookups), Hiera has no built-in facility for compiling a composite of all data pertaining to a node. If you're dealing only with the standard YAML or JSON back-end, however, you could probably create and use your own hacked version that extracts the desired data, maybe as the value of some special key.
Even then, however, passing the data themselves as a command-line argument is highly questionable. You would need, first, to serialize the data into a form that will be interpreted as a single shell word. That surely can be automated, but next you risk running afoul of argument-length limits. A conforming POSIX system can impose a maximum argument length as small as 4096 bytes (though many have much larger limits) and that could easily be too little. And if you're trying to do this with Windows then know that its limits are even smaller.
As an alternative to passing the data as a command-line argument, you could consider writing them to a file that your script(s) will read. Even that seems a bit silly, however. Hiera has a CLI -- why not just distribute the Hiera data and hierarchy configuration, and have your scripts use Hiera to query the needed data from it?
In general, this is not supposed to work.
You might be able to write a custom Hiera backend that recognizes a special lookup key and returns a fully merged hash of all data from the hierarchy.
I am not sure I have understood the question correctly, but I assume you are talking about templating? You can use templates and put values in the placeholders.
Check out this link to see if it helps.
http://codingbee.net/tutorials/puppet/puppet-generate-files-templates-using-hiera-data/

Resources