Pattern for templating arguments for existing Terraform resources

I'm using the Terraform GitHub provider to define GitHub repositories for an internal GitHub Enterprise instance (although the question isn't provider-specific).
The existing github_repository resource works fine, but I'd like to be able to set non-standard defaults for some of its arguments, and easily group other arguments under a single new argument.
e.g.:
- github_repository's private value defaults to false, but I'd like it to default to true
- Many repos will only want to allow squash merges, so I'd like a squash_merge_only parameter which sets allow_squash_merge = true, allow_rebase_merge = false, and allow_merge_commit = false
There are more cases but these illustrate the point. The intention is to make it simple for people to configure new repos correctly and to avoid having large amounts of config repeated across every repo.
I can achieve this by passing variables into a custom module, e.g. something along the lines of:
Foo/custom_repo/main.tf:
resource "github_repository" "custom_repo" {
name = ${var.repo_name}
private = true
allow_squash_merge = true
allow_merge_commit = ${var.squash_merge_only ? false : true}
allow_rebase_merge = ${var.squash_merge_only ? false : true}
}
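For completeness, the module would also need matching variable declarations, e.g. in Foo/custom_repo/variables.tf (the bool default here is an assumption):
variable "repo_name" {
  type = string
}
variable "squash_merge_only" {
  type    = bool
  default = false
}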
Foo/main.tf:
provider "github" {
...
}
module "MyRepo_module" {
source = "./custom_repo"
repo_name = "MyRepo"
squash_merge_only = true
}
This is a bit rubbish though, as I have to add a variable for every other argument on github_repository that people using the custom_repo module might want to set (which is basically all of them - I'm not trying to restrict what people are allowed to do) - see name vs. repo_name in the example. This all then needs documenting separately, which is also a shame given that there are good docs for the existing provider.
Is there a better pattern for reusing existing resources like this but having some control over how arguments get passed to them?

We created an opinionated module (terraform 0.12+) for this at https://github.com/mineiros-io/terraform-github-repository
We set all the defaults to values we think fit best, but basically you can create a set of local defaults and reuse them when calling the module multiple times.
Fun fact: your desired defaults are already the module's defaults right away, but to be clear about how to set those explicitly, here is an example (untested):
locals {
  my_defaults = {
    # actually already the module's default: create private repositories
    private = true
    # also the module's default already; all other merge strategies are disabled by default
    allow_squash_merge = true
  }
}
module "repository" {
  source  = "mineiros-io/repository/github"
  version = "0.4.0"
  name     = "my_new_repository"
  defaults = local.my_defaults
}
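Since the defaults live in a plain local value, a second module call can reuse them; a sketch along the same lines (repository name made up):
module "another_repository" {
  source  = "mineiros-io/repository/github"
  version = "0.4.0"
  name     = "my_other_repository"
  defaults = local.my_defaults
}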
Not all arguments are supported as defaults yet, but most are: https://github.com/mineiros-io/terraform-github-repository#defaults-object-attributes


Determining whether a value is known at plan-time

Terraform allows values to be marked as "unknown" during the plan step, since many values may only be known after apply of certain resources.
Is there any way, during the plan step, to check if a value is known or unknown?
Specifically, I'd like to be able to do something like this:
locals {
  foo = "hello world"
  bar = uuid()
}
output "foo_known" {
  value = knownatplan(local.foo)
}
output "bar_known" {
  value = knownatplan(local.bar)
}
Outputs:
foo_known = true
bar_known = false
Where knownatplan would be a function, or some sort of other mechanism, to determine if the value is known at plan time.
There is no mechanism to do this because to do so would break an assumption that Terraform relies on to do its work: that replacing unknown values with known values during the final apply step can only add information, never change information. The above guarantee is fundamental to the concept of planning a change before taking any side-effects.
Other systems whose language doesn't have this mechanism can only provide a hypothetical "dry run" of changes that may not be complete or accurate, whereas Terraform aims to make the additional promise that it will either make the changes as shown in the plan or return an error explaining why it cannot. Any situation where applying the plan succeeds but generates a result other than what the plan reported is always considered to be a bug, either in Terraform Core itself or in the relevant provider. Unknown values are a big part of how Terraform keeps that promise.
I wrote about this in more detail in a blog article Unknown Values: The Secret to Terraform Plan.
I had a little bit of time now and could implement a cool "feature" the inventors of Terraform would probably not much like. ^-^
I am the author of terraform-provider-value (github) (terraform registry), and there I have two resources called value_is_fully_known (link) and value_is_known (link). Sounds good? Just look in the documentation, or look here for some examples.
Fundamentally, they allow you to have a true or false for a value that might be "(known after apply)" before you actually apply! Here is an example:
terraform {
  required_providers {
    value = {
      source  = "pseudo-dynamic/value"
      version = "0.5.1"
    }
  }
}
locals {
  foo = "hello world"
  bar = uuid()
}
resource "value_unknown_proposer" "default" {}
resource "value_is_known" "foo" {
  value            = local.foo
  guid_seed        = "foo"
  proposed_unknown = value_unknown_proposer.default.value
}
resource "value_is_known" "bar" {
  value            = local.bar
  guid_seed        = "bar"
  proposed_unknown = value_unknown_proposer.default.value
}
output "foo_known" {
  value = value_is_known.foo.result
}
output "bar_known" {
  value = value_is_known.bar.result
}
which results in:
Outputs:
bar_known = false
foo_known = true
Before the actual apply?
Concretely, you run through two phases: one I call the plan-phase, and the other the apply-phase, which itself includes another, independent and implicit plan-phase.
The plan-phase is what you see when Terraform prints the plan, the potential results (most of the time with a lot of "known after apply" values) and the message "Terraform will perform the following actions:".
The apply-phase, on the other hand, is when all values are calculated, from the view of the provider author, Terraform, or you, when you see "Apply complete!".
It was very challenging to implement an acceptable solution, because during the plan-phase the provider has no chance to save anything, as nothing is persisted. Only changes made in the implicit plan-phase are stored, and only when the apply-phase is successful.
My thoughts
Knowing before you apply whether a value is known or unknown can enable some cool workflows in Terraform, but they should be very limited. I can only encourage you to use this mechanism as rarely as possible. But if you have some cool usages for it, would you mind sharing them? Feel free to open a pull request to add an example, or start a discussion.
I appreciate any form of feedback (constructive criticism, bugs and ideas) a lot. =)

How to solve for_each + "Terraform cannot predict how many instances will be created" issue?

I am trying to create a GCP project with this:
module "project-factory" {
source = "terraform-google-modules/project-factory/google"
version = "11.2.3"
name = var.project_name
random_project_id = "true"
org_id = var.organization_id
folder_id = var.folder_id
billing_account = var.billing_account
activate_apis = [
"iam.googleapis.com",
"run.googleapis.com"
]
}
After that, I am trying to create a service account, like so:
module "service_accounts" {
source = "terraform-google-modules/service-accounts/google"
version = "4.0.3"
project_id = module.project-factory.project_id
generate_keys = "true"
names = ["backend-runner"]
project_roles = [
"${module.project-factory.project_id}=>roles/cloudsql.client",
"${module.project-factory.project_id}=>roles/pubsub.publisher"
]
}
To be honest, I am fairly new to Terraform. I have read a few answers on the topic (this and this) but I am unable to understand how that would apply here.
I am getting the error:
│ Error: Invalid for_each argument
│
│ on .terraform/modules/pubsub-exporter-service-account/main.tf line 47, in resource "google_project_iam_member" "project-roles":
│ 47: for_each = local.project_roles_map_data
│ ├────────────────
│ │ local.project_roles_map_data will be known only after apply
│
│ The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the
│ -target argument to first apply only the resources that the for_each depends on.
Looking forward to learn more about Terraform through this challenge.
With only parts of the configuration visible here I'm guessing a little bit, but let's see. You mentioned that you'd like to learn more about Terraform as part of this exercise, so I'm going to go into a lot of detail about the chain here to explain why I'm recommending what I'm going to recommend, though you can skip to the end if you find this extra detail uninteresting.
We'll start with that first module's definition of its project_id output value:
output "project_id" {
value = module.project-factory.project_id
}
module.project-factory here is referring to a nested module call, so we need to look one level deeper in the nested module terraform-google-modules/project-factory/google//modules/core_project_factory:
output "project_id" {
value = module.project_services.project_id
depends_on = [
module.project_services,
google_project.main,
google_compute_shared_vpc_service_project.shared_vpc_attachment,
google_compute_shared_vpc_host_project.shared_vpc_host,
]
}
Another nested module call! 😬 That one declares its project_id like this:
output "project_id" {
description = "The GCP project you want to enable APIs on"
value = element(concat([for v in google_project_service.project_services : v.project], [var.project_id]), 0)
}
Phew! 😅 Finally an actual resource. This expression in this case seems to be taking the project attribute of a google_project_service resource instance, or potentially taking it from var.project_id if that resource was disabled in this instance of the module. Let's have a look at the google_project_service.project_services definition:
resource "google_project_service" "project_services" {
for_each = local.services
project = var.project_id
service = each.value
disable_on_destroy = var.disable_services_on_destroy
disable_dependent_services = var.disable_dependent_services
}
project here is set to var.project_id, so it seems like either way this innermost project_id output just reflects back the value of the project_id input variable, so we need to jump back up one level and look at the module call to this module to see what that was set to:
module "project_services" {
source = "../project_services"
project_id = google_project.main.project_id
activate_apis = local.activate_apis
activate_api_identities = var.activate_api_identities
disable_services_on_destroy = var.disable_services_on_destroy
disable_dependent_services = var.disable_dependent_services
}
project_id is set to the project_id attribute of google_project.main:
resource "google_project" "main" {
name = var.name
project_id = local.temp_project_id
org_id = local.project_org_id
folder_id = local.project_folder_id
billing_account = var.billing_account
auto_create_network = var.auto_create_network
labels = var.labels
}
project_id here is set to local.temp_project_id, which is declared further up in the same file:
temp_project_id = var.random_project_id ? format(
  "%s-%s",
  local.base_project_id,
  random_id.random_project_id_suffix.hex,
) : local.base_project_id
This expression includes a reference to random_id.random_project_id_suffix.hex, and .hex is a result attribute from random_id, and so its value won't be known until apply time due to how that random_id resource type is implemented. (It generates a random value during the apply step and saves it in the state so it'll stay consistent on future runs.)
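If you want to see that behavior in isolation, here is a minimal (hypothetical) reproduction:
resource "random_id" "suffix" {
  byte_length = 2
}
output "suffix_hex" {
  # Reported as (known after apply) on the initial plan, because the
  # random bytes are only generated during the apply step.
  value = random_id.suffix.hex
}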
This means that (after all of this indirection) module.project-factory.project_id in your module is not a value defined statically in the configuration, and might instead be decided dynamically during the apply step. That means it's not an appropriate value to use as part of the instance key of a resource, and thus not appropriate to use as a key in a for_each map.
Unfortunately the use of for_each here is hidden inside this other module terraform-google-modules/service-accounts/google, and so we'll need to have a look at that one too and see how it's making use of the project_roles input variable. First, let's look at the specific resource block the error message was talking about:
resource "google_project_iam_member" "project-roles" {
for_each = local.project_roles_map_data
project = element(
split(
"=>",
each.value.role
),
0,
)
role = element(
split(
"=>",
each.value.role
),
1,
)
member = "serviceAccount:${google_service_account.service_accounts[each.value.name].email}"
}
There's a couple somewhat-complex things going on here, but the most relevant thing for what we're looking at here is that this resource configuration is creating multiple instances based on the content of local.project_roles_map_data. Let's look at local.project_roles_map_data now:
project_roles_map_data = zipmap(
  [for pair in local.name_role_pairs : "${pair[0]}-${pair[1]}"],
  [for pair in local.name_role_pairs : {
    name = pair[0]
    role = pair[1]
  }]
)
A little more complexity here that isn't super important to what we're looking for; the main thing to consider here is that this is constructing a map whose keys are built from element zero and element one of local.name_role_pairs, which is declared directly above, along with local.names that it refers to:
names           = toset(var.names)
name_role_pairs = setproduct(local.names, toset(var.project_roles))
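To make that concrete: for your call, with names = ["backend-runner"] and a hypothetical final project ID of my-project-abcd (the random-suffixed ID that is unknown at plan time), these locals would evaluate to roughly:
name_role_pairs = [
  ["backend-runner", "my-project-abcd=>roles/cloudsql.client"],
  ["backend-runner", "my-project-abcd=>roles/pubsub.publisher"],
]
project_roles_map_data = {
  "backend-runner-my-project-abcd=>roles/cloudsql.client" = {
    name = "backend-runner"
    role = "my-project-abcd=>roles/cloudsql.client"
  }
  "backend-runner-my-project-abcd=>roles/pubsub.publisher" = {
    name = "backend-runner"
    role = "my-project-abcd=>roles/pubsub.publisher"
  }
}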
So what we've learned here is that the values in var.names and the values in var.project_roles both contribute to the keys of the for_each on that resource, which means that neither of those variable values should contain anything decided dynamically during the apply step.
However, we've also learned (above) that the project and role arguments of google_project_iam_member.project-roles are derived from the prefixes of elements in the two lists you provided as names and project_roles in your own module call.
Let's return back to where we started then, with all of this extra information in mind:
module "service_accounts" {
source = "terraform-google-modules/service-accounts/google"
version = "4.0.3"
project_id = module.project-factory.project_id
generate_keys = "true"
names = ["backend-runner"]
project_roles = [
"${module.project-factory.project_id}=>roles/cloudsql.client",
"${module.project-factory.project_id}=>roles/pubsub.publisher"
]
}
We've learned that names and project_roles must both contain only static values decided in the configuration, and so it isn't appropriate to use module.project-factory.project_id because that won't be known until the random project ID has been generated during the apply step.
However, we also know that this module is expecting the prefix of each item in project_roles (the part before the =>) to be a valid project ID, so there isn't any other value that would be reasonable to use there.
Therefore we're at a bit of an impasse: this second module has a rather awkward design decision in that it's trying to derive both a local instance key and a reference to a real remote object from the same value, and those two situations have conflicting requirements. But this isn't a module you created, so you can't easily modify it to address that design quirk.
Given that, I see two possible approaches to move forward, neither ideal but both workable with some caveats:
You could take the approach the error message offered as a workaround, asking Terraform to plan and apply the resources in the first module alone first, and then plan and apply the rest on a subsequent run once the project ID is already decided and recorded in the state:
terraform apply -target=module.project-factory
terraform apply
Although it's annoying to have to do this initial create in two steps, it does at least only matter for the initial creation of this infrastructure. If you update it later then you won't need to repeat this two-step process unless you've changed the configuration in a way that requires generating a new project ID.
While working through the above we saw that this approach of generating and returning a random project ID was optional based on that first module's var.random_project_id, which you set to "true" in your configuration. Without that, the project_id output would be just a copy of your given name argument, which seems to be statically defined by reference to a root module variable.
Unless you particularly need that random suffix on your project ID, you could leave random_project_id unset and thus just get the project ID set to the same static value as your var.project_name, which should then be an acceptable value to use as a for_each key.
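In configuration terms, that second option is just your original call without the random suffix (a sketch):
module "project-factory" {
  source  = "terraform-google-modules/project-factory/google"
  version = "11.2.3"
  # random_project_id omitted: project_id is then derived statically
  # from name, so it is known at plan time.
  name            = var.project_name
  org_id          = var.organization_id
  folder_id       = var.folder_id
  billing_account = var.billing_account
  activate_apis = [
    "iam.googleapis.com",
    "run.googleapis.com"
  ]
}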
Ideally this second module would be designed to separate the values it's using for instance keys from the values it's using to refer to real remote objects, and thus it would be possible to use the random-suffixed name for the remote object but a statically-defined name for the local object. If this were a module under your control then I would've suggested a design change like that, but I assume the current unusual design of that third-party module (packing multiple values into a single string with a delimiter) is a compromise resulting from wanting to retain backward compatibility with an earlier iteration of the module.

Terraform v0.13 - Check if password or secret provided, use a randomly generated one if not

I'm working to fine-tune some of my Terraform modules, specifically around the google_compute_vpn_tunnel, google_compute_router_interface, and google_compute_router_peer resources. I'd like to make things similar to AWS, where pre-shared keys and tunnel interface IP addresses are randomized by default, but can be overridden by the user (provided they are within a certain range).
The random option is working fine. For example, to create a 20-character random password, I do this:
resource "random_password" "RANDOM_PSK" {
length = 20
special = false
}
But, I only want to use this value if an input variable called vpn_shared_secret was not defined. Seems like this should work:
variable "vpn_shared_secret" {
type = string
default = null
}
locals {
vpn_shared_secret = try(var.vpn_shared_secret, random_password.RANDOM_PSK.result)
}
resource "google_compute_vpn_tunnel" "VPN_TUNNEL" {
shared_secret = local.vpn_shared_secret
}
Instead, it seems to ignore the vpn_shared_secret input variable and just go with the randomly generated one each time.
Is try() the correct way to be doing this? I'm just now learning Terraform if/else and map statements.
How about the coalesce() function? try() is not quite the right tool here: it only falls back when an expression produces an error, and a null value is not an error, so try(var.vpn_shared_secret, ...) just returns the null instead of falling through.
The coalesce function takes any number of arguments and returns the first argument that isn't null or an empty string.
locals {
  vpn_shared_secret = coalesce(var.vpn_shared_secret, random_password.RANDOM_PSK.result)
}
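One caveat: coalesce also skips empty strings. If an empty-string override should ever win, an explicit conditional avoids that edge case:
locals {
  vpn_shared_secret = var.vpn_shared_secret != null ? var.vpn_shared_secret : random_password.RANDOM_PSK.result
}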

How to extend terraform module input variable schema without breaking existing clients?

I have a module with the following input variable:
variable "apsvc_map" {
description = "The App Services sharing the same App Service Plan. Maps an App Service name to its properties."
type = map(object({
identity_ids = list(string),
disabled = bool
}))
}
Now I would like to add a new property to the schema - no_custom_hostname_binding. The new version would be:
variable "apsvc_map" {
description = "The App Services sharing the same App Service Plan. Maps an App Service name to its properties."
type = map(object({
identity_ids = list(string),
disabled = bool
no_custom_hostname_binding = bool
}))
}
And this change can be made backwards compatible in the module code with the help of the try function, because omitting the new property is equivalent to providing it with the false value.
However, terraform treats this schema strictly and would not allow passing an input without the new field:
2020-05-30T15:34:20.8061749Z Error: Invalid value for module argument
2020-05-30T15:34:20.8062005Z
2020-05-30T15:34:20.8062205Z on ..\..\modules\web\main.tf line 47, in module "web":
2020-05-30T15:34:20.8062336Z 47: apsvc_map = {
2020-05-30T15:34:20.8062484Z 48: dfhub = {
2020-05-30T15:34:20.8062727Z 49: disabled = false
2020-05-30T15:34:20.8065156Z 50: identity_ids = [local.identity_id]
2020-05-30T15:34:20.8065370Z 51: }
2020-05-30T15:34:20.8065459Z 52: }
2020-05-30T15:34:20.8065538Z
I understand from the error that terraform complains because I did not specify the value for the new property in the input.
So, there are three solutions:
1. Update all the existing code to add the new property - out of the question.
2. Tag the new version of the module differently and let the new code reference the new tag, while the old code continues to reference the old tag - in the long run this would lead to a proliferation of tags, creating all kinds of bizarre Cartesian multiplications of features in the tag names. Ultimately - out of the question.
3. Relax the input variable schema by commenting out the optional properties and use try in the code.
The last option is not ideal, because the documentation for the module would not list the optional properties. But from the code management perspective - it is the best.
So the question is - can input object properties be defined as optional? Ideally, it should include the default value, but I am OK with the try approach for now.
EDIT 1
I actually thought I could pass unknown properties in the object, but no: once the schema is given, it accepts nothing less and nothing more. So the only backwards-compatible solution is to use map(any) in my case.
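A rough sketch of that map(any) + try approach (untested; the normalization local is hypothetical):
variable "apsvc_map" {
  description = "The App Services sharing the same App Service Plan. Maps an App Service name to its properties."
  type        = map(any)
}
locals {
  # try() falls back to false when a caller omits the new property.
  no_custom_hostname_binding = {
    for name, props in var.apsvc_map :
    name => try(props.no_custom_hostname_binding, false)
  }
}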
Optional arguments in object variables have been suggested for Terraform:
https://github.com/hashicorp/terraform/issues/19898
Unfortunately, as of May 30 2020, there has not been any progress on this.
It is the most upvoted issue on their repo; all we can do is keep upvoting and hopefully it will be implemented soon.
And you are right: the alternatives are just out of the question or plain hackish.
Given your options, your preferences, and the fact that Terraform 0.12 doesn't support and Terraform 0.13 likely won't support optional or default values on objects, I think you have a fourth option:
variable "apsvc_map" {
description = "The App Services sharing the same App Service Plan. Maps an App Service name to its properties."
default = {}
type = map(object({
identity_ids = list(string),
disabled = bool
}))
}
variable "no_custom_hostname_binding" {
description = "Whether or not an App Service should disable hostname binding. Maps an App Service name to an override of the no_custom_hostname_binding property."
type = map(bool)
}
From there, you can use it like this:
lookup(var.no_custom_hostname_binding, local.apsvc_map_key, null)
And declare overrides like this:
no_custom_hostname_binding = {
  "vpc_key" = true
}
in expressions where you need to know that parameter. This is not super-elegant, but without optional parameters, you don't have many good alternatives.
You can follow this pattern to add as many optional overrides as you need and add more later also without breaking clients.
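For instance, a second (hypothetical) override would just be another map variable alongside the first:
variable "always_on" {
  description = "Hypothetical second override. Maps an App Service name to an always_on override."
  type        = map(bool)
  default     = {}
}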

Is it possible to generate a variable name in terraform

So I want to get a value from the Terraform remote state; however, we have a number of different ones, one per environment, on the shared Route53 state.
For a given environment, we want to pull the zone ID out like this:
zone_id = data.terraform_remote_state.route_53.route53_zone_${var.environment}_id
How would I do this, please?
In general, it is not possible to use arbitrary dynamic strings as variable names.
However, in this particular case the outputs from terraform_remote_state are collection values and so you can use the index syntax to access a dynamically-built key from your map value:
data.terraform_remote_state.outputs.route53["route53_zone_${var.environment}_id"]
With that said, if possible I would recommend structuring the output values better, so that the Route53 zone IDs are given as a map keyed by environment and can be obtained in a more intuitive way.
For example, you could make your route53 output be a map of objects whose keys are the environment names:
data.terraform_remote_state.route_53.outputs.route53[var.environment].zone_id
output "route53" {
value = tomap({
production = {
zone_id = aws_route53_zone.production.id
}
staging = {
zone_id = aws_route53_zone.staging.id
}
})
}
Or, if you have a variety of different per-environment settings you could structure it as a single output value that is a map of all of those per environment settings keyed by environment name:
data.terraform_remote_state.route_53.outputs.environments[var.environment].route53_zone_id
output "environments" {
value = tomap({
production = {
ec2_vpc_id = aws_vpc.production.id
route53_zone_id = aws_route53_zone.production.id
}
staging = {
ec2_vpc_id = aws_vpc.staging.id
route53_zone_id = aws_route53_zone.staging.id
}
})
}
This doesn't change anything about the ultimate result, but grouping things by your environment keys in your outputs is likely to make your intent clearer to future maintainers of these configurations.
(You might also consider whether it'd be better to have a separate configuration/state per environment rather than managing them altogether, but that is a big topic in itself.)
