Get a list of possible outbound ip addresses in terraform - azure

I'm trying to use the exported attributes of an Azure function app in Terraform to get the possible outbound IP addresses, so that I can add them to a firewall whitelist.
The attribute returned is a comma-separated string of IPs.
I have tried using the split function within Terraform, but it doesn't give a list; it gives an interface which can't be used as a list. I've tried using local values to add square brackets around it, but the result is the same.
Let me just add that this is Terraform 0.11, not 0.12.
resource "azurerm_key_vault" "keyvault" {
name = "${var.project_name}-${var.environment}-kv"
location = "${azurerm_resource_group.environment.location}"
resource_group_name = "${azurerm_resource_group.environment.name}"
enabled_for_disk_encryption = true
tenant_id = "${var.tenant_id}"
sku_name = "standard"
network_acls {
bypass = "AzureServices"
default_action = "Deny"
ip_rules = "${split(",", azurerm_function_app.function.possible_outbound_ip_addresses)}"
}
tags = {
asset-code = "${var.storage_tags["asset_code"]}"
module-code = "${var.storage_tags["module_code"]}"
environment = "${var.environment}"
instance-code = "${var.storage_tags["instance_code"]}"
source = "terraform"
}
}
This comes back with the error "ip_rules must be a list".
Thanks

I think what you are seeing here is a classic Terraform 0.11 design flaw: when a value is unknown at plan time (because it will be decided only during apply), Terraform 0.11 can't properly track the type information for it.
Because possible_outbound_ip_addresses is an unknown value at planning time, the result of split with that string is also unknown. Because Terraform doesn't track type information for that result, the provider SDK code rejects that unknown value because it isn't a list.
Addressing this in Terraform 0.11 requires doing your initial run with the -target argument, so that Terraform can focus on creating the function (and thus allocating its outbound IP addresses) first, and then deal with the processing of that string separately once it's known:
terraform apply -target=azurerm_function_app.function
terraform apply # to complete the rest of the work that -target excluded
Terraform 0.12 addressed this limitation by tracking type information for both known and unknown values, so in Terraform 0.12 the split function would see that you gave it an unknown string and accept that as being correctly typed, and then it would return an unknown list of strings to serve as a placeholder for the result that will be finally determined during the apply phase.
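As a rough sketch of what that looks like after upgrading, the network_acls block from the question could be written in Terraform 0.12 syntax as below. The resource and attribute names are taken from the question; the elided arguments are assumed to stay as they were.

```hcl
resource "azurerm_key_vault" "keyvault" {
  # ... name, location, sku_name and the other arguments from the question ...

  network_acls {
    bypass         = "AzureServices"
    default_action = "Deny"

    # Under 0.12, split() on a not-yet-known string yields an unknown
    # list of strings, which satisfies the list type that ip_rules expects.
    ip_rules = split(",", azurerm_function_app.function.possible_outbound_ip_addresses)
  }
}
```

With this form the two-step -target run should no longer be needed, since the type is known at plan time even though the values are not.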

If var.string is 1.2.3.4,5.6.7.8, then
split(",", var.string)[0] should give you back 1.2.3.4 as a string. Your question is difficult to answer without an example.

Here is an example of how I get a list of possible IPs:
create a data source and then a local value.
app_services = [ "app1", "app2", "app3" ]
data "azurerm_app_service" "outbound_ips" {
count = length(var.app_services)
name = var.app_services[count.index]
resource_group_name = var.server_resource_group_name
}
locals {
apps_outbound_ips = distinct(flatten(concat(data.azurerm_app_service.outbound_ips.*.possible_outbound_ip_address_list)))
}
You don't have to use a data source either. If you are building the resource in the same configuration, just use its exported attributes instead; in my case I use a data source because I build my apps separately.
This works flawlessly for me and produces a list of strings (each string being a unique outbound IP of the set of app services / function apps), available as local.apps_outbound_ips.
Enjoy :)

Related

How to solve for_each + "Terraform cannot predict how many instances will be created" issue?

I am trying to create a GCP project with this:
module "project-factory" {
source = "terraform-google-modules/project-factory/google"
version = "11.2.3"
name = var.project_name
random_project_id = "true"
org_id = var.organization_id
folder_id = var.folder_id
billing_account = var.billing_account
activate_apis = [
"iam.googleapis.com",
"run.googleapis.com"
]
}
After that, I am trying to create a service account, like so:
module "service_accounts" {
source = "terraform-google-modules/service-accounts/google"
version = "4.0.3"
project_id = module.project-factory.project_id
generate_keys = "true"
names = ["backend-runner"]
project_roles = [
"${module.project-factory.project_id}=>roles/cloudsql.client",
"${module.project-factory.project_id}=>roles/pubsub.publisher"
]
}
To be honest, I am fairly new to Terraform. I have read a few answers on the topic (this and this) but I am unable to understand how that would apply here.
I am getting the error:
│ Error: Invalid for_each argument
│
│ on .terraform/modules/pubsub-exporter-service-account/main.tf line 47, in resource "google_project_iam_member" "project-roles":
│ 47: for_each = local.project_roles_map_data
│ ├────────────────
│ │ local.project_roles_map_data will be known only after apply
│
│ The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the
│ -target argument to first apply only the resources that the for_each depends on.
Looking forward to learning more about Terraform through this challenge.
With only parts of the configuration visible here I'm guessing a little bit, but let's see. You mentioned that you'd like to learn more about Terraform as part of this exercise, so I'm going to go into a lot of detail about the chain here to explain why I'm recommending what I'm going to recommend, though you can skip to the end if you find this extra detail uninteresting.
We'll start with that first module's definition of its project_id output value:
output "project_id" {
value = module.project-factory.project_id
}
module.project-factory here is referring to a nested module call, so we need to look one level deeper in the nested module terraform-google-modules/project-factory/google//modules/core_project_factory:
output "project_id" {
value = module.project_services.project_id
depends_on = [
module.project_services,
google_project.main,
google_compute_shared_vpc_service_project.shared_vpc_attachment,
google_compute_shared_vpc_host_project.shared_vpc_host,
]
}
Another nested module call! 😬 That one declares its project_id like this:
output "project_id" {
description = "The GCP project you want to enable APIs on"
value = element(concat([for v in google_project_service.project_services : v.project], [var.project_id]), 0)
}
Phew! 😅 Finally an actual resource. This expression in this case seems to be taking the project attribute of a google_project_service resource instance, or potentially taking it from var.project_id if that resource was disabled in this instance of the module. Let's have a look at the google_project_service.project_services definition:
resource "google_project_service" "project_services" {
for_each = local.services
project = var.project_id
service = each.value
disable_on_destroy = var.disable_services_on_destroy
disable_dependent_services = var.disable_dependent_services
}
project here is set to var.project_id, so it seems like either way this innermost project_id output just reflects back the value of the project_id input variable, so we need to jump back up one level and look at the module call to this module to see what that was set to:
module "project_services" {
source = "../project_services"
project_id = google_project.main.project_id
activate_apis = local.activate_apis
activate_api_identities = var.activate_api_identities
disable_services_on_destroy = var.disable_services_on_destroy
disable_dependent_services = var.disable_dependent_services
}
project_id is set to the project_id attribute of google_project.main:
resource "google_project" "main" {
name = var.name
project_id = local.temp_project_id
org_id = local.project_org_id
folder_id = local.project_folder_id
billing_account = var.billing_account
auto_create_network = var.auto_create_network
labels = var.labels
}
project_id here is set to local.temp_project_id, which is declared further up in the same file:
temp_project_id = var.random_project_id ? format(
"%s-%s",
local.base_project_id,
random_id.random_project_id_suffix.hex,
) : local.base_project_id
This expression includes a reference to random_id.random_project_id_suffix.hex, and .hex is a result attribute from random_id, and so its value won't be known until apply time due to how that random_id resource type is implemented. (It generates a random value during the apply step and saves it in the state so it'll stay consistent on future runs.)
This means that (after all of this indirection) module.project-factory.project_id in your module is not a value defined statically in the configuration, and might instead be decided dynamically during the apply step. That means it's not an appropriate value to use as part of the instance key of a resource, and thus not appropriate to use as a key in a for_each map.
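To make the distinction concrete, here is a minimal sketch that is not taken from either module. The resource names are made up for illustration, and null_resource stands in for any real resource type.

```hcl
# Hypothetical reproduction; all names here are illustrative only.
resource "random_id" "suffix" {
  byte_length = 2
}

# Rejected at plan time on a fresh state: the set element, and therefore
# the instance key, is random_id.suffix.hex, which is unknown until apply.
# Remove this block to let the rest of the configuration plan.
resource "null_resource" "dynamic_key" {
  for_each = toset([random_id.suffix.hex])
}

# Accepted: the key "main" is written in the configuration, and only the
# value is unknown while planning.
resource "null_resource" "static_key" {
  for_each = { main = random_id.suffix.hex }

  triggers = {
    project = each.value
  }
}
```

The second form is fine because Terraform only needs the keys to decide how many instances exist and what they are called.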
Unfortunately the use of for_each here is hidden inside this other module terraform-google-modules/service-accounts/google, and so we'll need to have a look at that one too and see how it's making use of the project_roles input variable. First, let's look at the specific resource block the error message was talking about:
resource "google_project_iam_member" "project-roles" {
for_each = local.project_roles_map_data
project = element(
split(
"=>",
each.value.role
),
0,
)
role = element(
split(
"=>",
each.value.role
),
1,
)
member = "serviceAccount:${google_service_account.service_accounts[each.value.name].email}"
}
There's a couple somewhat-complex things going on here, but the most relevant thing for what we're looking at here is that this resource configuration is creating multiple instances based on the content of local.project_roles_map_data. Let's look at local.project_roles_map_data now:
project_roles_map_data = zipmap(
[for pair in local.name_role_pairs : "${pair[0]}-${pair[1]}"],
[for pair in local.name_role_pairs : {
name = pair[0]
role = pair[1]
}]
)
A little more complexity here that isn't super important to what we're looking for; the main thing to consider here is that this is constructing a map whose keys are built from element zero and element one of local.name_role_pairs, which is declared directly above, along with local.names that it refers to:
names = toset(var.names)
name_role_pairs = setproduct(local.names, toset(var.project_roles))
So what we've learned here is that the values in var.names and the values in var.project_roles both contribute to the keys of the for_each on that resource, which means that neither of those variable values should contain anything decided dynamically during the apply step.
However, we've also learned (above) that the project and role arguments of google_project_iam_member.project-roles are both derived from the elements of the project_roles list in your own module call: the part before the => becomes project and the part after it becomes role.
Let's return back to where we started then, with all of this extra information in mind:
module "service_accounts" {
source = "terraform-google-modules/service-accounts/google"
version = "4.0.3"
project_id = module.project-factory.project_id
generate_keys = "true"
names = ["backend-runner"]
project_roles = [
"${module.project-factory.project_id}=>roles/cloudsql.client",
"${module.project-factory.project_id}=>roles/pubsub.publisher"
]
}
We've learned that names and project_roles must both contain only static values decided in the configuration, and so it isn't appropriate to use module.project-factory.project_id because that won't be known until the random project ID has been generated during the apply step.
However, we also know that this module is expecting the prefix of each item in project_roles (the part before the =>) to be a valid project ID, so there isn't any other value that would be reasonable to use there.
Therefore we're at a bit of an impasse: this second module has a rather awkward design decision in that it's trying to derive both a local instance key and a reference to a real remote object from the same value, and those two situations have conflicting requirements. But this isn't a module you created, so you can't easily modify it to address that design quirk.
Given that, I see two possible approaches to move forward, neither ideal but both workable with some caveats:
You could take the approach the error message offered as a workaround, asking Terraform to plan and apply the resources in the first module alone first, and then plan and apply the rest on a subsequent run once the project ID is already decided and recorded in the state:
terraform apply -target=module.project-factory
terraform apply
Although it's annoying to have to do this initial create in two steps, it does at least only matter for the initial creation of this infrastructure. If you update it later then you won't need to repeat this two-step process unless you've changed the configuration in a way that requires generating a new project ID.
While working through the above we saw that this approach of generating and returning a random project ID was optional based on that first module's var.random_project_id, which you set to "true" in your configuration. Without that, the project_id output would be just a copy of your given name argument, which seems to be statically defined by reference to a root module variable.
Unless you particularly need that random suffix on your project ID, you could leave random_project_id unset and thus just get the project ID set to the same static value as your var.project_name, which should then be an acceptable value to use as a for_each key.
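A sketch of that second option, assuming the rest of the module call stays exactly as in the question and that the module behaves as traced above:

```hcl
module "project-factory" {
  source  = "terraform-google-modules/project-factory/google"
  version = "11.2.3"

  name = var.project_name
  # random_project_id is intentionally left unset here, so no random_id
  # result feeds into the project ID and it can be known while planning.

  org_id          = var.organization_id
  folder_id       = var.folder_id
  billing_account = var.billing_account

  activate_apis = [
    "iam.googleapis.com",
    "run.googleapis.com"
  ]
}
```

The trade-off is that var.project_name must then be usable as a project ID on its own, without the random suffix to make it unique.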
Ideally this second module would be designed to separate the values it's using for instance keys from the values it's using to refer to real remote objects, and thus it would be possible to use the random-suffixed name for the remote object but a statically-defined name for the local object. If this were a module under your control then I would've suggested a design change like that, but I assume the current unusual design of that third-party module (packing multiple values into a single string with a delimiter) is a compromise resulting from wanting to retain backward compatibility with an earlier iteration of the module.
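For illustration only, a module designed along those lines might take a map whose keys are static labels and whose values carry the possibly-unknown project ID. This is a hypothetical design, not the actual interface of terraform-google-modules/service-accounts/google, and the variable names are invented.

```hcl
# Hypothetical module interface; not the real service-accounts module.
variable "service_account_email" {
  type = string
}

variable "project_roles" {
  # Keys are static labels chosen by the caller; values may be unknown
  # until apply without affecting the instance keys.
  type = map(object({
    project = string
    role    = string
  }))
}

resource "google_project_iam_member" "project_roles" {
  for_each = var.project_roles

  project = each.value.project
  role    = each.value.role
  member  = "serviceAccount:${var.service_account_email}"
}
```

A caller could then pass something like cloudsql = { project = module.project-factory.project_id, role = "roles/cloudsql.client" }, where only the key "cloudsql" needs to be known at plan time.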

What is the iterator feature for in Terraform's for_each?

I am trying to understand the iterator feature of the for_each in Terraform 0.12. The docs say:
Iterator:
The iterator argument (optional) sets the name of a temporary variable that represents the current element of the complex value. If omitted, the name of the variable defaults to the label of the dynamic block ...
But I can't find any code examples that uses this feature and I can't get my head around what it is for. I have read the Terraform 0.12 preview but it is not mentioned there, and I found some GitHub issues (e.g. this one) but can't find clues there either.
Is it just for improving readability? I would really appreciate a code example and explanation that goes beyond what I can find in the docs.
Basically in various languages like Python, Ruby, C++, Javascript, Groovy, etc. you can establish a temporary variable within a lambda (especially if it is iterative) that stores the temporary value per iteration within the lambda. In some languages (e.g. Groovy), there is a default name for this variable, or you can set one yourself (i.e. default variable name in Groovy is it). For example, in Groovy we have:
strings.each() {
print it
}
would print each element of strings (assuming it can be cast to String). The following code has the exact same functionality:
strings.each() { a_string ->
print a_string
}
where we have explicitly named the temporary variable as a_string. This is analogous to the iterator argument in your question. So in Terraform, we see an example in the documentation:
resource "aws_security_group" "example" {
name = "example" # can use expressions here
dynamic "ingress" {
for_each = var.service_ports
content {
from_port = ingress.value
to_port = ingress.value
protocol = "tcp"
}
}
}
According to the documentation:
If omitted, the name of the variable defaults to the label of the dynamic block
and the name above is ingress (notice it is the label specified adjacent to the dynamic block). Sure enough, we see the name of the temporary variable above is ingress and it is being accessed via ingress.value. To utilize the functionality of iterator to rename this temporary variable, we can do something like the below.
resource "aws_security_group" "example" {
name = "example" # can use expressions here
dynamic "ingress" {
for_each = var.service_ports
iterator = service_port
content {
from_port = service_port.value
to_port = service_port.value
protocol = "tcp"
}
}
}
thus renaming the temporary variable storing the element of var.service_ports in each iteration within the lambda from default name ingress to service_port. The primary added value I see in this (and likewise when I use it in Groovy for Jenkins Pipeline libraries) is to provide a more clear name for the temporary variable storing the value to improve readability.
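As a further sketch, the iterator also exposes a .key attribute alongside .value, which becomes useful when iterating over a map. The variable below is an assumption made for illustration; it redefines var.service_ports as a map rather than the list used in the documentation example.

```hcl
# Assumed variable shape for this sketch: port name => port number.
variable "service_ports" {
  type = map(number)
  default = {
    http  = 80
    https = 443
  }
}

resource "aws_security_group" "example" {
  name = "example"

  dynamic "ingress" {
    for_each = var.service_ports
    iterator = port # an identifier, written without quotes

    content {
      description = port.key   # "http" or "https"
      from_port   = port.value # 80 or 443
      to_port     = port.value
      protocol    = "tcp"
    }
  }
}
```

Here port.key and port.value arguably read more naturally than ingress.key and ingress.value, which is the readability benefit described above.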

terraform route53 resolver setup

I've just been trying to use the new Terraform aws_route53_resolver_endpoint resource. It takes the subnet IDs as a list of nested blocks. Unfortunately there appears to be no way to populate this from a list of subnets read from an output variable from the previous step.
Basically I have a set of subnets created using count on the subnet resources in a previous step. I'm trying to use these and set up aws_route53_resolver_endpoint in each of these subnets:
resource "null_resource" "management_subnet_list" {
count = "${length(var.subnet_ids)}"
triggers {
subnet_id = "${element(data.terraform_remote_state.app_network.management_subnet_ids, count.index)}"
}
}
resource "aws_route53_resolver_endpoint" "dns_endpoint" {
name = "${var.environment_name}-${var.network_env}-dns"
direction = "OUTBOUND"
security_group_ids = ["${var.security_groups}"]
ip_address = "${null_resource.management_subnet_list.*.triggers}"
}
When run, the above results in an error: ip_address: should be a list
If I modify the code as follows:
ip_address = ["${null_resource.management_subnet_list.*.triggers}"]
I get the error: ip_address: attribute supports 2 item as a minimum, config has 1 declared
I can't seem to figure out any other way to create the resource list dynamically from a list of subnets.
Any help will be appreciated.
Per the resource reference for aws_route53_resolver_endpoint, the subnet_id in the ip_address block is a single string value.
To specify multiple subnets, you need to have multiple ip_address blocks.
Since you state that you're creating subnets with a count argument, you could potentially reference each individually with the index, like aws_subnet.main[0].id, aws_subnet.main[1].id and so on, each in its own ip_address block. (For Terraform 0.11, I think it was "${aws_subnet.main.0.id}".)
However, a better way would be to use the dynamic blocks available in Terraform 0.12+.
Dynamic blocks allow you to create repeatable nested blocks within top-level blocks. (Resource, data, provider, and provisioner blocks currently support dynamic blocks.)
A dynamic ip_address block within the aws_route53_resolver_endpoint resource could look like:
dynamic "ip_address" {
for_each = aws_subnet.main[*].id
iterator = subnet
content {
subnet_id = subnet.value
}
}
Which would result in a separate ip_address nested block for each subnet created in the aws_subnet.main resource.
The for_each argument is the complex value to iterate over. It accepts any collection or structural value, typically a list or map with one element per desired nested block.
For complete info on the dynamic nested block expression, see the Terraform documentation at: https://www.terraform.io/docs/language/expressions/dynamic-blocks.html
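Putting this together with the question's setup, where the subnet IDs come from remote state rather than from aws_subnet.main, a Terraform 0.12 sketch might look like the following. It assumes management_subnet_ids is a list output of the app_network remote state and that var.security_groups is a list of strings.

```hcl
resource "aws_route53_resolver_endpoint" "dns_endpoint" {
  name               = "${var.environment_name}-${var.network_env}-dns"
  direction          = "OUTBOUND"
  security_group_ids = var.security_groups

  # One ip_address block is generated per subnet ID. No iterator is set,
  # so the temporary variable takes the block label as its name.
  dynamic "ip_address" {
    for_each = data.terraform_remote_state.app_network.outputs.management_subnet_ids

    content {
      subnet_id = ip_address.value
    }
  }
}
```

This removes the need for the null_resource workaround, although the list still has to contain at least two subnets to satisfy the minimum reported in the error message.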

declare a variable using `execute` Interpolation in Terraform

I want to assign a substring of one variable to another variable. I tested taking a substring using terraform console.
> echo 'element(split (".", "10.250.3.0/24"), 2)' | terraform console
> 3
My subnet is 10.250.3.0/24 and I want my virtual machine to get the private IP address 10.250.3.6 within this subnet. I want this to be assigned automatically by looking at the subnet address. What I've tried:
test.tf
variable subnet {
type = "string"
default = "10.250.3.0/24"
description = "subnet mask myTestVM will live"
}
variable myTestVM_subnet {
type = "string"
default = "10.250." ${element(split(".", var.trusted_zone_onpremises_subnet), 2)} ".6"
}
And then I test it by
terraform console
>Failed to load root config module: Error parsing /home/anum/test/test.tf: At 9:25: illegal char
I guess it's just a simple syntax issue, but I couldn't figure out what!
As you've seen, you can't use interpolation in the default value of a variable in Terraform.
You can, however, use interpolation in locals and refer to those instead if you want to avoid repeating yourself anywhere.
So you could do something like this:
variable "subnet" {
type = "string"
default = "10.250.3.0/24"
description = "subnet mask myTestVM will live"
}
locals {
myTestVM_subnet = "10.250.${element(split(".", var.subnet), 2)}.6"
}
resource "aws_instance" "instance" {
...
private_ip = "${local.myTestVM_subnet}"
}
Where the aws_instance is just for demonstration and could be any resource that requires/takes an IP address.
As a better option in this specific use case though you could use the cidrhost function to generate the host address in a given subnet.
So in your case you would instead have something like this:
resource "aws_instance" "instance" {
...
private_ip = "${cidrhost(var.subnet, 6)}"
}
Which would create an AWS instance with a private IP address of 10.250.3.6. This can then make it much easier to create a whole series of machines that increment the IP address used by using the count meta parameter.
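A sketch of that idea in the same Terraform 0.11 syntax, with the instance count chosen arbitrarily for illustration and the other required arguments left out as in the examples above:

```hcl
variable "subnet" {
  type    = "string"
  default = "10.250.3.0/24"
}

resource "aws_instance" "instance" {
  count = 3

  # ami, instance_type and the other arguments are omitted here,
  # as in the examples above.

  # count.index runs 0, 1, 2, giving 10.250.3.6, 10.250.3.7 and 10.250.3.8.
  private_ip = "${cidrhost(var.subnet, 6 + count.index)}"
}
```

Changing var.subnet then moves every instance into the new range without touching the resource block.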
Terraform doesn't allow interpolation in the default of a variable declaration. So I get:
Error: variable "myTestVM_subnet": default may not contain interpolations
The syntax error finally got fixed after some head-banging, so here is what Terraform likes:
private_ip_address = "10.250.${element(split(".", "${var.subnet}"), 2)}.5"

How to approach repeatable items in Terraform

Say that I need to provision a large number of VPC subnets in Terraform. Each subnet has a CIDR, a name and an availability zone. So in other config management tools I'd do something like:
[
{
"name":"subnet1",
"cidr":"10.0.0.1/24",
"az":"us-west-1a"
},
{
"name":"subnet2",
"cidr":"10.0.0.2/24",
"az":"us-west-1b"
}
]
And then iterate over that array.
Terraform doesn't have a notion of arrays/objects as far as I can see. So, for arrays of single attributes I would just use a list item:
subnets: ["10.0.0.1/24","10.0.0.2/24"]
But that doesn't allow me to name or place the subnets where I want.
I know that I can also use multiple lists in Terraform, something like:
subnet_names: ["subnet1", "subnet2"]
subnets: ["10.0.0.1/24","10.0.0.2/24"]
subnet_az: ["us-west-1a", "us-west-1b"]
But that strikes me as messy and counter-intuitive. The last option I see is to mash everything together into an ugly list of strings, and then split them apart in Terraform:
things: ["subnet1__10.0.0.1/24__us-west-1a","subnet2__10.0.0.2/24__us-west-2a"]
But that's just ugly.
How can I deal with array/object-type of repeats in Terraform? For now I've just explicitly defined all my things, which caused a simple vpc definition to be 300 lines long :-(
As you've seen, at present Terraform doesn't support lists of structured data like you're trying to create here.
Having multiple flat lists of strings as you showed in your question is one common solution to this problem. It works, but as you've seen it's somewhat counter-intuitive to keep track of which values belong together that way.
An alternative approach that is likely to produce a more readable and maintainable result is to factor your aws_subnet resource out into a module that takes care of the elements that are always the same for all subnets. Then you can instantiate the module once per subnet, providing only the values that vary:
module "subnet1" {
source = "./subnet"
name = "subnet1"
cidr = "10.0.0.1/24"
az = "us-west-1a"
}
module "subnet2" {
source = "./subnet"
name = "subnet2"
cidr = "10.0.0.2/24"
az = "us-west-1b"
}
In many cases there's some sort of systematic relationship between AZs and CIDR blocks. If that's true for you then you can also use your module to encode these numbering rules. For example, in your subnet module:
variable "region_network_numbers" {
default = {
"us-west-1" = 0
"us-east-1" = 1
"us-west-2" = 2
}
}
variable "az_network_numbers" {
default = {
a = 1
b = 2
}
}
variable "base_cidr_block" {
default = "10.0.0.0/8"
}
variable "az" {
}
data "aws_availability_zone" "selected" {
name = "${var.az}"
}
resource "aws_subnet" "main" {
cidr_block = "${cidrsubnet(cidrsubnet(var.base_cidr_block, 8, var.region_network_numbers[data.aws_availability_zone.selected.region]), 4, var.az_network_numbers[data.aws_availability_zone.selected.name_suffix])}"
# ...
}
With this it's sufficient to provide just the az argument to the module, with the cidr and name produced systematically from the AZ name. This is the same general idea as shown in the example for the aws_availability_zone data source, and there's a more complete, elaborate example of this in the Terraform repository itself.
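For readers on newer releases: Terraform 0.12 added structured variable types, and 0.12.6 added for_each on resources, so the data structure from the question can be expressed fairly directly. The sketch below assumes a var.vpc_id supplied from elsewhere, and its CIDR values are illustrative rather than copied from the question.

```hcl
# Assumed input for this sketch: the ID of an existing VPC.
variable "vpc_id" {
  type = string
}

variable "subnets" {
  # Map key is the subnet name; each value carries the remaining settings.
  type = map(object({
    cidr = string
    az   = string
  }))
  default = {
    subnet1 = { cidr = "10.0.1.0/24", az = "us-west-1a" }
    subnet2 = { cidr = "10.0.2.0/24", az = "us-west-1b" }
  }
}

resource "aws_subnet" "main" {
  for_each = var.subnets

  vpc_id            = var.vpc_id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az

  tags = {
    Name = each.key
  }
}
```

Each subnet is then addressed by name, for example aws_subnet.main["subnet1"], so adding or removing an entry does not disturb the others.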

Resources