Terragrunt and common variables - terraform

I'm trying to do something fairly simple, but can't seem to get my head around it. I have the following structure:
- terragrunt.hcl
-----dummy/
---------main.tf
---------terragrunt.hcl
I'm looking to set some common variables at the root level and use them in main.tf. How would I go about declaring the variables at the root terragrunt level and having them available downstream?
I've tried setting them as inputs in the root, but then have to explicitly declare "variables" at the dummy level for the inputs to get picked up. I'm looking to somehow define these things at the root level and not repeat variable declarations at dummy/ level. Is this doable?

You can indeed do this. It is documented here:
https://terragrunt.gruntwork.io/docs/reference/built-in-functions/#read_terragrunt_config
You can merge all inputs defined in some file above any module.
From the docs:
read_terragrunt_config(config_path, [default_val]) parses the terragrunt config at the given path and serializes the result into a map that can be used to reference the values of the parsed config. This function will expose all blocks and attributes of a terragrunt config.
For example, suppose you had a config file called common.hcl that contains common input variables:
inputs = {
  stack_name = "staging"
  account_id = "1234567890"
}
You can read these inputs in another config by using read_terragrunt_config, and merge them into the inputs:
locals {
  common_vars = read_terragrunt_config(find_in_parent_folders("common.hcl"))
}
inputs = merge(
  local.common_vars.inputs,
  {
    # additional inputs
  }
)
This function also takes in an optional second parameter which will be returned if the file does not exist:
locals {
  common_vars = read_terragrunt_config(find_in_parent_folders("i-dont-exist.hcl", "i-dont-exist.hcl"), {inputs = {}})
}
inputs = merge(
  local.common_vars.inputs, # This will be {}
  {
    # additional inputs
  }
)
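Note that merging inputs this way does not remove the need for variable declarations in the Terraform module itself: Terragrunt hands inputs to Terraform as TF_VAR_ environment variables, and Terraform only accepts values for variables the module declares. A minimal sketch of what dummy/main.tf might contain, assuming the stack_name and account_id inputs from the common.hcl example above (the output name is hypothetical):

```hcl
# dummy/main.tf (sketch): each input that Terragrunt passes in still needs
# a matching variable block in the Terraform module.
variable "stack_name" {
  type = string
}

variable "account_id" {
  type = string
}

# Hypothetical output, only to show the values being used.
output "stack_label" {
  value = "${var.stack_name}-${var.account_id}"
}
```

So the values can live in one place at the root, but the declarations stay with the module that consumes them.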

Per the Terragrunt documentation: "Currently you can only reference locals defined in the same config file. Terragrunt does not automatically include locals defined in the parent config of an include block into the current context."
However, one way you can do this is as follows:
Create a file containing the common variables (e.g. myvars.hcl)
Load that in the child terragrunt:
locals {
  myvars = read_terragrunt_config(find_in_parent_folders("myvars.hcl"))
  foo    = local.myvars.locals.foo
}
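For this to work, myvars.hcl has to define foo inside its own locals block, since read_terragrunt_config exposes the parsed blocks of that file. A minimal sketch of both files, where the value of foo and the input name are placeholders:

```hcl
# myvars.hcl, placed in a parent folder (sketch)
locals {
  foo = "bar" # placeholder value
}

# dummy/terragrunt.hcl (sketch): read the parent file and forward the value
locals {
  myvars = read_terragrunt_config(find_in_parent_folders("myvars.hcl"))
  foo    = local.myvars.locals.foo
}

inputs = {
  foo = local.foo
}
```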
Hope that helps!

Other tools like Ansible have a directory hierarchy in which a child can refer to, or override, the value of a variable set at the parent level.
Terraform does not have such a mechanism: each directory containing .tf files is a separate Terraform module, so the directory hierarchy cannot be used to pass, inherit, or reference Terraform variables.
Perhaps it is better to let go of the idea of "downstream or upstream".
One way to define common variables and share them among other modules is a data-only module. An extension of this, which makes the common variables available everywhere, is to publish that module to a Terraform registry, although that is not the registry's intended use.
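As a rough illustration of the data-only module idea, assuming a hypothetical modules/common directory next to dummy/ and reusing example values from earlier in this thread:

```hcl
# modules/common/outputs.tf (hypothetical path): a data-only module
# declares no resources, only outputs holding the shared values.
output "stack_name" {
  value = "staging"
}

output "account_id" {
  value = "1234567890"
}

# dummy/main.tf: any module that needs the values calls the common module
# and reads its outputs.
module "common" {
  source = "../modules/common"
}

locals {
  stack_name = module.common.stack_name
}
```

Each consumer still has to call the module explicitly, but the values themselves are defined only once.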

Related

Does Terraform locals visibility scope span children modules?

I've found that I can access a local coming from my root Terraform module in its children Terraform modules.
I thought that a local is scoped to the very module it's declared in.
See: https://developer.hashicorp.com/terraform/language/values/locals#using-local-values
A local value can only be accessed in expressions within the module where it was declared.
Seems like the documentation says locals shouldn't be visible outside their module. At my current level of Terraform knowledge I can't foresee what may be wrong with seeing locals of a root module in its children.
Does Terraform locals visibility scope span children (called) modules?
Why is that?
Is it intentional (by design) that a root local is visible in children modules?
Details added later:
Terraform version I use 1.1.5
My sample project:
.
├── childmodulecaller.tf
├── main.tf
└── child
    └── some.tf
main.tf
locals {
  a = 1
}
childmodulecaller.tf
locals {
  b = 2
}
module "child" {
  for_each = toset(try(local.a + local.b == 3, false) ? ["name"] : [])
  source   = "./child"
}
some.tf
resource "local_file" "a_file" {
  filename = "${path.module}/file1"
  content  = "foo!"
}
Now I see that my question was based on a wrongly interpreted observation.
Not sure if it is still of any value but leaving it explained.
Perhaps it can help someone else to understand the same and avoid the confusion I experienced and explained in my corrected answer.
Each module has an entirely distinct namespace from others in the configuration.
The only way values can pass from one module to another is using input variables (from caller to callee) or output values (from callee to caller).
Local values from one module are never automatically visible in another module.
EDIT: Corrected answer
After reviewing my sample Terraform project code I see that my finding was wrong. The local a from main.tf that I access in childmodulecaller.tf is actually accessed in a module block, but still within the scope of my root module (I understand that this is because childmodulecaller.tf sits directly in the root module's configuration directory). So I confused a module block in the calling parent with the child module being called.
Experiments such as changing child/some.tf in the following way:
resource "local_file" "a_file" {
  filename = "${path.module}/file1"
  content  = "foo!"
}
output "outa" {
  value = local.a
}
output "outb" {
  value = local.b
}
cause terraform validate to fail with "Error: Reference to undeclared local value" (similar to what Mark B already mentioned in the question comments for Terraform version 1.3.0).
So no, the scope of Terraform locals doesn't span child (called) modules.
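If the child module does need the value, it has to be passed in explicitly. A minimal sketch based on the sample project above, assuming a variable named a is added to the child module (that variable is not part of the original project):

```hcl
# Root module: pass the local explicitly as an input variable.
locals {
  a = 1
}

module "child" {
  source = "./child"
  a      = local.a
}

# child/some.tf: declare the variable. var.a is in scope here, local.a is not.
variable "a" {
  type = number
}

output "outa" {
  value = var.a
}
```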
Initial wrong answer:
I think I've understood why locals are visible in children modules.
It's because children (called) modules are included into the configuration of root (parent) module.
To call a module means to include the contents of that module into the configuration with specific values for its input variables.
https://developer.hashicorp.com/terraform/language/modules/syntax#calling-a-child-module
So yes, it's by design and not a bug. It just may not be clear from the locals documentation: the root (parent) module's locals are visible in the child module's parts of the configuration, which are essentially also parts of the root (parent) module because they are included into it.

Locals depends_on - Terraform

I have a module a in Terraform which creates a text file. I need to use that text file in another module b, and I am using locals to pull in the content of that text file, like below in module b:
locals {
  ports = split("\n", file("ports.txt"))
}
But Terraform expects this file to be present from the start, and throws the error below:
Invalid value for "path" parameter: no file exists at
path/ports.txt; this function works only with files
that are distributed as part of the configuration source code, so if this file
will be created by a resource in this configuration you must instead obtain
this result from an attribute of that resource.
What am I missing here? Any help would be appreciated. Is there a depends_on for locals? How can I make this work?
Modules are called from within other modules using module blocks. Most arguments correspond to input variables defined by the module. To reference the value from one module, you need to declare the output in that module, then you can call the output value from other modules.
For example, I suppose you have a text file in module a.
.tf file in module a
output "textfile" {
  value = file("D:\\Terraform\\modules\\a\\ports.txt")
}
.tf file in module b
variable "externalFile" {
}
locals {
  ports = split("\n", var.externalFile)
}
# output "b_test" {
#   value = local.ports
# }
.tf file in the root module
module "a" {
  source = "./modules/a"
}
module "b" {
  source       = "./modules/b"
  externalFile = module.a.textfile
  depends_on   = [module.a]
}
# output "module_b_output" {
#   value = module.b.b_test
# }
For more reference, you could read https://www.terraform.io/docs/language/modules/syntax.html#accessing-module-output-values
As the error message reports, the file function is only for files that are included on disk as part of your configuration, not for files generated dynamically during the apply phase.
I would typically suggest avoiding writing files to local disk as part of a Terraform configuration, because one of Terraform's main assumptions is that any objects you manage with Terraform will persist from one run to the next, but that could only be true for a local file if you always run Terraform in the same directory on the same computer, or if you use some other more complex approach such as a network filesystem. However, since you didn't mention why you are writing a file to disk I'll assume that this is a hard requirement and make a suggestion about how to do it, even though I would consider it a last resort.
The hashicorp/local provider includes a data source called local_file which will read a file from disk in a similar way to how a more typical data source might read from a remote API endpoint. In particular, it will respect any dependencies reflected in its configuration and defer reading the file until the apply step if needed.
You could coordinate this between modules then by making the output value which returns the filename also depend on whichever resource is responsible for creating the file. For example, if the file were created using a provisioner attached to an aws_instance resource then you could write something like this inside the module:
output "filename" {
  value      = "D:\\Terraform\\modules\\a\\ports.txt"
  depends_on = [aws_instance.example]
}
Then you can pass that value from one module to the other, which will carry with it the implicit dependency on aws_instance.example to make sure the file is actually created first:
module "a" {
  source = "./modules/a"
}
module "b" {
  source   = "./modules/b"
  filename = module.a.filename
}
Then finally, inside the second module, declare that input variable and use it as part of the configuration for a local_file data resource:
variable "filename" {
  type = string
}
data "local_file" "example" {
  filename = var.filename
}
Elsewhere in your second module you can then use data.local_file.example.content to get the contents of that file.
Notice that dependencies propagate automatically aside from the explicit depends_on in the output "filename" block. It's a good practice for a module to encapsulate its own behaviors so that everything needed for an output value to be useful has already happened by the time a caller uses it, because then the rest of your configuration will just get the correct behavior by default without needing any additional depends_on annotations.
But if there is any way you can return the data inside that ports.txt file directly from the first module instead, without writing it to disk at all, I would recommend doing that as a more robust and less complex approach.
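A minimal sketch of that last suggestion, assuming module a can produce the port list as a value. The local named ports and its contents are placeholders for however module a actually derives the data:

```hcl
# modules/a: expose the port list as an output instead of writing ports.txt.
locals {
  ports = ["80", "443"] # placeholder values
}

output "ports" {
  value = local.ports
}

# modules/b: accept the list as an input variable.
variable "ports" {
  type = list(string)
}

# Root module: wire the output of a into b. The reference itself creates
# the dependency, so no depends_on is needed.
module "a" {
  source = "./modules/a"
}

module "b" {
  source = "./modules/b"
  ports  = module.a.ports
}
```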

Terraform - why this is not causing circular dependency?

Terraform registry AWS VPC example terraform-aws-vpc/examples/complete-vpc/main.tf has the code below which seems to me a circular dependency.
data "aws_security_group" "default" {
  name   = "default"
  vpc_id = module.vpc.vpc_id
}
module "vpc" {
  source = "../../"
  name   = "complete-example"
  ...
  # VPC endpoint for SSM
  enable_ssm_endpoint              = true
  ssm_endpoint_private_dns_enabled = true
  ssm_endpoint_security_group_ids  = [data.aws_security_group.default.id] # <-----
  ...
data.aws_security_group.default refers to "module.vpc.vpc_id" and module.vpc refers to "data.aws_security_group.default.id".
Please explain why this does not cause an error and how come module.vpc can refer to data.aws_security_group.default.id?
In the Terraform language, a module creates a separate namespace but it is not a node in the dependency graph. Instead, each of the module's Input Variables and Output Values are separate nodes in the dependency graph.
For that reason, this configuration contains the following dependencies:
The data.aws_security_group.default resource depends on module.vpc.vpc_id, which is specifically the output "vpc_id" block in that module, not the module as a whole.
The vpc module's variable "ssm_endpoint_security_group_ids" depends on the data.aws_security_group.default resource.
We can't see the inside of the vpc module in your question here, but the above is okay as long as there is no dependency connection between output "vpc_id" and variable "ssm_endpoint_security_group_ids" inside the module.
I'm assuming that such a connection does not exist, and so the evaluation order of objects here would be something like this:
1. aws_vpc.example in module.vpc is created (I just made up a name for this because it's not included in your question)
2. The output "vpc_id" in module.vpc is evaluated, referring to module.vpc.aws_vpc.example, and producing module.vpc.vpc_id.
3. data.aws_security_group.default in the root module is read, using the value of module.vpc.vpc_id.
4. The variable "ssm_endpoint_security_group_ids" for module.vpc is evaluated, referring to data.aws_security_group.default.
5. aws_vpc_endpoint.example in module.vpc is created, including a reference to var.ssm_endpoint_security_group_ids.
Notice that in all of the above I'm talking about objects in modules, not modules themselves. The modules serve only to create separate namespaces for objects, and then the separate objects themselves (which includes individual variable and output blocks) are what participate in the dependency graph.
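To make the separate graph nodes concrete, here is a hypothetical sketch of what the inside of such a vpc module could look like. The resource names follow the made-up names used above, and the CIDR block and service name are placeholders, not taken from the real module:

```hcl
# Hypothetical contents of the vpc module (resource names are made up).
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16" # placeholder
}

# Depends only on aws_vpc.example.
output "vpc_id" {
  value = aws_vpc.example.id
}

variable "ssm_endpoint_security_group_ids" {
  type = list(string)
}

# Depends on the variable, and therefore on the data source in the caller.
resource "aws_vpc_endpoint" "example" {
  vpc_id             = aws_vpc.example.id
  service_name       = "com.amazonaws.us-east-1.ssm" # placeholder region
  vpc_endpoint_type  = "Interface"
  security_group_ids = var.ssm_endpoint_security_group_ids
}
```

In this sketch nothing on the path to output "vpc_id" refers to var.ssm_endpoint_security_group_ids, which is why the graph has no cycle.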
Normally this design detail isn't visible: Terraform normally just uses it to potentially optimize concurrency by beginning work on part of a module before the whole module is ready to process. In some interesting cases like this though, you can also intentionally exploit this design so that an operation for the calling module can be explicitly sandwiched between two operations for the child module.
Another reason why we might make use of this capability is when two modules naturally depend on one another, such as in an experimental module I built that hides some of the tricky details of setting up VPC peering connections:
locals {
  vpc_nets = {
    us-west-2 = module.vpc_usw2
    us-east-1 = module.vpc_use1
  }
}
module "peering_usw2" {
  source = "../../modules/peering-mesh"
  region_vpc_networks = local.vpc_nets
  other_region_connections = {
    us-east-1 = module.peering_use1.outgoing_connection_ids
  }
  providers = {
    aws = aws.usw2
  }
}
module "peering_use1" {
  source = "../../modules/peering-mesh"
  region_vpc_networks = local.vpc_nets
  other_region_connections = {
    us-west-2 = module.peering_usw2.outgoing_connection_ids
  }
  providers = {
    aws = aws.use1
  }
}
(the above is just a relevant snippet from an example in the module repository.)
In the above case, the peering-mesh module is carefully designed to allow this mutual referencing, internally deciding for each pair of regional VPCs which one will be the peering initiator and which one will be the peering accepter. The outgoing_connection_ids output refers only to the aws_vpc_peering_connection resource and the aws_vpc_peering_connection_accepter refers only to var.other_region_connections, and so the result is a bunch of concurrent operations to create aws_vpc_peering_connection resources, followed by a bunch of concurrent operations to create aws_vpc_peering_connection_accepter resources.

setting value of variable terraform in tfvars file for nested structure

terraform has adjusted its authorization
in main.tf [for sql config] I now have:
resource "google_sql_database_instance" "master" {
  name             = "${random_id.id.hex}-master"
  region           = "${var.region}"
  database_version = "POSTGRES_9_6"
  # allow direct access from work machines
  ip_configuration {
    authorized_networks = "${var.authorized_networks}"
    require_ssl         = "${var.sql_require_ssl}"
    ipv4_enabled        = true
  }
}
where
in variables.tf I have
variable "authorized_networks" {
  description = "The networks that can connect to cloudsql"
  type        = "list"
  default = [
    {
      name  = "work"
      value = "xxx.xxx.xx.xxx/32"
    }
  ]
}
where xxx.xxx.xx.xxx is the ip address I would like to allow. However, I prefer not to put this in my variables.tf file, but rather in a non-source controlled .tfvars file.
For variables that have a simple value this is easy, but it is not clear to me how to do it with the nested structure. Replacing xxx.xxx.xx.xxx with a variable [e.g. var.work_ip] leads to an error:
variables may not be used here
any insights?
If you omit the default argument in your main configuration altogether, you will mark variable "authorized_networks" as a required input variable, which Terraform will then check to ensure that it is set by the caller.
If this is a root module variable, then you can provide the value for it in a .tfvars file using the following syntax:
authorized_networks = [
  {
    name  = "work"
    value = "xxx.xxx.xx.xxx/32"
  }
]
If this file is being generated programmatically by some wrapping automation around Terraform, you can also write it into a .tfvars.json file and use JSON syntax, which is often easier to construct robustly in other languages:
{
  "authorized_networks": [
    {
      "name": "work",
      "value": "xxx.xxx.xx.xxx/32"
    }
  ]
}
You can either specify this file explicitly on the command line using the -var-file option, or you can give it a name ending in .auto.tfvars or .auto.tfvars.json in the current working directory when you run Terraform and Terraform will then find and load it automatically.
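For reference, a sketch of what the declaration in variables.tf could look like once the default is removed. The object type constraint assumes Terraform 0.12 or later, whereas the question uses the older type = "list" form:

```hcl
# variables.tf: no default, so the caller must supply a value,
# for example from a .tfvars file kept out of version control.
variable "authorized_networks" {
  description = "The networks that can connect to cloudsql"
  type = list(object({
    name  = string
    value = string
  }))
}
```

With this in place, Terraform reports an error if no value is supplied, rather than silently falling back to a checked-in address.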
A common reason to keep something out of version control is because it's a dynamic setting configured elsewhere in the broader system rather than a value fixed in version control. If that is true here, then an alternative strategy is to save that setting in a configuration data store that Terraform is able to access via data sources and then write your Terraform configuration to retrieve that setting directly from the place where it is published.
For example, if the network you are modelling here were a Google Cloud Platform subnetwork, and it has either a fixed name or one that can be derived systematically in Terraform, you could retrieve this setting using the google_compute_subnetwork data source:
data "google_compute_subnetwork" "work" {
  name = "work"
}
Elsewhere in configuration, you can then use data.google_compute_subnetwork.work.ip_cidr_range to access the CIDR block definition for this network.
The major Terraform providers have a wide variety of data sources like this, including ones that retrieve specific first-class objects from the target platform and also more generic ones that access configuration stores like AWS Systems Manager Parameter Store or HashiCorp Consul. Accessing the necessary information directly or publishing it "online" in a configuration store can be helpful in a larger system to efficiently integrate subsystems.

define keystone_user from openstack/puppet-keystone via hiera?

I am using https://github.com/openstack/puppet-keystone to set up an OpenStack management/controller node. I need to add the 'glance' user to keystone. I want to try and do as much as I can in my hiera data so my manifest will be simple.
Here is my manifest:
class kilo2_keystone {
  include controller_ceph
  include keystone
  include keystone::config
  include keystone::user
  # keystone_user { 'glance':
  #   ensure => present,
  # }
}
The commented out section works, but I want to be able to do include keystone::user and supply the parameters in my hiera data like so:
keystone::user:
  "%{hiera('glance_admin_user')}":
    ensure: present
But when I run puppet agent -t on my node I get this error:
Could not find class ::keystone::user
The commented-out code declares a resource of type keystone_user, not a class. Presumably its type, keystone_user, is provided by the puppet-keystone module. The include() family of functions are for declaring classes, not resources, so they are inapplicable to keystone_user.
There is more than one way you could proceed. If you don't anticipate wanting to do anything more complicated than declaring one or more keystone_users present, then I'd recommend giving your class a parameter for the user name(s), to which you can assign a value via Hiera:
class kilo2_keystone($usernames = []) {
  include controller_ceph
  include keystone
  include keystone::config
  keystone_user { $usernames:
    ensure => present,
  }
}
On the other hand, if you want to be able to declare multiple users, each with its own set of attributes, then the create_resources() function is probably the path of least resistance. You still want to parameterize your class so that it gets the data from Hiera via automated data binding, but now you want the data to be structured differently, as described in the create_resources() docs: as a hash mapping resource titles (usernames, in your case) to inner hashes of resource parameters to corresponding values.
For example, your class might look like this:
class kilo2_keystone($userdata = {}) {
  include controller_ceph
  include keystone
  include keystone::config
  create_resources('keystone_user', $userdata)
}
The corresponding data for this class might look like this:
kilo2_keystone::userdata:
  glance:
    ensure: present
    enabled: true
  another_user:
    ensure: absent
Note also that you are placing your kilo2_keystone class in the top scope. You really ought to put it in a module and assign it to that module's namespace. The latter would look like this:
class mymodule::kilo2_keystone($userdata = {}) {
  # ...
}

Resources