Using variables when specifying the location of a module in Terraform

I am trying to run this code:
locals {
  terraform_modules_git    = "git::ssh://....#vs-ssh.visualstudio.com/v3/...../terraform-modules"
  terraform_modules_module = "resource_group?ref=v15.0.0"
}

module "MyModuleCall" {
  source = "${local.terraform_modules_git}/${local.terraform_modules_module}"
}
My goal was to consolidate all tag references in one place and avoid repeating the long string with the name of the repo that holds all my modules numerous times.
And I get this error:
Error: Variables not allowed
on main.tf line 12, in module "MyModuleCall":
12: source = "${local.terraform_modules_git}/${local.terraform_modules_module}"
Variables may not be used here.
Does anybody know why they have imposed this limitation? What is wrong with using variables?
Does anybody see any workaround?

You can't dynamically generate source. You must explicitly hardcode it, as explained in the docs:
This value must be a literal string with no template sequences; arbitrary expressions are not allowed.
Sadly, I'm not aware of any workaround, except pre-processing templates before using them. The pre-processing would just find and replace the source with what you want.
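For reference, a hardcoded version of the call from the question would look something like the sketch below; the truncated URL is kept exactly as written above, and the address is simply spelled out in full instead of being assembled from locals:
module "MyModuleCall" {
  # The whole address, including the ?ref= tag, must be a literal string:
  # no variables, locals, or other template sequences are allowed here.
  source = "git::ssh://....#vs-ssh.visualstudio.com/v3/...../terraform-modules/resource_group?ref=v15.0.0"
}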

Dependencies in Terraform are handled statically before executing the program, because the runtime needs to have access to all of the involved code (in Terraform's case, both modules and providers) before it can create a runtime context in order to execute any code. This is similar to most other programming languages, where you'd typically install dependencies using a separate command like pip install or npm install or go get before you can run the main program. In Terraform's case, the dependency installer is terraform init, and "running the program" means running terraform plan or terraform apply.
For this reason, Terraform cannot and does not permit dynamically-constructed module or provider source addresses. If you need to abstract the physical location and access method of your modules from the address specified in calling modules then one option is to use a module registry to tell Terraform how to map a local address like yourcompany.example.com/yourteam/resource-group/azure to the more complex Git URL that Terraform will actually use to fetch it.
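For illustration only, a call through such a registry might look like the sketch below; the hostname and module path follow the hypothetical example above, and the version constraint is made up:
module "resource_group" {
  # Registry-style address: <HOST>/<NAMESPACE>/<NAME>/<TARGET SYSTEM>.
  # The registry at yourcompany.example.com tells Terraform which Git URL
  # (or other physical location) to actually download the module from.
  source  = "yourcompany.example.com/yourteam/resource-group/azure"
  version = "15.0.0"
}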
However, in practice most teams prefer to specify their Git URLs directly because it results in a simpler overall system, albeit at the expense of it being harder to move your modules to a new location at a later date. A compromise between these two is to use a hosted service which provides Terraform Registry services, such as Terraform Cloud, but of course that comes at the expense of introducing another possible point of failure into your overall system.

Related

Optimize retrieval of multiple git-sourced terraform modules from the same repo

We have terraform code in a git repo that references custom modules in another private repo:
module "myModule" {
source = "git::https://mygiturl//some/module"
bla bla bla...
}
When we reference multiple modules that live in the same git repo, terraform init will go and clone the same git repo repeatedly for every module reference. In the end, it takes minutes to do something that would take seconds if the same repo were not cloned repeatedly into different folders.
What options do we have for optimizing the module retrieval for speed?
The terraform init command does include an optimization where it tries to recognize when a module's package address matches one that was already installed, and if so it copies the existing content already cached on local disk rather than retrieving it over the network a second time.
In order for that to work though, all of the modules must have the same package address. The "package address" is the part of the address which tells Terraform what "package" (repository, archive) it should download, as opposed to which directory inside that package it should look in to find the module's .tf files.
If you are specifying particular subdirectories inside a single repository then you are presumably already using the Modules in Package Sub-directories syntax where the package name is separated from the subdirectory path using a pair of slashes //, giving a source address like this:
module "example" {
source = "git::https://example.com/foo/bar.git//path/to/directory"
}
In the above, the package address is git::https://example.com/foo/bar.git and the subdirectory path is path/to/directory. It's the package address portion that needs to match across multiple module calls in order for Terraform to detect this opportunity for optimization.
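For example, the two calls below (with subdirectory paths invented for illustration) share the package address git::https://example.com/foo/bar.git, so terraform init can install the second one from the copy it already downloaded for the first:
module "network" {
  source = "git::https://example.com/foo/bar.git//modules/network"
}

module "storage" {
  # Same package address as above, different subdirectory, so the
  # repository does not need to be cloned a second time.
  source = "git::https://example.com/foo/bar.git//modules/storage"
}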
Another option, if your goal is to have everything in a single repository anyway, is to use only relative paths starting with ../ and ./ in your module source addresses.
When you specify a local path, Terraform understands it as referring to another directory within the same module package as the caller, and so Terraform doesn't need to download anything else or create any local copies in order to create a unique directory for that call.
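For example (the directory name here is just illustrative):
module "network" {
  # A local path is resolved within the same module package as the caller,
  # so nothing extra is downloaded or copied during terraform init.
  source = "../modules/network"
}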
This approach does assume that you want to have everything in a single repository. If you have a hybrid approach where some modules are isolated into separate repositories but others are kept together in a large repository then that is a design pattern that Terraform's module installer is not designed to support well.
If the installer optimization isn't sufficient and you cannot use a single repository for everything then the only remaining option would be to split your modules across multiple smaller packages. A Git repository is one example of a "package", but you can also potentially add a level of indirection by adding a CI process to your repository which packages up the modules into separate packages and publishes those packages somewhere else that Terraform can install from, such as .zip files in an Amazon S3 bucket.
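For instance, a module published as a .zip object in S3 could be referenced with Terraform's s3:: source type; the bucket name and object key below are hypothetical:
module "vpc" {
  # Each published archive is its own small package, so only the modules
  # you actually call get downloaded.
  source = "s3::https://s3-eu-west-1.amazonaws.com/examplecorp-terraform-modules/vpc.zip"
}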
Terraform does not offer a way to share the same local directory between multiple module packages because modules are sometimes written in a way that causes them to modify their own source directory during execution (not a recommended pattern, but still possible) and in that case the module is likely to misbehave if multiple instances of it are trying to work in the same directory.

Why is terraform creating a resource not included with --target argument?

I often need to run Terraform on a few resources only, while ignoring other parts of its plan. There's a feature for that, the --target argument. However, quite often, when I use it, some resources that were not included with the --target argument still appear.
For example, I targeted the resources for a remote_execute in my apply command, but Terraform also included the creation of some NICs and VMs that don't exist yet. Those are part of the configuration, but I just don't want to (and can't) create them right now.
Why (and how) do they "sneak in" to the plan? And is there a way to prevent it?
The -target argument instructs Terraform to include the specified objects and anything they depend on, because applying a resource without its dependencies would violate those dependency relationships.
This includes explicit dependencies written with depends_on, implicit dependencies created just by referring to another object, and some special dependencies that Terraform generates for itself, such as the dependency between a resource and its associated provider configuration.
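As a small self-contained sketch (using the null and random providers purely for illustration, not the resources from the question), targeting only the last resource still brings the other two into the plan:
resource "random_pet" "name" {}

resource "null_resource" "vm" {
  # Implicit dependency: referencing random_pet.name makes this resource
  # depend on it.
  triggers = {
    name = random_pet.name.id
  }
}

resource "null_resource" "remote_exec" {
  # Explicit dependency via depends_on.
  depends_on = [null_resource.vm]
}
Running terraform apply -target=null_resource.remote_exec would therefore still plan null_resource.vm and, transitively, random_pet.name.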

Organising folders in terraform

I have a Terraform setup in which I am creating resources in AWS; I am using S3, EC2 and also Kubernetes. For Kubernetes I have more than 5 .tf files, so I have created a folder called kube-aws and placed those .tf files there. Right now I have a setup like below:
scripts/
|-- s3.tf
|-- ec2.tf
|-- kube-aws/
    |-- web-deploy.tf
    |-- web-service.tf
    |-- app-deploy.tf
    |-- app-service.tf
Is this the right approach? Will Terraform pick up the .tf files from the kube-aws folder as well, or should I do anything else to make this work?
The resources in the kube-aws directory will not be included when scripts is your working directory. The scripts directory is considered the root module in this instance (see the Modules documentation):
The .tf files in your working directory when you run terraform plan or terraform apply together form the root module.
You have two options to include kube-aws resources:
1. Move them up to the scripts directory.
2. Create a module block in one of the scripts/*.tf files and pass in the required variables.
For example, in, say, s3.tf:
module "kube_aws" {
source = "./kube-aws"
// pass in your variables here
}
The choice you make is entirely up to you, but the guidance on when to write a module is pretty persuasive:
We do not recommend writing modules that are just thin wrappers around single other resource types. If you have trouble finding a name for your module that isn't the same as the main resource type inside it, that may be a sign that your module is not creating any new abstraction and so the module is adding unnecessary complexity. Just use the resource type directly in the calling module instead.
I would recommend option 1 above, i.e. move your .tf files into a single directory, at least until you are clear about when to use a module and how best to structure one. I would also highly recommend getting acquainted with the official (and excellent) documentation on Modules and Module Composition, as well as looking at example modules in the Terraform Registry and their associated source code (links to the source can be found on module pages).

Is it possible to "variableize" the source path for a module in Terraform?

Is it possible to use a variable for the module source in Terraform or Terragrunt? With Terragrunt I know we can override the module source to a local directory, but it does not seem to allow us to use a different repo.
The use case is to support a development repository and a live repository. Developers will use a different repository for development of modules than will be used for the production/live deployments.
I am familiar with using the Terragrunt approach to separate environments. We can go that route, e.g. the configurations in the live folders would point to one repo and the configurations in the dev/qa folders point to another repo.
Code Snippet:
module "s3_module" {
source = "${var.source_url}"
bucket_name = "thereoncewasakingguardinghisgardenallalone"
}
Error:
Error downloading modules: Error loading modules: error downloading 'file:///home/vagrant/code/Terraform/Examples/Lab-US-West-1/${var.source_url}': source path error: stat /home/vagrant/code/Terraform/Examples/Lab-US-West-1/${var.source_url}: no such file or directory
Terraform does not allow varying module sources like this because module installation happens during terraform init and thus must make all of its decisions statically before evaluating the main code, similarly to how dependency installation works in many other languages.
A different way to meet your goal of giving the production automation a different view of modules than other callers (such as developers) is to use Terraform's native module registry mechanism and its associated service discovery protocol.
To do this requires running a service that implements the registry protocol, which is essentially just an extra level of indirection over module sources that allows them to be decided by the remote server rather than hard-coded in the configuration. If you have your module registry running at terraform.example.com then your module source strings might look something like this:
module "s3_module" {
source = "terraform.example.com/any-namespace/s3/aws"
bucket_name = "thereoncewasakingguardinghisgardenallalone"
}
The registry protocol can return module source addresses of any type that Terraform supports, including git:: for git repositories. Therefore you could set up the registry so that the above module address leads to a normal Git repository so that it's convenient for developers.
By default, Terraform will use its service discovery protocol to find the location of the registry API for terraform.example.com. You should set up the main service discovery document to refer to the registry that would be used outside of production, in order to avoid the need for manual configuration on each developer's system.
In your production system -- where presumably Terraform is running in some sort of automation -- you can use a CLI Configuration setting to override the discovery of terraform.example.com to point to a different registry API that is more appropriate for your production environment:
# Note that this goes in the _CLI configuration_, which is *not* the
# same thing as the .tf files that describe your infrastructure.
host "terraform.example.com" {
services = {
"modules.v1" = "https://production-terraform.example.com/modules/"
}
}
With that CLI configuration in place, Terraform will interpret terraform.example.com differently and use this other registry API instead, where you can potentially arrange for it to select only packaged modules in an AWS S3 bucket, or whatever other constraint seems appropriate for production.
Terraform Cloud and Enterprise have a built-in private module registry, but you can deploy anything that talks the same protocol on your own infrastructure and use it instead if you set up the service discovery protocol correctly. HashiCorp doesn't have an official private registry you can run yourself, but there are some community implementations of the protocol and the most important parts of the protocol (listing available versions and finding a download URL) are simple enough that they can be backed by a static website served from AWS S3, or similar.

How to create a file with timestamp in `terraform`?

I use the configuration below to generate a filename with a timestamp, which will be used in many different places.
variable "s3-key" {
default = "deploy-${timestamp()}.zip"
}
but got an Error: Function calls not allowed error. How can I use a timestamp in a variable?
Variable defaults in particular are constant values, but local values allow for arbitrary expressions derived from variables:
variable "override_s3_key" {
default = ""
}
locals {
s3_key = var.override_s3_key != "" ? var.override_s3_key : "deploy-${timestamp()}.zip"
}
You can then use local.s3_key elsewhere in the configuration to access this derived value.
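For example, you might reference it when uploading the artifact; the aws_s3_object resource, bucket name, and local file path below are purely illustrative:
resource "aws_s3_object" "deploy_package" {
  bucket = "example-deploy-artifacts"         # hypothetical bucket
  key    = local.s3_key
  source = "${path.module}/build/deploy.zip"  # hypothetical local artifact
}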
With that said, Terraform is intended for creating long-running infrastructure objects, so including timestamps is often (but not always!) indicative of a design problem. In this particular case, it looks like you are using Terraform to create application artifacts for deployment, which is something Terraform can do, but Terraform is often not the best tool for that job.
Instead, consider splitting your build and deploy into two separate steps, where the build step is implemented using any separate tool of your choice -- possibly even just a shell script -- and produces a versioned (or timestamped) artifact in S3. Then you can parameterize your Terraform configuration with that version or timestamp to implement the "deploy" step:
variable "artifact_version" {}
locals {
artifact_s3_key = "deploy-${var.artifact_version}.zip"
}
An advantage of this separation is that by separating the versioned artifacts from the long-lived Terraform objects you will by default retain the historical artifacts, and so if you deploy and find a problem you can choose to switch back to a known-good existing artifact by just re-running the deploy step (Terraform) with an older artifact version. If you instead manage the artifacts directly with Terraform, Terraform will delete your old artifact before creating a new one, because that's Terraform's intended usage model.
There's more detail on this model in the HashiCorp guide Serverless Applications with AWS Lambda and API Gateway. You didn't say that the .zip file here is destined for Lambda, but a similar principle applies for any versioned artifact. This is analogous to the workflow for other deployment models, such as building a separate Docker image or AMI for each release; in each case, Terraform is better employed for the process of selecting an existing artifact built by some other tool rather than for creating those artifacts itself.
