How to specify Terraform remote state backend config via CLI - terraform

In Terraform you can specify a remote state data source to access output values from another Terraform configuration.
This requires a backend configuration, as follows:
data "terraform_remote_state" "vpc" {
backend = "remote"
config = {
organization = "hashicorp"
workspaces = {
name = "vpc-prod"
}
}
}
We do not commit the backend configuration in the code; it is specified via CLI parameters. Is it possible to also specify the remote state backend configuration via the CLI?

The config of a terraform_remote_state data source can take variables like any other vanilla Terraform block. You can then override them via the CLI in the normal way.
data "terraform_remote_state" "vpc" {
backend = "remote"
config = {
organization = var.org_name
workspaces = {
name = var.ws_name
}
}
}
terraform apply -var="org_name=hashicorp" -var="ws_name=vpc-prod"

Yes, it is possible to specify the remote state backend configuration via the CLI with Partial Configuration.
You do not need to specify every required argument in the backend
configuration. Omitting certain arguments may be desirable if some
arguments are provided automatically by an automation script running
Terraform. When some or all of the arguments are omitted, we call this
a partial configuration.
With a partial configuration, the remaining configuration arguments
must be provided as part of the initialization process.
Please check the Command-line key/value pairs section of that document for how to pass backend configuration values via the CLI.
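For instance, for the remote backend the organization and workspace can be supplied at init time roughly like this (a sketch; the workspaces attribute syntax matches the partial-configuration answer further down this page):
terraform init \
  -backend-config="organization=hashicorp" \
  -backend-config='workspaces=[{name="vpc-prod"}]'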

Related

Running 'terragrunt apply' on an EC2 Instance housed in a No Internet Environment

I have been trying to set up my Terragrunt EC2 environment in a no/very limited internet setting.
Current Setup:
An AWS network firewall that whitelists domains to allow traffic; most internet traffic is blocked except for a few domains
An EC2 instance where I run the terragrunt code; it has an instance profile that can assume the role in the providers block
VPC endpoints set up for sts, s3, dynamodb, codeartifact, etc.
All credentials (assumed role, etc.) work and have been verified
Remote State and Providers File
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "***"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "ap-southeast-1"
    encrypt        = true
    dynamodb_table = "***"
  }
}
# Dynamically changes the role depending on which account is being modified
generate "providers" {
  path      = "providers.tf"
  if_exists = "overwrite"
  contents  = <<EOF
provider "aws" {
  region = "${local.env_vars.locals.aws_region}"

  assume_role {
    role_arn = "arn:aws:iam::$***"
  }

  endpoints {
    sts      = "https://sts.ap-southeast-1.amazonaws.com"
    s3       = "https://s3.ap-southeast-1.amazonaws.com"
    dynamodb = "https://dynamodb.ap-southeast-1.amazonaws.com"
  }
}
EOF
}
With Internet (Turning off the firewall):
I am able to run all the terragrunt commands
Without Internet
I only allow "registry.terraform.io" to pass the firewall
I am able to assume the role listed in providers via aws sts assume-role, and I can list the tables in dynamodb and files in the s3 bucket
I am able to run terragrunt init on my EC2 instance with the instance profile, so I assume terragrunt does use the correct sts_endpoint
However, when I run terragrunt apply, it hangs at the stage `DEBU[0022] Running command: terraform plan prefix=[***]`
In my CloudTrail I do see that Terragrunt has assumed the username aws-go-sdk-1660077597688447480 for the event GetCallerIdentity, so I think the provider is able to assume the role that was declared in the providers block
I tried adding custom endpoints for sts, s3, and dynamodb, but it still hangs.
I suspect that terraform is still trying to use the internet when making the AWS SDK calls, which leads to terragrunt apply being stuck.
Is there a comprehensive list of endpoints I need to custom add, or a list of domains I should whitelist to be able to run terragrunt apply?
I set the environment variable TF_LOG to debug, and besides the registry.terraform.io domain, I was able to gather these:
github.com
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/go-tfe v1.0.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/hcl/v2 v2.12.0
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-config-inspect v0.0.0-20210209133302-4fd17a0faac2
2022-08-18T15:33:03.106-0600 [DEBUG] using github.com/hashicorp/terraform-svchost v0.0.0-20200729002733-f050f53b9734
sts.region.amazonaws.com
resource.region.amazonaws.com
You'll want to add those domains to your whitelist in the firewall settings. Something like *.region.amazonaws.com should do the trick; of course, you can be more restrictive and, rather than use a wildcard, specify the exact resource.
For reference: https://docs.aws.amazon.com/general/latest/gr/rande.html
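As a concrete illustration of the more restrictive option, an allow-list for this setup in ap-southeast-1 might contain entries like the following (a sketch, not an exhaustive list; any other AWS services your modules manage would need their endpoints added too):
sts.ap-southeast-1.amazonaws.com
s3.ap-southeast-1.amazonaws.com
dynamodb.ap-southeast-1.amazonaws.com
registry.terraform.io
github.com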

Terraform cloud config dynamic workspace name

I'm building CI/CD pipeline using GitHub Actions and Terraform. I have a main.tf file like below, which I'm calling from GitHub action for multiple environments. I'm using https://github.com/hashicorp/setup-terraform to interact with Terraform in GitHub actions. I have MyService component and I'm deploying to DEV, UAT and PROD environments. I would like to reuse main.tf for all of the environments and dynamically set workspace name like so: MyService-DEV, MyService-UAT, MyService-PROD. Usage of variables is not allowed in the terraform/cloud block. I'm using HashiCorp cloud to store state.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 2.0"
    }
  }

  cloud {
    organization = "tf-organization"

    workspaces {
      name = "MyService-${env.envname}" # <== not allowed to use variables
    }
  }
}
Update
I finally managed to get this up and running with helpful comments. Here are my findings:
TF_WORKSPACE needs to be defined upfront like: service-dev
I didn't get tags to work the way I want when running in automation. If I define a tag in cloud.workspaces.tags as 'service', then there is no way to set a second tag like 'dev' dynamically. Both of the tags ['service', 'dev'] are needed during init in order for TF to select the workspace service-dev automatically.
I ended up using the tfe provider in order to set up workspaces (with tags) automatically (see the sketch below). In the end I still needed to set TF_WORKSPACE=service-dev
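A minimal sketch of that tfe-provider approach, using the hypothetical service-dev names from this thread:
# Pre-create the tagged workspace so TF_WORKSPACE=service-dev can select it during init
resource "tfe_workspace" "service_dev" {
  name         = "service-dev"
  organization = "tf-organization"
  tag_names    = ["service", "dev"]
}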
It doesn't make sense to refer to terraform.workspace as part of the workspaces block inside a cloud block, because that block defines which remote workspaces Terraform will use and therefore dictates what final value terraform.workspace will have in the rest of your configuration.
To declare that your Terraform configuration belongs to more than one workspace in Terraform Cloud, you can assign each of those workspaces the tag "MyService" and then use the tags argument instead of the name argument:
cloud {
  organization = "tf-organization"

  workspaces {
    tags = ["MyService"]
  }
}
If you assign that tag to hypothetical MyService-dev and MyService-prod workspaces in Terraform Cloud and then initialize with the configuration above, Terraform will present those two workspaces for selection using the terraform workspace commands when working in this directory.
terraform.workspace will then appear as either MyService-dev or MyService-prod, depending on which one you have selected.
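In practice, selecting one of the tagged workspaces would look roughly like this (a sketch using the hypothetical workspace names above; in automation you would typically export TF_WORKSPACE instead of selecting interactively):
terraform init
terraform workspace select MyService-dev   # or: export TF_WORKSPACE=MyService-dev
terraform plan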

Terraform: How can I reference Terraform cloud environmental variables?

I'm using Terraform cloud. I would like to take advantage of using AWS Tags with my resources. I want to tag each resource defined in Terraform with the current GIT Branch Name. That way I can separate dev from production.
Terraform Cloud has a list of environment variables available during cloud runs, one of which references the Git branch name:
TFC_CONFIGURATION_VERSION_GIT_BRANCH - This is the name of the branch that the associated Terraform configuration version was ingressed from (e.g. master).
How can I reference the TFC_CONFIGURATION_VERSION_GIT_BRANCH environmental variable in the following resource for an example VPC?
resource "aws_vpc" "example_vpc" {
cidr_block = "10.0.0.0/16"
tags = {
product = var.product
stage = var.TFC_CONFIGURATION_VERSION_GIT_BRANCH
}
}
reference: https://www.terraform.io/docs/language/values/variables.html#environment-variables
I figured it out! Wish the documentation was clearer on this with the cloud.
You will have to set an empty variable. I defined mine in variables.tf as:
variable "TFC_CONFIGURATION_VERSION_GIT_BRANCH" {
type = string
default = ""
}
Per the documentation I linked in the question, TFC_CONFIGURATION_VERSION_GIT_BRANCH is injected automatically into the environment variables with each cloud run. Defining a variable with the full name of the environment variable worked.
resource "aws_vpc" "example_vpc" {
cidr_block = "10.0.0.0/16"
tags = {
product = var.product
stage = var.TFC_CONFIGURATION_VERSION_GIT_BRANCH
}
}
Then the plan output was successful in the cloud:
Terraform will perform the following actions:
# aws_vpc.example_vpc will be updated in-place
~ resource "aws_vpc" "example_vpc" {
id = "vpc-0b19679e6464b8481"
~ tags = {
~ "stage" = "None" -> "develop"
# (1 unchanged element hidden)
}
# (14 unchanged attributes hidden)
}
Terraform Cloud serves as a remote execution environment for Terraform CLI (amongst other things) and so when you configure a workspace in Terraform Cloud many of the settings are about the context where Terraform CLI will run, and how Terraform Cloud will run it.
Part of that configuration model is the idea of environment variables, which correspond with the same environment variables you might set in your shell when running Terraform CLI locally. As with local Terraform, those environment variables are not directly usable from your configuration but are instead settings for other systems that Terraform and providers will interact with, such as the AWS_ACCESS_KEY_ID environment variable conventionally used by AWS software as a way to statically configure the access key identifier to use.
Terraform Cloud also allows you to set "Terraform Variables", and those correspond with Input Variables in the Terraform language. These are the settings for your Terraform configuration itself, as opposed to other software it will interact with, and so you can refer to these by declaring them using variable blocks and then using expressions like var.example elsewhere in the root module. Internally, Terraform Cloud is passing the configured values to Terraform CLI by generating a file called terraform.tfvars, which Terraform CLI looks for as a default source of variable values.
Both of these types of variables are useful for different purposes, and so most Terraform workspaces include a mixture of environment variables for configuring external systems and "Terraform variables" for configuring the current Terraform configuration itself.
For the benefit of folks using Terraform CLI outside of Terraform Cloud, Terraform CLI actually also offers a way to set Input Variables using environment variables, and technically you can do that within Terraform Cloud too because it's ultimately just running Terraform with environment variables set in the same way as you might locally. That's not the intended way to use Terraform Cloud, and so I'm mentioning it only for completeness because the terminology overlap here might be confusing.
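For completeness, that mechanism is the TF_VAR_ prefix: an environment variable named TF_VAR_example populates an input variable declared as variable "example". A minimal sketch (the variable name here is hypothetical):
# variables.tf
variable "example" {
  type = string
}

# shell: the TF_VAR_ prefix maps the environment variable onto the declared input variable
export TF_VAR_example="some value"
terraform plan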
Terraform cloud workspace variables can be set as category "terraform" or as category "env". In a remote execution setup, you are unable to reference workspace vars defined as "env" from your terraform code. Instead, those vars will be automatically injected into the execution environment. To be able to reference a workspace variable in terraform code, set the variable type to "terraform" but do not check the "HCL" tick. Defining a blank variable is still needed. Hope this helps someone! Unfortunately, the documentation is very unclear about this.
Terraform Cloud: Create a workspace variable
key: example_variable
value: example_value
category: terraform
Implementation: Configuration (.tf) file
Declare a blank variable
Reference your variable
# Declaring a blank variable
variable "example_variable" {}

# Referencing a variable (generic placeholder block)
block "name" {
  input = var.example_variable
}
Snowflake CI/CD Example
A CI/CD pipeline for Snowflake with GitHub Actions and Terraform, configured similarly to Snowflake's quick start guide.
It implements a private key instead of a user password.
variable "snowflake_private_key" {}
terraform {
required_providers {
snowflake = {
source = "chanzuckerberg/snowflake"
version = "0.25.17"
}
}
backend "remote" {
organization = "terraform-snowflake"
workspaces {
name = "gh-actions"
}
}
}
provider "snowflake" {
username = "snowflake-username"
account = "snowflake-account"
region = "snowflake-region"
private_key = var.snowflake_private_key
role = "SYSADMIN"
}
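The snowflake_private_key input is deliberately left empty in the configuration; in Terraform Cloud it would typically be set as a sensitive Terraform workspace variable, while for a local run one way to supply it is via the TF_VAR_ mechanism described above (a sketch with a hypothetical key path):
export TF_VAR_snowflake_private_key="$(cat ./snowflake_tf_key.p8)"
terraform plan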

Terraform partial remote backend cannot contain interpolations?

I am trying to configure a Terraform enterprise workspace in Jenkins on the fly. To do this, I need to be able to set the remote backend workspace name in my main.tf dynamically. Like this:
# Using a single workspace:
terraform {
  backend "remote" {
    hostname     = "app.xxx.xxx.com"
    organization = "YYYY"

    # new workspace variable
    workspaces {
      name = "${var.workspace_name}"
    }
  }
}
Now when I run:
terraform init -backend-config="workspace_name=testtest"
I get:
Error loading backend config: 1 error(s) occurred:
* terraform.backend: configuration cannot contain interpolations
The backend configuration is loaded by Terraform extremely early, before
the core of Terraform can be initialized. This is necessary because the backend
dictates the behavior of that core. The core is what handles interpolation
processing. Because of this, interpolations cannot be used in backend
configuration.
If you'd like to parameterize backend configuration, we recommend using
partial configuration with the "-backend-config" flag to "terraform init".
Is what I want to do possible with terraform?
You can't put any variables like "${var.workspace_name}" or interpolations into the backend remote state configuration.
However, you can create a separate file alongside it with your backend values. In the main.tf file it could look like this:
# Terraform backend state store
terraform {
  backend "s3" {}
}
and then a dev.backend.tfvars file, for instance:
bucket         = "BUCKET_NAME"
encrypt        = true
key            = "BUCKET_KEY"
dynamodb_table = "DYNAMODB_NAME"
region         = "AWS_REGION"
role_arn       = "IAM_ROLE_ARN"
You can use partial configuration for the s3 backend this way, passing that file to terraform init (see the example below).
Hope it helps.
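A typical init invocation with the values file would be:
terraform init -backend-config=dev.backend.tfvars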
Hey I found the correct way to do this:
While the syntax is a little tricky, the remote backend supports partial backend initialization. What this means is that the configuration can contain a backend block like this:
terraform {
  backend "remote" { }
}
And then Terraform can be initialized with a dynamically set backend configuration like this (replacing ORG and WORKSPACE with appropriate values):
terraform init -backend-config "organization=ORG" -backend-config 'workspaces=[{name="WORKSPACE"}]'

Terraform terraform_remote_state Partial Configuration

My team relies heavily on S3 remote state from within Terraform. We use the -backend-config feature of the CLI to specify the S3 configuration when initializing projects, so our actual terraform code looks like:
terraform {
backend "s3" {}
}
The above works great as long as all the S3 attributes are specified on the CLI with -backend-config.
We would like to use a similar strategy for referencing these states elsewhere in our configurations. Since the parameters for the backend are dynamic and specified on the CLI, we are looking to do the same.
data "terraform_remote_state" "dns" {
backend = "s3"
config {
key = "configurations/production/dns/terraform.tfstate"
}
}
In the above example, we've omitted the required region and bucket parameters, which of course causes plan/apply to fail (with not a valid region:).
Is there a method by which we can specify the region and bucket for remote state references from the CLI instead of hard-coding them?
The backend block is rather special because it gets processed so early in Terraform's workflow, and thus it doesn't have access to normal Terraform features such as variables. That's why it has its own special mechanism for configuring it.
The terraform_remote_state data source, on the other hand, is just a regular data source and so any normal interpolation strategy can be used with it. To pass settings from the CLI, for example, you could use variables:
variable "dns_state_region" {
}
variable "dns_state_key" {
}
data "terraform_remote_state" "dns" {
backend = "s3"
config {
region = "${var.dns_state_region}"
key = "${var.dns_state_key}"
}
}
You can then pass these to the terraform plan command:
$ terraform plan \
    -var="dns_state_region=us-west-1" \
    -var="dns_state_key=configurations/production/dns/terraform.tfstate"
