How to refresh remote state from another module - terraform

Looking for advice and some general guidance on using Terraform, in my case with states completely split by service.
This is an example of the directory structure and how it's maintained.
global
├── iam
│   └── .tf files
├── kms
│   └── .tf files
└── organization
    └── root
        ├── .tf files
        ├── project_1_OU
        │   └── .tf files
        ├── project_2_OU
        │   └── .tf files
        └── project_3_OU
            └── .tf files
Every state lives in its service folder. For example:
global/organization/root has the list of all accounts in the organization, controls the Resource Access Manager shares, SCPs at the global level, etc.
One of its outputs is the following, which returns the list of all accounts:
output "root_organization_accounts" {
value = "${data.template_file.organization_accounts_list.*.rendered}"
}
The KMS service, on the other hand, uses this output from the remote state to update its policy with the list of new accounts.
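For context, global/kms consumes that output through a terraform_remote_state data source, roughly like this (a sketch assuming an S3 backend; bucket, key and region are hypothetical):

data "terraform_remote_state" "organization_root" {
  backend = "s3"

  config = {
    bucket = "my-terraform-states"                         # hypothetical bucket
    key    = "global/organization/root/terraform.tfstate"  # hypothetical key
    region = "eu-west-1"                                   # hypothetical region
  }
}

# The account list is then available as
# data.terraform_remote_state.organization_root.outputs.root_organization_accounts
# (on Terraform 0.11 the outputs are attributes directly, without the .outputs prefix)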
Scenario I'm trying:
create global/organization/root/project_x_OU
this creates the wanted OU structure under Root, with sub-accounts, SCPs, etc.
the accounts list is now different and needs to be refreshed (in the state one level above)!
terraform apply/refresh now has to be run manually to change the output of global/organization/root (nothing is reported as changed, which is normal in this case - it's only an output change)
now we should run terraform apply at the global/kms level so that the policy is updated with the newly added accounts
In short, the global/organization/root state is used read-only at the global/organization/root/project_x_OU and global/kms levels.
A change in global/organization/root/project_x_OU should trigger state refresh in global/organization/root level so that global/kms level can be updated with new data.
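In other words, the manual sequence today looks roughly like this (the refresh in the middle is the step I'd like to trigger automatically):

cd global/organization/root/project_x_OU
terraform apply      # creates the new OU and accounts
cd ..
terraform refresh    # refresh the root state so its outputs list the new accounts
cd ../../kms
terraform apply      # picks up the refreshed account list from the remote state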
My questions:
What would be the best way to trigger a refresh on another remote state if we know that it will change and we need it to change so another module can be updated?
Am I doing something wrong, and if so, what?
Should I create a fake/unused variable at the global/organization/root level and increment it with every new account just to force a change - is that even possible?
Any other advice, and how is it done at your place?
I'm using Atlantis (https://www.runatlantis.io/) as a CI/CD solution in combination with plain Terraform, and it doesn't like to run applies when there are no changes (at the global/organization/root level in this case).

Related

Using Terraform workspaces in an automation pipeline with TF_WORKSPACE: Currently selected workspace “X” does not exist

I am working on an open source project with Terraform that will allow me to set up ad-hoc environments through GitHub Actions. Each ad-hoc environment will correspond to a terraform workspace. I'm setting the workspace by exporting TF_WORKSPACE before running terraform init, plan and apply. This works the first time around. For example, I'm able to create an ad-hoc environment called alpha. In my S3 backend I can see that the state file is saved under the alpha folder. The issue is that when I run the same pipeline to create another ad-hoc environment called beta, I get the following message:
Initializing the backend...
╷
│ Error: Currently selected workspace "beta" does not exist
│
╵
Error: Process completed with exit code 1.
Here is the section of my GitHub action that is failing: https://github.com/briancaffey/django-step-by-step/blob/main/.github/workflows/ad_hoc_env_create_update.yml#L110-L142
I have been over this article: https://support.hashicorp.com/hc/en-us/articles/360043550953-Selecting-a-workspace-when-running-Terraform-in-automation but I'm still not sure what I'm doing wrong in my automation pipeline.
The alpha workspace did not exist, but it seemed to be able to create it and use it as the workspace in my first run. I'm not sure why other workspaces are not able to be created using the same pipeline.
I got some help from apparentlymart on the Terraform community forum. Here's the reply: https://discuss.hashicorp.com/t/help-using-terraform-workspaces-in-an-automation-pipeline-with-tf-workspace-currently-selected-workspace-x-does-not-exist/40676/2
In order to make the pipeline work I had to use terraform commands in the following order:
terraform init ...
terraform workspace new ${WORKSPACE} || echo "Workspace ${WORKSPACE} already exists or cannot be created"
export TF_WORKSPACE=$WORKSPACE
terraform apply ...
terraform output ...
This allows me to create multiple ad hoc environments without any errors. The code in my example project has also been updated with this fix: https://github.com/briancaffey/django-step-by-step/blob/main/.github/workflows/ad_hoc_env_create_update.yml#L110-L146

How can I use terraform to set up in multiple regions?

I have a Terraform configuration file without modules and it is hosted in production.
Currently I am trying to deploy it in another region using provider aliases and modules. But when I plan the same configuration, it says that it needs to destroy my previous resources and recreate them via modules.
When I modularise the files, are they treated by Terraform as new resources? I have around 35 resources, and it says 35 to destroy and 35 to create. In the process of modularising I removed the .terraform folder under the root and initialised it in the main module.
Is this the expected behaviour?
Yes, it is the expected behaviour, because the resources now have different Terraform identifiers. When Terraform runs plan or apply, it looks in the state file and cannot find the new identifiers listed there.
You can solve this situation using the terraform state mv command. I have used it successfully, but it is a tedious task and mistakes are easy to make. I recommend making additional backups of your state file via the -backup-out option and checking the plan after each run.
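For example, moving an existing resource address under its new module address might look like this (a sketch; the resource and module names are hypothetical):

# Move one resource into the new module path, writing an extra state backup as we go
terraform state mv \
  -backup-out=pre-move.tfstate \
  aws_instance.web \
  module.web_service.aws_instance.web

Repeat this for each of the 35 resources, re-running terraform plan after each move until it no longer wants to destroy and recreate anything.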
I would suggest that you look into Terragrunt to make your code more DRY. Once you've integrated with Terragrunt you can start using a hierarchy like this:
root
├── terragrunt.hcl
├── dev_account
│   ├── account.hcl
│   ├── us-east-2
│   │   ├── region.hcl
│   │   └── services
│   ├── us-east-1
│   └── us-west-2
├── uat_account
│   ├── us-east-2
│   │   └── services
│   ├── us-east-1
│   └── us-west-2
└── prod_account
    ├── us-east-2
    │   └── services
    ├── us-east-1
    └── us-west-2
Terragrunt will allow you to have one module, and with a hierarchy similar to this you'll be able to parse the account.hcl and region.hcl files to deploy to different regions without having to re-enter this information.
Terragrunt also lets you run apply-all if you choose to apply all child Terraform configurations at once, though I wouldn't suggest this for production.
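The account.hcl / region.hcl parsing mentioned above usually lives in each service's terragrunt.hcl; a sketch (the module source path and input names are hypothetical):

locals {
  # Walk up the folder tree and load the account- and region-level settings
  account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  region_vars  = read_terragrunt_config(find_in_parent_folders("region.hcl"))
}

terraform {
  source = "../../../../modules//my-service"   # hypothetical module path
}

inputs = {
  aws_account_id = local.account_vars.locals.aws_account_id   # hypothetical key in account.hcl
  aws_region     = local.region_vars.locals.aws_region        # hypothetical key in region.hcl
}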

Organising folders in terraform

I have a Terraform setup in which I am creating resources in AWS; I am using S3, EC2 and also Kubernetes. For Kubernetes I have more than 5 .tf files, so I created a folder called kube-aws and placed those .tf files there. Right now I have a setup like below:
scripts/
├── s3.tf
├── ec2.tf
└── kube-aws/
    ├── web-deploy.tf
    ├── web-service.tf
    ├── app-deploy.tf
    └── app-service.tf
Is this the right approach? Will Terraform pick up the .tf files from the kube-aws folder as well, or do I need to do anything else to make this work?
The resources in the kube-aws directory will not be included when scripts is your working directory. The scripts directory is considered the root module in this instance (see the Modules documentation):
The .tf files in your working directory when you run terraform plan or terraform apply together form the root module.
You have two options to include kube-aws resources:
Move them up to the scripts directory.
Create a module block in one of the scripts/*.tf files and pass in required variables.
For example, in, say, s3.tf:
module "kube_aws" {
source = "./kube-aws"
// pass in your variables here
}
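If you go the module route, the kube-aws directory also has to declare any variables you pass in; a minimal sketch (the variable name is hypothetical):

# kube-aws/variables.tf (sketch)
variable "cluster_name" {
  type        = string
  description = "Name used by the Kubernetes resources in this module"
}

# Any outputs declared in kube-aws/outputs.tf are then read from the
# root module as module.kube_aws.<output_name>.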
The choice you make is entirely up to you but the guidance in when to write a module is pretty persuasive:
We do not recommend writing modules that are just thin wrappers around single other resource types. If you have trouble finding a name for your module that isn't the same as the main resource type inside it, that may be a sign that your module is not creating any new abstraction and so the module is adding unnecessary complexity. Just use the resource type directly in the calling module instead.
I would recommend option 1 above, i.e. move your .tf files into a single directory, at least until you are clear about when to use a module and how best to structure one. I would also highly recommend getting acquainted with the official (and excellent) documentation on Modules and Module Composition, as well as looking at example modules in the Terraform Registry and their associated source code (links to source can be found on module pages).

How to force users to specify `-target=module.<environment>` when running `terraform apply`?

I have a terraform script that provisions a web service on AWS.
Now I want to reuse this script in different environments (production, stage, dev, ...).
Here's my folder structure:
.
├── main.tf
└── core_module
    ├── outputs.tf
    ├── my_service.tf
    └── variables.tf
main.tf contains something like the following:
module "prod-service" {
source = "./core_module"
env_specific_variable = "this is a production environment"
...
}
module "stage-service" {
source = "./core_module"
env_specific_variable = "this is a stage environment"
...
}
module "dev-service" {
source = "./core_module"
env_specific_variable = "this is a dev environment"
...
}
When I want to create the service on the production environment, I run
terraform apply -target=module.prod-service \
  -var 'access_key=<prod_access_key>' \
  -var 'secret_key=<prod_secret_key>' \
  -var 'region=<prod_region>'
And when I want to create the service on the stage environment, I run
terraform apply -target=module.stage-service \
  -var 'access_key=<stage_access_key>' \
  -var 'secret_key=<stage_secret_key>' \
  -var 'region=<stage_region>'
How can I force the users of this script to add the -target option so that they don't create all the environments in one single command?
Because different environments require different aws_access_key_id and aws_secret_access_key values, running terraform apply to create all environments at once will produce errors.
There are ways to force variables to be set or to be validated, but I don't know of a way to enforce that -target command-line option. You'd have to use a variable instead and then change your main.tf to only deploy the environment-specific version of the module based on some check of that variable.
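For example, forcing an environment variable to be set and limited to known values could look like this (a sketch assuming Terraform 0.13+ custom validation; the variable name is an assumption):

variable "environment" {
  type        = string
  description = "Which environment to deploy: prod, stage or dev"

  validation {
    # Reject any value that is not one of the known environments
    condition     = contains(["prod", "stage", "dev"], var.environment)
    error_message = "The environment must be one of: prod, stage, dev."
  }
}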
However, it looks like you aren't using Terraform the way it is intended. There is a concept called workspaces that you should use to manage the state of different environments, instead of creating duplicate resources in one script for all the different environments.
To create a new workspace use something like
terraform workspace new development
terraform workspace new staging
Then create a variables file for each environment (the variable itself is declared in your configuration, e.g. in variables.tf; the .tfvars files only assign values):
development.tfvars
env_specific_variable = "this is a dev environment"
staging.tfvars
env_specific_variable = "this is a staging environment"
Then in main.tf you deploy just one instance of your module and use the workspace-specific value instead of a literal string.
main.tf
module "my-service" {
  source                = "./core_module"
  env_specific_variable = var.env_specific_variable
  ...
}
To deploy development, use
terraform workspace select development
terraform apply -var-file=development.tfvars
To deploy staging, use
terraform workspace select staging
terraform apply -var-file=staging.tfvars
Note: I am typing this on mobile and haven't run this. There might be typos or syntax issues depending on your TF version.
Resources:
https://www.terraform.io/docs/state/workspaces.html
https://www.terraform.io/docs/configuration/variables.html
https://blog.gruntwork.io/how-to-manage-terraform-state-28f5697e68fa (a good article in general, but specifically check out the section on workspace isolation)
This is not something that the -target argument is appropriate for. The Terraform documentation is explicit that the -target argument is for exceptional situations only, and gives some advice on what to do instead:
This targeting capability is provided for exceptional circumstances, such as recovering from mistakes or working around Terraform limitations. It is not recommended to use -target for routine operations, since this can lead to undetected configuration drift and confusion about how the true state of resources relates to configuration.
Instead of using -target as a means to operate on isolated portions of very large configurations, prefer instead to break large configurations into several smaller configurations that can each be independently applied. Data sources can be used to access information about resources created in other configurations, allowing a complex system architecture to be broken down into more manageable parts that can be updated independently.
In your case, I'd recommend to make each environment a separate Terraform configuration with its own separate Terraform state. Since you've factored out your environments into a shared module, each of them will contain only the module block to call it:
module "service" {
source = "../../modules/core"
env_specific_variable = "this is a production environment"
...
}
A common directory structure for this approach is:
environments/
  prod/
    main.tf
  stage/
    main.tf
  dev/
    main.tf
modules/
  core/
    variables.tf
    service.tf
    outputs.tf
To apply changes to a specific environment:
cd environments/prod
terraform init
terraform apply
By configuring Terraform like this, you avoid the need to force users to add extra options to their Terraform executions by using Terraform in the standard way. The environment configurations being separate has other benefits too, such as being able to use different provider versions in different environments in case you want to try rolling out a new version on your staging environment first before changing the production configuration.
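For the states to stay separate, each environment's main.tf would also carry its own backend configuration; a sketch assuming an S3 backend (bucket, key and region are hypothetical):

# environments/prod/main.tf
terraform {
  backend "s3" {
    bucket = "my-terraform-state"       # hypothetical bucket
    key    = "prod/terraform.tfstate"   # one key per environment keeps the states isolated
    region = "us-east-1"                # hypothetical region
  }
}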

Different environments for Terraform (Hashicorp)

I've been using Terraform to build my AWS stack and have been enjoying it. If it was to be used in a commercial setting the configuration would need to be reused for different environments (e.g. QA, STAGING, PROD).
How would I be able to achieve this? Would I need to create a wrapper script that makes calls to terraform's cli while passing in different state files per environment like below? I'm wondering if there's a more native solution provided by Terraform.
terraform apply -state=qa.tfstate
I suggest you take a look at the hashicorp best-practices repo, which has quite a nice setup for dealing with different environments (similar to what James Woolfenden suggested).
We're using a similar setup, and it works quite nicely. However, this best-practices repo assumes you're using Atlas, which we're not. We've created quite an elaborate Rakefile, which basically (going by the best-practices repo again) gets all the subfolders of /terraform/providers/aws, and exposes them as different builds using namespaces. So our rake -T output would list the following tasks:
us_east_1_prod:init
us_east_1_prod:plan
us_east_1_prod:apply
us_east_1_staging:init
us_east_1_staging:plan
us_east_1_staging:apply
This separation prevents changes which might be exclusive to dev from accidentally affecting (or worse, destroying) something in prod, as it's a different state file. It also allows testing a change in dev/staging before actually applying it to prod.
Also, I recently stumbled upon this little write up, which basically shows what might happen if you keep everything together:
https://charity.wtf/2016/03/30/terraform-vpc-and-why-you-want-a-tfstate-file-per-env/
Paul's solution with modules is the right idea. However, I would strongly recommend against defining all of your environments (e.g. QA, staging, production) in the same Terraform file. If you do, then whenever you're making a change to staging, you risk accidentally breaking production too, which partially defeats the point of keeping those environments isolated in the first place! See Terraform, VPC, and why you want a tfstate file per env for a colorful discussion of what can go wrong.
I always recommend storing the Terraform code for each environment in a separate folder. In fact, you may even want to store the Terraform code for each "component" (e.g. a database, a VPC, a single app) in separate folders. Again, the reason is isolation: when making changes to a single app (which you might do 10 times per day), you don't want to put your entire VPC at risk (which you probably never change).
Therefore, my typical file layout looks something like this:
stage
  └ vpc
      └ main.tf
      └ vars.tf
      └ outputs.tf
  └ app
  └ db
prod
  └ vpc
  └ app
  └ db
global
  └ s3
  └ iam
All the Terraform code for the staging environment goes into the stage folder, all the code for the prod environment goes into the prod folder, and all the code that lives outside of an environment (e.g. IAM users, S3 buckets) goes into the global folder.
For more info, check out How to manage Terraform state. For a deeper look at Terraform best practices, check out the book Terraform: Up & Running.
Please note that from version 0.10.0, Terraform supports the concept of Workspaces (called environments in 0.9.x).
A workspace is a named container for Terraform state. With multiple workspaces, a single directory of Terraform configuration can be used to manage multiple distinct sets of infrastructure resources.
See more info here: https://www.terraform.io/docs/state/workspaces.html
As you scale up your Terraform usage, you will need to share state (between devs, build processes and different projects) and support multiple environments and regions.
For this you need to use remote state.
Before you execute your Terraform, you need to set up your state (I'm using PowerShell):
$environment="devtestexample"
$region ="eu-west-1"
$remote_state_bucket = "${environment}-terraform-state"
$bucket_key = "yoursharedobject.$region.tfstate"
aws s3 ls "s3://$remote_state_bucket"|out-null
if ($lastexitcode)
{
aws s3 mb "s3://$remote_state_bucket"
}
terraform remote config -backend S3 -backend-config="bucket=$remote_state_bucket" -backend-config="key=$bucket_key" -backend-config="region=$region"
#(see here: https://www.terraform.io/docs/commands/remote-config.html)
terraform apply -var='environment=$environment' -var='region=$region'
Your state is now stored in S3, by region, by environment, and you can then access this state in other tf projects.
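Note that terraform remote config comes from the legacy pre-0.9 workflow; on current Terraform versions the equivalent is an empty backend block in the configuration (for example terraform { backend "s3" {} }) plus partial configuration at init time, roughly (a sketch in the same PowerShell style):

terraform init `
  -backend-config="bucket=$remote_state_bucket" `
  -backend-config="key=$bucket_key" `
  -backend-config="region=$region"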
There is absolutely no need to have separate codebases for dev and prod environments. Best practice dictates (I mean DRY) that you are actually better off having one code base and simply parametrizing it, as you would normally do when developing software - you DON'T have separate folders for the development version and the production version of an application. You only need to ensure the right deployment scheme. The same goes for Terraform. Consider this "Hello world" idea:
terraform-project
├── etc
│   ├── backend
│   │   ├── dev.conf
│   │   └── prod.conf
│   └── tfvars
│       ├── dev.tfvars
│       └── prod.tfvars
└── src
    └── main.tf
contents of etc/backend/dev.conf
storage_account_name = "tfremotestates"
container_name = "tf-state.dev"
key = "terraform.tfstate"
access_key = "****"
contents of etc/backend/prod.conf
storage_account_name = "tfremotestates"
container_name = "tf-state.prod"
key = "terraform.tfstate"
access_key = "****"
contents of etc/tfvars/dev.tfvars
environment = "dev"
contents of etc/tfvars/prod.tfvars
environment = "prod"
contents of src/main.tf
terraform {
  backend "azurerm" {
  }
}

provider "azurerm" {
  version  = "~> 2.56.0"
  features {}
}

resource "azurerm_resource_group" "rg" {
  name     = "rg-${var.environment}"
  location = "us-east"
}
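One thing this sketch glosses over: for var.environment to be assignable from the tfvars files, the variable also has to be declared somewhere under src/, for example (a minimal declaration):

# src/variables.tf
variable "environment" {
  type        = string
  description = "Deployment environment name, e.g. dev or prod"
}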
Now you only have to pass the appropriate values to the CLI invocation, e.g.:
export ENVIRONMENT=dev
terraform init -backend-config=etc/backend/${ENVIRONMENT}.conf
terraform apply -var-file=etc/tfvars/${ENVIRONMENT}.tfvars
This way:
we have separate state files for each environment (so they can even be deployed in different subscriptions/accounts)
we have the same code base, so we are sure the differences between dev and prod are small and we can rely on dev for testing purposes before going live
we follow DRY directive
we follow KISS directive
no need to use obscure "workspaces" interface!
Of course, in order for this to be fully secure you should incorporate some kind of git flow and code review, perhaps some static or integration testing, an automatic deployment process, etc. But I consider this the best approach to having multiple Terraform environments without duplicating code, and it has worked very nicely for us for a couple of years now.
No need to make a wrapper script. What we do is split our env into a module and then have a top-level terraform file where we just import that module for each environment. As long as your module is set up to take enough variables, generally env_name and a few others, you're good. As an example:
# project/main.tf
module "dev" {
  source          = "./env"
  env             = "dev"
  aws_ssh_keyname = "dev_ssh"
}

module "stage" {
  source          = "./env"
  env             = "stage"
  aws_ssh_keyname = "stage_ssh"
}

# Then in project/env/main.tf all the resources would be defined,
# along with variables for env, aws_ssh_keyname, etc.
Edit 2020/03/01
This answer is pretty old at this point, but it's worth updating. The critique that dev and stage sharing the same state file is bad is a matter of perspective. For the exact code provided above it's completely valid, because dev and stage share the same code as well; thus "breaking dev will wreck your stage" is correct. The critical thing that I didn't note when writing this answer was that source = "./env" could also be written as source = "git::https://example.com/network.git//modules/vpc?ref=v1.2.0".
Doing that makes your entire repo become something of a submodule to the TF scripts allowing you to split out one branch as your QA branch and then tagged references as your Production envs. That obviates the problem of wrecking your staging env with a change to dev.
Next, state file sharing. I say that's a matter of perspective because with one single run it's possible to update all your environments. In a small company that time saving when promoting changes can be useful, and some trickery with -target is usually enough to speed up the process if you're careful, if that's even really needed. We found it less error-prone to manage everything from one place and one Terraform run, rather than having multiple different configurations possibly being applied slightly differently across the environments. Having them all in one state file forced us to be more disciplined about what really needed to be a variable vs. what was just overkill for our purposes. It also very strongly prevented us from allowing our environments to drift too far apart from each other. When your terraform plan output shows 2k lines, and the differences are mainly because dev and stage look nothing like prod, the frustration factor alone encouraged our team to bring that back to sanity.
A very strong counter-argument to that is if you're in a large company where various compliance rules prevent you from touching dev/stage/prod at the same time. In that scenario it's better to split up your state files; just make sure that how you're running terraform apply is scripted. Otherwise you run the very real risk of those state files drifting apart when someone says "Oh, I just need to -target just this one thing in staging. We'll fix it next sprint, promise." I've seen that spiral quickly multiple times now, making any kind of comparison between the environments questionable at best.
From Terraform version 0.10+ there is a way to maintain separate state files using the workspace commands:
$ terraform workspace list // The command will list all existing workspaces
$ terraform workspace new <workspace_name> // The command will create a workspace
$ terraform workspace select <workspace_name> // The command select a workspace
$ terraform workspace delete <workspace_name> // The command delete a workspace
First, you need to create a workspace for each of your environments:
$ terraform workspace new dev
Created and switched to workspace "dev"!
You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
$ terraform workspace new test
Created and switched to workspace "test"!
You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
$ terraform workspace new stage
Created and switched to workspace "stage"!
You're now on a new, empty workspace. Workspaces isolate their state,
so if you run "terraform plan" Terraform will not see any existing state
for this configuration.
In the backend, a terraform.tfstate.d directory will be created.
Under it you will see 3 directories - dev, test and stage - and each will maintain its state file under its workspace.
All you now need to do is move the environment variable files into another folder and keep only one variable file for each execution of terraform plan / terraform apply:
main.tf
dev_variable.tfvar
output.tf
Remember to switch to the right workspace to use the correct environment's state file:
$ terraform workspace select test
main.tf
test_variable.tfvar
output.tf
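Inside the configuration you can also reference the selected workspace via terraform.workspace, for example to name resources per environment (a sketch; the resource and naming scheme are hypothetical):

resource "aws_s3_bucket" "artifacts" {
  # terraform.workspace resolves to the currently selected workspace (dev, test or stage)
  bucket = "myapp-artifacts-${terraform.workspace}"
}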
Ref : https://dzone.com/articles/manage-multiple-environments-with-terraform-worksp
There are plenty of good answers in this thread. Let me contribute as well with an idea that worked for me and some other teams.
The idea is to have a single "umbrella" project that contains the whole infrastructure code.
Each environment's terraform file includes just a single module - the "main" one.
Then "main" includes resources and other modules:
- terraform_project
  - env
    - dev01          <-- Terraform home, run from here
      - .terraform   <-- git ignored of course
      - dev01.tf     <-- backend, env config, includes always _only_ the main module
    - dev02
      - .terraform
      - dev02.tf
    - stg01
      - .terraform
      - stg01.tf
    - prd01
      - .terraform
      - prd01.tf
  - main             <-- main umbrella module
    - main.tf
    - variables.tf
  - modules          <-- submodules of the main module
    - module_a
    - module_b
    - module_c
And a sample environment home file (e.g. dev01.tf) will look like this
provider "azurerm" {
version = "~>1.42.0"
}
terraform {
backend "azurerm" {
resource_group_name = "tradelens-host-rg"
storage_account_name = "stterraformstate001"
container_name = "terraformstate"
key = "dev.terraform.terraformstate"
}
}
module "main" {
source = "../../main"
subscription_id = "000000000-0000-0000-0000-00000000000"
project_name = "tlens"
environment_name = "dev"
resource_group_name = "tradelens-main-dev"
tenant_id = "790fd69f-41a3-4b51-8a42-685767c7d8zz"
location = "West Europe"
developers_object_id = "58968a05-dc52-4b69-a7df-ff99f01e12zz"
terraform_sp_app_id = "8afb2166-9168-4919-ba27-6f3f9dfad3ff"
kubernetes_version = "1.14.8"
kuberenetes_vm_size = "Standard_B2ms"
kuberenetes_nodes_count = 4
enable_ddos_protection = false
enable_waf = false
}
Thanks to that you:
Can have separate backends for Terraform remote state-files per environment
Be able to use separate system accounts for different environments
Be able to use different versions of providers and Terraform itself per environment (and upgrade one by one)
Ensure that all required properties are provided per environment (Terraform validate won't pass if an environmental property is missing)
Ensure that all resources/modules are always added to all environments. It is not possible to "forget" about a whole module because there is just one.
Check the source blog post for more details.
HashiCorp recommends separate state files and folders, like this:
├── assets
│   └── index.html
├── prod
│   ├── main.tf
│   ├── variables.tf
│   ├── terraform.tfstate
│   └── terraform.tfvars
└── dev
    ├── main.tf
    ├── variables.tf
    ├── terraform.tfstate
    └── terraform.tfvars
There's even documentation on how to refactor a monolithic configuration to support multiple environments according to their best practices. Check it out here:
https://learn.hashicorp.com/tutorials/terraform/organize-configuration#variables-tf
Example: running multiple environments from one configuration
├── env_vars
│   └── qa.tfvars
├── main.tf
├── outputs.tf
├── terraform.tfstate
├── terraform.tfstate.backup
├── _tools
│   └── apply.sh
└── variables.tf
You can then run:
#!/bin/bash
echo "Enter your environment (qa,dev,stage or prod)"
read environment
rm -Rf .terraform/
terraform init -var-file=env_vars/$environment.tfvars #-backend-config="key=$environment/$environment.tf" -backend-config="bucket=<bucket_name>"
terraform apply -var-file=env_vars/$environment.tfvars
