Terraform addresses re-use of components via modules. So I might put the definition for an AWS Auto Scaling group in a module and then have multiple top-level resource files use that ASG. Fine so far.
My question is: how do I use Terraform to group and organize multiple top-level resource files? In other words, what is the next level of organization?
We have a system that has multiple applications...each application would correspond to a TF resource file and those resource files would use the modules. We have different customers that use different sets of the applications so we need to keep them in their own resource files.
We're asking if there is a TF concept for deploying multiple top level resource files (applications to us).
At some point, you can't or it doesn't make sense to abstract any further. You will always have a top-level resource file (i.e. main.tf) describing the modules to use. You can organize these top-level resource files via:
Use Terraform Workspaces
You can use workspaces - in your case, maybe one per client name. Each workspace has its own backing Terraform state. You can then reference the terraform.workspace value in your Terraform code. Workspaces can also be used to target different environments.
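For example (a minimal sketch - the module path, arguments, and sizing values here are assumptions, not anything from the question):

locals {
  client = terraform.workspace   # e.g. "clienta" or "clientb"
}

module "app_asg" {
  source = "./modules/asg"       # hypothetical ASG module
  name   = "app-${local.client}"
  # purely illustrative per-client sizing
  desired_count = local.client == "clienta" ? 4 : 2
}

Create and select the workspaces with terraform workspace new clienta and terraform workspace select clienta before planning.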
Use Separate Statefiles
Have one top-level configuration (and therefore state file) for each of your clients, e.g. clienta.main.tf, clientb.main.tf, etc. You could have them all in the same repository and use a script to run them individually, or in whatever pattern you prefer; or you could have one repository per client.
You can also combine separate state files with workspaces to target individual environments, e.g. staging and production, for each client. The Terraform docs go into more detail about workspaces and some of their downsides.
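For instance, each client's top-level configuration could point at its own backend while workspaces handle the environments. A sketch, assuming an S3 backend (the bucket and key names are placeholders):

# top-level configuration for client A (sketch)
terraform {
  backend "s3" {
    bucket = "clienta-terraform-state"
    key    = "main.tfstate"
    region = "us-east-1"
  }
}

With the S3 backend, each non-default workspace (staging, production, ...) is stored as a separate state object under a per-workspace prefix, so the environments stay isolated within that client's backend.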
Related
I am a bit confused about how a complex Terraform folder structure would be managed in a single Terraform state file.
Assuming I have the following structure:
modules/
backend-app/
frontend-app/
root-infra/
The modules folder contains reusable code.
backend-app is not a module, but actual resources that describe my backend "stuff".
frontend-app is not a module, but actual resources that describe my frontend "stuff".
root-infra - let's assume I have an additional folder called "root-infra" which holds my VPC/gateway/network and some common infra stuff.
I can't understand how everything would be run from a single state file.
For example, if I add some resources in backend-app, I would run plan/apply from the backend-app folder, but this would result in all my common infra + frontend being deleted.
So I'm assuming that even if I make a change in my backend-app folder, I still need to run plan/apply from my root-infra folder, assuming that the main.tf there includes the backend-app (and also the frontend, etc.).
Am I right?
If so, how would I import my backend/frontend folders into my root-infra main.tf? And why would backend/frontend be any different from a regular module?
From the structure you provided, it seems you should actually have multiple state files - one for root-infra, one for backend-app and one for frontend-app.
Each directory's state file will then manage the resources located in it. Using one single state file as you mentioned (assuming you're using remote state here, as local state files would already solve that problem) means that when you run Terraform in root-infra, it 'thinks' those are the only resources you're deploying.
Next, when you move to backend-app and try to deploy from there with the same state file used in root-infra, Terraform no longer sees the root-infra resources in this directory; it only sees new backend-app resources. It will attempt to delete the root-infra resources and replace them with the backend-app ones. The same thing will happen later when you deploy frontend-app.
The only solution here is to have different state files, each managing a unique stack of resources. root-infra, backend-app and frontend-app are each one stack that should be managed individually.
If you wanted to manage all of them from one single state file, your structure would have to change so that the entire thing is one or two stacks max: one for infra, one for applications. It would be as if you were deploying all resources from a single directory, identifying the different apps by having different .tf files in the same directory. E.g.:
tf/modules/network/dns.tf
tf/modules/network/output.tf
tf/modules/network/variables.tf
tf/infra/main_infra.tf
tf/infra/vars_infra.tf
tf/infra/infra_remote_state.tf
tf/apps/main_frontend.tf
tf/apps/main_backend.tf
tf/apps/apps_remote_state.tf
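If the apps stack needs values from the infra stack (VPC IDs, subnet IDs, and so on), it can read them from the infra state instead of sharing a single state file. A rough sketch, assuming an S3 backend and an output named private_subnet_id exposed by the infra stack (all names here are placeholders):

# tf/apps/main_backend.tf (sketch)
data "terraform_remote_state" "infra" {
  backend = "s3"
  config = {
    bucket = "example-tf-state"          # assumed state bucket
    key    = "infra/terraform.tfstate"   # assumed key used by the infra stack
    region = "us-east-1"
  }
}

module "backend_app" {
  source    = "../modules/backend_app"   # hypothetical module
  subnet_id = data.terraform_remote_state.infra.outputs.private_subnet_id
}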
I am migrating some manually provisioned infrastructure over to Terraform.
Currently I am manually defining the Terraform resources in .tf files and importing the existing remote objects with terraform import. I then run terraform plan multiple times, each time modifying the local .tf files until they match the existing infrastructure.
How can I speed this process up by downloading the remote state directly into a .tf resource file?
The mapping from the configuration as written in .tf files to real infrastructure that's indirectly represented in state snapshots is a lossy one.
A typical Terraform configuration has arguments of one resource derived from attributes of another, uses count or for_each to systematically declare multiple similar objects, and might use Terraform modules to decompose the problem and reuse certain components.
All of that context is lost in the mapping to real remote objects, and so there is no way to recover it and generate idiomatic .tf files that would be ready to use. Therefore you would always need to make some modifications to the configuration in order to produce a useful Terraform configuration.
While keeping that caveat in mind, you can review the settings of the objects you've added with terraform import by running the terraform show command. Its output is intended to be read by humans rather than machines, but it presents the information using Terraform-language-like formatting, so it can be a starting point for a Terraform configuration. It won't always be fully valid, and it will typically need at least some adjustments before terraform plan accepts it and before it is useful for ongoing work.
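As a rough illustration of that workflow (the resource type and bucket name below are placeholders, not from the question): write a minimal stub, import the real object into it, then copy attributes from terraform show into the stub until terraform plan reports no changes.

# Minimal stub; attributes are filled in by hand from terraform show output
resource "aws_s3_bucket" "legacy" {
  bucket = "example-legacy-bucket"   # assumed name of the existing bucket
}

Then terraform import aws_s3_bucket.legacy example-legacy-bucket brings the object under management, and terraform show prints its current attributes for you to copy across.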
Environment Isolation: Dirs vs. Workspaces vs. Modules
The Terraform docs Separate Development and Production Environments seem to take two major approaches for handling a "dev/test/stage" type of CI environment, i.e.
Directory separation - seems messy, especially when you potentially have multiple repos
Workspaces + Different Var Files
Except when you look up workspaces, the docs seem to imply workspaces are NOT a correct solution for isolating environments.
In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments.
I would also like to consider using remote state -- e.g. azurerm backend
Best Practice Questions
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
e.g.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "${var.environment}.terraform.tfstate"
  }
}
vs.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "cache_cluster_${var.environment}.terraform.tfstate"
  }
}
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
A re-usable module for your infrastructure would essentially encapsulate the part of your infrastructure that is common to all your "dev/test/stage" environments. So no, you wouldn't have any "dev/test/stage" folders in there.
If, for example, you have an infrastructure that consists of a Kubernetes cluster and a MySQL database, you could have two modules - a 'compute' module that handles the k8s cluster, and a 'storage' module that would handle the DB. These modules go into a /modules subfolder. Your root module (main.tf file in the root of your repo) would then instantiate these modules and pass the appropriate input variables to customize them for each of the "dev/test/stage" environments.
Normally it would be a bit more complex:
Any shared VPC or firewall config might go into a networking module.
Any service accounts that you might automatically create might go into a credentials or iam module.
Any DNS mappings for API endpoints might go into a dns module.
You can then easily pass in variables to customize the behavior for "dev/test/stage" as needed.
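A rough sketch of what that root module might look like (module names, paths, and variables below are illustrative only):

variable "environment" {
  type = string              # "dev", "test" or "stage"
}

variable "cluster_size" {
  type    = number
  default = 1
}

module "compute" {
  source       = "./modules/compute"
  environment  = var.environment
  cluster_size = var.cluster_size
}

module "storage" {
  source      = "./modules/storage"
  environment = var.environment
}

Each environment then gets its own set of values, for example terraform apply -var-file=dev.tfvars, or a separate configuration per environment with its own backend, as the docs quote above suggests.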
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
Going off the Terraform docs and their recommended separation:
In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls.
You would not share tfstorageaccount. Now take this with a grain of salt and determine your own needs - essentially what you need to take into account is the security and data integrity implications of sharing backends/credentials. For example:
How sensitive is your state? If you have sensitive variables being output to your state, then you might not want your "production" state sitting in the same security perimeter as your "test" state.
Will you ever need to wipe your state or perform destructive actions? If, for example, your state storage provider only versions by folder, then you probably don't want your "dev/test/stage" states sitting next to each other.
I am using orchestration tools to automate Terraform deployments using the open-source version. I would like to know more about the workspace options that are available.
More specifically, what happens when two developers execute a Terraform deployment in two different workspaces at the same time using the same executable? The scope of the templates is within the directory; however, what will be the scope of the workspaces? Is the scope tied to a single Terraform executable, or is it also directory driven?
Any help is much appreciated!
By default, it's also directory driven. But you can configure different backend types (AWS S3, PostgreSQL, etc).
If you use AWS S3, each workspace will represent different files inside the bucket. If you use PostgreSQL, each workspace will represent a different row in the states table.
If two developers execute terraform apply for two different workspaces you won't have any problem, but you'll need state locking to avoid two executions against the same workspace at the same time, because that can potentially corrupt your state.
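For example, with the S3 backend a DynamoDB table provides that locking (the bucket and table names below are assumptions):

terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "project/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"   # enables state locking
    encrypt        = true
  }
}

Each workspace gets its own state object under that key, and a second apply against the same workspace will fail to acquire the lock (or wait, with -lock-timeout) instead of corrupting the state.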
If one has two AWS accounts, one for development and one for live (for example) I am aware that one can use terraform workspaces to manage the state of each environment.
However, if I switch workspace from "dev" to "live", is there a way to tell Terraform it should now be applying the state to the live account rather than the test one?
One way I thought of, which is error prone, would be to swap my secret.auto.tfvars file each time I switch workspaces, since I presume that when running with a different access key (the one belonging to the "live" account) the AWS provider will then apply to that account. However, it'd be very easy to switch workspaces but have the wrong credentials present, which would run the changes against the wrong environment.
I'm looking for a way to almost link a workspace with an account id in AWS.
I did find this https://github.com/hashicorp/terraform/issues/13700 but it refers to the deprecated env command; this comment in particular looked somewhat promising.
Update
I have found some information on GitHub, where I left this comment as a reply to an earlier comment that recommended considering modules instead of workspaces and indicated that workspaces aren't well suited to this task. If anyone can provide information on how modules could be used to maintain multiple versions of the "same" infrastructure concurrently, I'd be keen to see how this improves upon the workspace concept.
Here's how you could use Terraform modules to structure your live vs dev environments that point to different AWS accounts, but the environments both have/use the same Terraform code.
This is one (of many) ways that you could structure your dirs; you could even put the modules into their own Git repo, but I'm going to try not to confuse things too much. In this example, you have a simple app that has 1 EC2 instance and 1 RDS database. You write whatever Terraform code you need in the modules/*/ subdirs, making sure to parameterize whatever attributes are different across environments.
Then in your dev/ and live/ dirs, main.tf should be the same, while provider.tf and terraform.tfvars reflect environment-specific info. main.tf would call the modules and pass in the env-specific params.
modules/
|-- ec2_instance/
|-- rds_db/
dev/
|-- main.tf # --> uses the 2 modules
|-- provider.tf # --> has info about dev AWS account
|-- terraform.tfvars # --> has dev-specific values
live/
|-- main.tf # --> uses the 2 modules
|-- provider.tf # --> has info about live/prod AWS account
|-- terraform.tfvars # --> has prod-specific values
When you need to plan/apply either env, you drop into the appropriate dir and run your TF commands there.
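As a minimal sketch of what the environment-specific files might contain (the region, profile name, and module arguments are placeholders):

# dev/provider.tf
provider "aws" {
  region  = "eu-west-1"
  profile = "dev-account"    # AWS credentials profile for the dev account
}

# dev/main.tf
module "ec2_instance" {
  source        = "../modules/ec2_instance"
  instance_type = "t3.micro"       # live/ would pass a larger type
}

module "rds_db" {
  source         = "../modules/rds_db"
  instance_class = "db.t3.micro"   # live/ would pass a production class
}

In practice those per-environment values would usually come from each directory's terraform.tfvars rather than being hard-coded, but the shape is the same.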
As for why this is preferred over using Terraform Workspaces, the TF docs explain it well:
In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments.
BTW, Terraform merely renamed the env subcommand to workspace when they decided that 'env' was a bit too confusing.
Hope this helps!
Terraform workspaces hold the state information. They connect to user accounts based on the way in which AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are used in the environment. In order to link any given workspace to an AWS account they'd have to store user credentials in some kind of way, which they understandably don't. Therefore, I would not expect to see workspaces ever directly support that.
But, to almost link a workspace to an account you just need to automatically switch AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY each time the workspace is switched. You can do this by writing a wrapper around terraform which:
Passes all commands on to the real terraform unless it finds workspace select in the command line.
Upon finding workspace select in the command line, parses out the workspace name.
Exports the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for the AWS account which you want to link to the workspace.
Finishes by passing the workspace command on to the real terraform.
This would load in the correct credentials each time
terraform workspace select <WORKSPACE>
was used
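For completeness, the provider block itself can then stay free of credentials; assuming standard AWS provider behaviour, it will pick up whatever AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY the wrapper exported for the selected workspace:

provider "aws" {
  region = "eu-west-1"
  # no access_key/secret_key here: the provider reads them from the
  # environment variables exported by the wrapper
}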