How to have multiple providers dependent upon environment? - terraform

I have two different AWS configurations. On a dev laptop, the developer uses an MFA-secured profile inside a shared_credentials_file.
On Jenkins, we export environment variables and then assume a role.
This means that the provider blocks look really different. At the root level, they share the same backend.tf.
I know I can have two different roots with different providers, but is there a way to avoid duplicating backend.tf and the other root files?

I understand your point, but it is not recommended. Instead, have the AWS configuration ready via system environment variables before you run any Terraform commands:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN (optional)
AWS_DEFAULT_REGION
AWS_DEFAULT_PROFILE (optional)
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html
https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html
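A minimal sketch of what this buys you (assuming the provider can resolve everything from the environment): the provider block carries no credentials or profile at all, so the exact same root configuration works on the developer laptop and on Jenkins.
# Sketch only: credentials and region are resolved from AWS_ACCESS_KEY_ID,
# AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN and AWS_DEFAULT_REGION / AWS_REGION,
# so nothing environment-specific needs to live in the .tf files.
provider "aws" {
}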

The solution that I think makes the most sense is to place local developers and Jenkins automation into two separate environment directories, each with its own aws.tf and backend workspace.
This makes sense because developers should not be messing with resources created by automation, and any operations against the Jenkins backend should be done by Jenkins; otherwise devs could overwrite resources that Jenkins put up, and vice versa.
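A hypothetical sketch of what the two per-environment aws.tf files might look like (region, profile name and role ARN are placeholders, and the shared-credentials argument name differs between AWS provider v3 and v4+):
# dev/aws.tf -- developer laptop, MFA-secured profile from the shared credentials file
provider "aws" {
  region  = "us-east-1"  # example region
  profile = "dev-mfa"    # assumed profile name
}
# jenkins/aws.tf -- CI, static keys exported as environment variables, then assume a role
provider "aws" {
  region = "us-east-1"
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/jenkins-deploy"  # example ARN
  }
}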

Related

How would one create an isolated Jenkins build node (without access to secrets)?

As the Top 10 CI/CD Security Risks SEC-04 states:
Ensure that pipelines running unreviewed code are executed on isolated nodes, not exposed to secrets and sensitive environments.
The above statement seems especially true when the code (or the pipeline code itself) is in a pull request that has not yet been seen/approved/merged, but from a developer perspective you want to know whether it builds successfully in the first place. Running code that nobody has laid eyes on while it has access to build secrets is definitely a security risk.
I'm wondering whether isolation is achievable with Jenkins build nodes, as I cannot find any specific options for this.
My assumption is that dynamically provisioned containerized agents are best suited for isolated environments; I'm just not sure how to prevent their access to secrets from the Jenkins controller.

Terraform Best Practice Multi-Environment, Modules, and State

Environment Isolation: Dirs vs. Workspaces vs. Modules
The Terraform docs on Separate Development and Production Environments seem to take two major approaches to handling a "dev/test/stage" type of CI environment, i.e.
Directory separation - seems messy, especially when you potentially have multiple repos
Workspaces + different var files
Except when you look up workspaces, the docs seem to imply that workspaces are NOT a correct solution for isolating environments.
In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments.
I would also like to consider using remote state -- e.g. azurerm backend
Best Practice Questions
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
e.g.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "${var.environment}.terraform.tfstate"
  }
}
vs.
terraform {
  backend "azurerm" {
    storage_account_name = "tfstorageaccount"
    container_name       = "tfstate"
    key                  = "cache_cluster_${var.environment}.terraform.tfstate"
  }
}
When the docs refer to using a "re-usable" module, what would this look like if, say, I had an existing configuration folder? Would I still need to create a separate folder for dev/test/stage?
A re-usable module for your infrastructure would essentially encapsulate the part of your infrastructure that is common to all your "dev/test/stage" environments. So no, you wouldn't have any "dev/test/stage" folders in there.
If, for example, you have an infrastructure that consists of a Kubernetes cluster and a MySQL database, you could have two modules - a 'compute' module that handles the k8s cluster, and a 'storage' module that would handle the DB. These modules go into a /modules subfolder. Your root module (main.tf file in the root of your repo) would then instantiate these modules and pass the appropriate input variables to customize them for each of the "dev/test/stage" environments.
Normally it would be a bit more complex:
Any shared VPC or firewall config might go into a networking module.
Any service accounts that you might automatically create might go into a credentials or iam module.
Any DNS mappings for API endpoints might go into a dns module.
You can then easily pass in variables to customize the behavior for "dev/test/stage" as needed.
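A hedged sketch of what such a root main.tf could look like; module sources, variable names and values below are illustrative, not part of the original answer:
# Root main.tf -- instantiates the shared modules and passes env-specific values
module "compute" {
  source       = "./modules/compute"
  cluster_name = "app-${var.environment}"
  node_count   = var.node_count  # e.g. 1 for dev, 3 for prod
}
module "storage" {
  source  = "./modules/storage"
  db_name = "appdb-${var.environment}"
  db_sku  = var.db_sku
}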
When using remote backends, should the state file be shared across repos by default, or separated by repo and environment?
Going off the Terraform docs and their recommended separation:
In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls.
You would not share tfstorageaccount. Now take this with a grain of salt and determine your own needs - essentially what you need to take into account is the security and data integrity implications of sharing backends/credentials. For example:
How sensitive is your state? If you have sensitive variables being output to your state, then you might not want your "production" state sitting in the same security perimeter as your "test" state.
Will you ever need to wipe your state or perform destructive actions? If, for example, your state storage provider only versions by folder, then you probably don't want your "dev/test/stage" states sitting next to each other.
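To make the separation concrete, here is a hypothetical pair of backend blocks, each living in its own configuration with its own storage account (names are placeholders):
# live/backend.tf
terraform {
  backend "azurerm" {
    storage_account_name = "tfstateprod"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}
# dev/backend.tf
terraform {
  backend "azurerm" {
    storage_account_name = "tfstatedev"
    container_name       = "tfstate"
    key                  = "dev.terraform.tfstate"
  }
}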

Is there any way to run a script inside already existing infrastructure using terraform

I want to use Terraform to run a script inside an existing instance on any cloud, where the instance was pre-created manually. Is there any way to push my script to this instance and run it using Terraform?
If yes, then how can I connect to the instance using Terraform, push my script, and run it?
I believe Ansible is a better option to achieve this easily.
Refer to the example given here:
https://docs.ansible.com/ansible/latest/modules/script_module.html
Create a .tf file and describe your already existing resource (e.g. a VM) there
Import the existing resource using terraform import
If this is a VM, then add your script to the remote machine using the file provisioner and run it using remote-exec - both steps are described in the Terraform file, no manual changes needed (see the sketch after this list)
Run terraform plan to see whether the expected changes are OK, then terraform apply if the plan was fine
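A hedged sketch of steps 2-3, with one variation: because creation-time provisioners only run when Terraform itself creates a resource, the provisioners below hang off a null_resource (from the hashicorp/null provider) that targets the imported instance. Resource names, AMI, SSH user and script path are all illustrative.
# Imported with: terraform import aws_instance.existing <instance-id>
resource "aws_instance" "existing" {
  ami           = "ami-0123456789abcdef0"  # must match the real instance
  instance_type = "t3.micro"
}
resource "null_resource" "run_script" {
  triggers = {
    script_hash = filemd5("scripts/setup.sh")  # re-run when the script changes
  }
  connection {
    type        = "ssh"
    host        = aws_instance.existing.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/id_rsa")
  }
  provisioner "file" {
    source      = "scripts/setup.sh"
    destination = "/tmp/setup.sh"
  }
  provisioner "remote-exec" {
    inline = [
      "chmod +x /tmp/setup.sh",
      "sudo /tmp/setup.sh",
    ]
  }
}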
Terraform's core mission is to create, update, and destroy long-lived infrastructure objects. It is not generally concerned with the software running in the compute instances it deploys. Instead, it generally expects each object it is deploying to behave as a sort of specialized "appliance", either by being a managed service provided by your cloud vendor or because you've prepared your own machine image outside of Terraform that is designed to launch the relevant workload immediately when the system boots. Terraform then just provides the system with any configuration information required to find and interact with the surrounding infrastructure.
A less-ideal way to work with Terraform is to use its provisioners feature to do late customization of an image just after it's created, but that's considered to be a last resort because Terraform's lifecycle is not designed to include strong support for such a workflow, and it will tend to require a lot more coupling between your main system and its orchestration layer.
Terraform has no mechanism intended for pushing arbitrary files into existing virtual machines. If your virtual machines need ongoing configuration maintenance after they've been created (by Terraform or otherwise), then that's a use case for traditional configuration management software such as Ansible, Chef, Puppet, etc., rather than for Terraform.

Is it possible to link a terraform workspace to an AWS account

If one has two AWS accounts, one for development and one for live (for example), I am aware that one can use Terraform workspaces to manage the state of each environment.
However, if I switch workspace from "dev" to "live", is there a way to tell Terraform it should now be applying the state to the live account rather than the test one?
One way I thought of, which is error prone, would be to swap my secret.auto.tfvars file each time I switch workspace, since I presume that when running with a different access key (the one belonging to the "live" account) the AWS provider will then apply to that account. However, it'd be very easy to swap workspace and have the wrong credentials present, which would run the changes against the wrong environment.
I'm looking for a way to almost link a workspace with an account id in AWS.
I did find this https://github.com/hashicorp/terraform/issues/13700 but it refers to the deprecated env command; this comment looked somewhat promising in particular.
Update
I have found some information on GitHub where I left this comment as a reply to an earlier comment which recommended considering modules instead of workspaces, and which actually indicates that workspaces aren't well suited to this task. If anyone can provide information on how modules could be used to solve this issue of maintaining multiple versions of the "same" infrastructure concurrently, I'd be keen to see how this improves upon the workspace concept.
Here's how you could use Terraform modules to structure your live vs dev environments that point to different AWS accounts, but the environments both have/use the same Terraform code.
This is one (of many) ways that you could structure your dirs; you could even put the modules into their own Git repo, but I'm going to try not to confuse things too much. In this example, you have a simple app that has 1 EC2 instance and 1 RDS database. You write whatever Terraform code you need in the modules/*/ subdirs, making sure to parameterize whatever attributes are different across environments.
Then in your dev/ and live/ dirs, main.tf should be the same, while provider.tf and terraform.tfvars reflect environment-specific info. main.tf would call the modules and pass in the env-specific params.
modules/
|-- ec2_instance/
|-- rds_db/
dev/
|-- main.tf # --> uses the 2 modules
|-- provider.tf # --> has info about dev AWS account
|-- terraform.tfvars # --> has dev-specific values
live/
|-- main.tf # --> uses the 2 modules
|-- provider.tf # --> has info about live/prod AWS account
|-- terraform.tfvars # --> has prod-specific values
When you need to plan/apply either env, you drop into the appropriate dir and run your TF commands there.
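As a hedged illustration of the environment-specific files (account IDs, profile and variable names are placeholders, and allowed_account_ids is an optional guard beyond the original answer that makes the config refuse to run against the wrong account):
# dev/provider.tf -- live/provider.tf mirrors this with the prod profile and account ID
provider "aws" {
  region              = "us-east-1"
  profile             = "dev"              # or an assume_role block
  allowed_account_ids = ["111111111111"]   # dev account only
}
# dev/main.tf and live/main.tf are identical
module "ec2_instance" {
  source        = "../modules/ec2_instance"
  instance_type = var.instance_type
}
module "rds_db" {
  source         = "../modules/rds_db"
  instance_class = var.db_instance_class
}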
As for why this is preferred over using Terraform workspaces, the TF docs explain it well:
In particular, organizations commonly want to create a strong separation between multiple deployments of the same infrastructure serving different development stages (e.g. staging vs. production) or different internal teams. In this case, the backend used for each deployment often belongs to that deployment, with different credentials and access controls. Named workspaces are not a suitable isolation mechanism for this scenario.
Instead, use one or more re-usable modules to represent the common elements, and then represent each instance as a separate configuration that instantiates those common elements in the context of a different backend. In that case, the root module of each configuration will consist only of a backend configuration and a small number of module blocks whose arguments describe any small differences between the deployments.
BTW, Terraform merely changed the env subcommand to workspace when they decided that 'env' was a bit too confusing.
Hope this helps!
Terraform workspaces hold the state information. They connect to user accounts based on the way in which AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are used in the environment. In order to link any given workspace to an AWS account, workspaces would have to store user credentials in some way, which they understandably don't. Therefore, I would not expect to see workspaces ever directly support that.
But to almost link a workspace to an account, you just need to automatically switch AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY each time the workspace is switched. You can do this by writing a wrapper around terraform which:
Passes all commands on to the real terraform unless it finds workspace select in the command line.
Upon finding workspace select in the command line, parses out the workspace name.
Exports the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY for the AWS account which you want to link to the workspace.
Finishes by passing the workspace command on to the real terraform.
This would load in the correct credentials each time terraform workspace select <WORKSPACE> was used.

should I configure my EC2 using user_data or Ansible

When launching EC2 using Terraform (or CloudFormation), we can configure the instance by putting some scripts in user_data/remote-exec. Alternatively, we can configure EC2 using Ansible/Chef, etc. What is the difference between configuring EC2 with user_data/remote-exec and doing it with Ansible/Chef? When should you use the former, and when the latter (I know Ansible/Chef is idempotent)?
In my case, the EC2 instance was originally launched manually, then manually configured using a lot of Linux commands, and those commands were not written by me. Now I am the person automating the whole structure using Terraform and configuring the EC2 instances. Using user_data/remote-exec to configure EC2 is straightforward: I just need to put all the existing Linux commands into some scripts with a few changes. And if the configuration result from my script is not successful, at least I can quickly figure out whether I missed some commands by comparing my script with the original Linux commands. But if I use Ansible/Chef, I have to rewrite all the steps in a different language. And if the configuration is not what I expected, it is harder for me to figure out which steps are incorrect, because the syntax of Ansible/Chef and plain Linux commands are totally different.
My question is: in my case, should I use Ansible/Chef or user_data/remote-exec for configuration?
User data is good for initial configuration of the system. If you need longer-term maintenance, configuration management software like Ansible/Chef/Salt/Puppet is a great option.
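A minimal sketch of the user_data approach, assuming the existing shell commands are collected into a bootstrap script; the AMI, instance type and script path are placeholders:
resource "aws_instance" "app" {
  ami           = "ami-0123456789abcdef0"        # placeholder AMI
  instance_type = "t3.micro"
  user_data     = file("scripts/bootstrap.sh")   # runs once at first boot
}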
Packer can be used for immutable infrastructure, i.e. infrastructure that doesn't change after creation. You can run all the scripts and installs when building the image so the system is ready as soon as it boots; this is also faster because you don't have to wait for user data to run.
A few questions you have to ask as well: how often are you going to patch these instances? Are you going to update them in place or replace them with new ones? Ansible is great for configuration since it's just YAML files.
Blue/green deployments generally replace servers with all new ones and gradually move traffic over to the new servers.