How to manage locally generated stateful files in Terraform - terraform

I have a Terraform (1.0+) script that generates a local config file from a template based on some inputs, e.g:
locals {
config_tpl = templatefile("${path.module}/config.tpl", {
foo = "bar"
})
}
resource "local_file" "config" {
content = local._config_tpl
filename = "${path.module}/config.yaml"
}
This file is then used by a subsequent command run from a local-exec block, which in turn also generates local config files:
resource "null_resource" "my_command" {
provisioner "local-exec" {
when = create
command = "../scripts/my_command.sh"
working_dir = "${path.module}"
}
depends_on = [
local_file.config,
]
}
my_command.sh generates infrastructure for which there is no Terraform provider currently available.
All of the generated files should form part of the configuration state, as they are required later during upgrades and ultimately to destroy the environment.
I also would like to run these scripts from a CI/CD pipeline, so naturally you would expect the workspace to be clean on each run, which means the generated files won't be present.
Is there a pattern for managing files such as these? My initial though is to create cloud storage bucket, zip the files up, and store them there before pulling them back down whenever they're needed. However, this feels even more dirty than what is already happening, and it seems like there is the possibility to run into dependency issues.
Or, am I missing something completely different to solve issues such as this?

The problem you've encountered here is what the warning in the hashicorp/local provider's documentation is discussing:
Terraform primarily deals with remote resources which are able to outlive a single Terraform run, and so local resources can sometimes violate its assumptions. The resources here are best used with care, since depending on local state can make it hard to apply the same Terraform configuration on many different local systems where the local resources may not be universally available. See specific notes in each resource for more information.
The short and unfortunate answer is that what you are trying to do here is not a problem Terraform is designed to address: its purpose is to manage long-lived objects in remote systems, not artifacts on your local workstation where you are running Terraform.
In the case of your config.yaml file you may find it a suitable alternative to use a cloud storage object resource type instead of local_file, so that Terraform will just write the file directly to that remote storage and not affect the local system at all. Of course, that will help only if whatever you intend to have read this file is also able to read from the same cloud storage, or if you can write a separate glue script to fetch the object after terraform apply is finished.
There is no straightforward path to treating the result of a provisioner as persistent data in the state. If you use provisioners then they are always, by definition, one-shot actions taken only during creation of a resource.

Related

Terraform provisioner trigger only for new instances / only run once

I have conditional provision steps I want to run for all compute instances created, but only run once.
I know I can put the provisioning within the compute resource, but then it cannot be conditional.
If I put it in a null_resource, I need a trigger, and I don't know how to trigger on only the newly created resources (i.e. if I already have 1 instance, and want to scale to 2, I want to only run provisioning on the 2nd being created, not run again on the 1st which is already provisioned).
How can I get a variable that only gives me the id or ip of the instance just created, as opposed to all of them?
Below an example of the provisioner.
resource "null_resource" "provisioning" {
count = var.condition ? length(var.instance_ips) : 0
triggers = {
instance_ids = join(",", var.instance_ips)
}
connection {
agent = false
timeout = "4m"
host = var.instance_ips[count.index]
user = "user"
private_key = var.ssh_private_key
}
provisioner "remote-exec" {
inline = [ do something, then remove the public key from authorized_keys ]
}
}
PS: the reason I only can run once (as opposed to run again and do nothing if already provisioned) is that I want to destroy the provisioning public key after I'm done, since it is using a tf generated key pair and the private key ends up in the state file, I want to make sure someone who gets access to the key pair still cannot access the instance.
Once the public key is removed from the authorized_keys the provisioner running a second time will just fail to connect, timeout and fail.
I found that I can use the on_failure: continue key, but then if it actual fails for legitimate reasons it would continue too.
I also could use a key pair that is generated locally with a local-exec provisioner so it doesn't show in the state file, but then the key is a file, which is not much different if someone get access to it; the file needs to stay on the machine, which may not work well with a cloud resource manager env that is recreated on a need to run basis.
And then I'm sure there are other ways to provision a file or script, but in this case it contains instance dependency data generated by TF, that I don't want left in a cloud-init.
So, I come down to needing to figure a way to use a trigger that only contains the new instance(s)
Any ideas how to do this?
https://www.terraform.io/docs/provisioners/
This documentation lists provisioners as a last resource and provides some suggestions on how to avoid having to use it, for various common resources.
Execute the script from the user_data, which is specifically designed for provisional, run-once actions. Since defining the user_data supports all regular Terraform interpolation, you can use that opportunity to pass environment variables or selectively include/exclude parts of a script, if you need conditional logic.
The downside is that any change in user_data results in recreating the instances, or creating a new launch configuration/template.

terraform interpolation with variables returning error [duplicate]

# Using a single workspace:
terraform {
backend "remote" {
hostname = "app.terraform.io"
organization = "company"
workspaces {
name = "my-app-prod"
}
}
}
For Terraform remote backend, would there be a way to use variable to specify the organization / workspace name instead of the hardcoded values there?
The Terraform documentation
didn't seem to mention anything related either.
The backend configuration documentation goes into this in some detail. The main point to note is this:
Only one backend may be specified and the configuration may not contain interpolations. Terraform will validate this.
If you want to make this easily configurable then you can use partial configuration for the static parts (eg the type of backend such as S3) and then provide config at run time interactively, via environment variables or via command line flags.
I personally wrap Terraform actions in a small shell script that runs terraform init with command line flags that uses an appropriate S3 bucket (eg a different one for each project and AWS account) and makes sure the state file location matches the path to the directory I am working on.
I had the same problems and was very disappointed with the need of additional init/wrapper scripts. Some time ago I started to use Terragrunt.
It's worth taking a look at Terragrunt because it closes the gap between Terraform and the lack of using variables at some points, e.g. for the remote backend configuration:
https://terragrunt.gruntwork.io/docs/getting-started/quick-start/#keep-your-backend-configuration-dry

Use variable in Terraform remote backend

# Using a single workspace:
terraform {
backend "remote" {
hostname = "app.terraform.io"
organization = "company"
workspaces {
name = "my-app-prod"
}
}
}
For Terraform remote backend, would there be a way to use variable to specify the organization / workspace name instead of the hardcoded values there?
The Terraform documentation
didn't seem to mention anything related either.
The backend configuration documentation goes into this in some detail. The main point to note is this:
Only one backend may be specified and the configuration may not contain interpolations. Terraform will validate this.
If you want to make this easily configurable then you can use partial configuration for the static parts (eg the type of backend such as S3) and then provide config at run time interactively, via environment variables or via command line flags.
I personally wrap Terraform actions in a small shell script that runs terraform init with command line flags that uses an appropriate S3 bucket (eg a different one for each project and AWS account) and makes sure the state file location matches the path to the directory I am working on.
I had the same problems and was very disappointed with the need of additional init/wrapper scripts. Some time ago I started to use Terragrunt.
It's worth taking a look at Terragrunt because it closes the gap between Terraform and the lack of using variables at some points, e.g. for the remote backend configuration:
https://terragrunt.gruntwork.io/docs/getting-started/quick-start/#keep-your-backend-configuration-dry

understand the terraform for OCI

I have 3 questions here:
I have created terraform form scripts in Oracle Cloud Infrastructure to build the instance and other resources. But I am not able to get any script for route table configuration and service in network script. So i have made them manual. my current table has only the resource name, rest all configuration is blank. So i need help in getting a properly supported script for OCI to create route table with configuration.
As i did such things manually, i am not able to give terraform apply after doing some changes in the script, as terraform apply will delete all the rules which i created manually. So is it mandatory to give terraform apply every time when i change the script? or can i enter the config manually and simultaneously match that in the terraform script so that everything is intact?
After every terraform changes i could see 2 files is getting enlarge (terraform.tfstate, terraform.tfstate.backup) what are these two files? if that is a backup file, then how will it help me to restore if i mess up in my configuration?
In Terraform, the configuration script is always the source of truth. When you apply a configuration; Terraform will favor the settings of that configuration and override any changes that were manually done outside of Terraform.
To make sure your manual changes are not overwritten, you should make sure the configuration always matches the manual changes. One way to import manual resources into your configuration is using "terraform import" (see https://www.terraform.io/docs/import/index.html).
The terraform.tfstate and terraform.tfstate.backup files are used by Terraform to keep track of the latest state of the resources that Terraform has created. These files are used to help Terraform determine whether you configuration script has drifted from the state; so it knows how to apply your configuration script. To my knowledge, these state files are not intended to be backup files if you mess up your configuration. (see https://www.terraform.io/docs/state/index.html)
Hope this helps.
Here is an example for a route table resource in Terraform config file:
resource "oci_core_route_table" "webserver-rt" {
compartment_id = "${var.compartment_ocid}"
vcn_id = "${oci_core_virtual_network.oci-vcn.id}"
display_name = "webserver-rt"
route_rules = [{
destination = "0.0.0.0/0"
network_entity_id = "${oci_core_internet_gateway.internet-gateway.id}"
}]
}
You may find more details here: https://github.com/terraform-providers/terraform-provider-oci/blob/master/docs/examples/networking/route_table/route_table.tf

How to use multiple Terraform Providers sequentially

How can I get Terraform 0.10.1 to support two different providers without having to run 'terraform init' every time for each provider?
I am trying to use Terraform to
1) Provision an API server with the 'DigitalOcean' provider
2) Subsequently use the 'Docker' provider to spin up my containers
Any suggestions? Do I need to write an orchestrating script that wraps Terraform?
Terraform's current design struggles with creating "multi-layer" architectures in a single configuration, due to the need to pass dynamic settings from one provider to another:
resource "digitalocean_droplet" "example" {
# (settings for a machine running docker)
}
provider "docker" {
host = "tcp://${digitalocean_droplet.example.ipv4_address_private}:2376/"
}
As you saw in the documentation, passing dynamic values into provider configuration doesn't fully work. It does actually partially work if you use it with care, so one way to get this done is to use a config like the above and then solve the "chicken-and-egg" problem by forcing Terraform to create the droplet first:
$ terraform plan -out=tfplan -target=digitalocean_droplet.example
The above will create a plan that only deals with the droplet and any of its dependencies, ignoring the docker resources. Once the Docker droplet is up and running, you can then re-run Terraform as normal to complete the setup, which should then work as expected because the Droplet's ipv4_address_private attribute will then be known. As long as the droplet is never replaced, Terraform can be used as normal after this.
Using -target is fiddly, and so the current recommendation is to split such systems up into multiple configurations, with one for each conceptual "layer". This does, however, require initializing two separate working directories, which you indicated in your question that you didn't want to do. This -target trick allows you to get it done within a single configuration, at the expense of an unconventional workflow to get it initially bootstrapped.
Maybe you can use a provider instance within your resources/module to set up various resources with various providers.
https://www.terraform.io/docs/configuration/providers.html#multiple-provider-instances
The doc talks about multiple instances of same provider but I believe the same should be doable with distinct providers as well.
A little bit late...
Well, got the same Problem. My workaround is to create modules.
First you need a module for your docker Provider with an ip variable:
# File: ./docker/main.tf
variable "ip" {}
provider "docker" {
host = "tcp://${var.ip}:2375/"
}
resource "docker_container" "www" {
provider = "docker"
name = "www"
}
Next one is to load that modul in your root configuration:
# File: .main.tf
module "docker01" {
source = "./docker"
ip = "192.169.10.12"
}
module "docker02" {
source = "./docker"
ip = "192.169.10.12"
}
True, you will create on every node the same container, but in my case that's what i wanted. I currently haven't found a way to configure the hosts with an individual configuration. Maybe nested modules, but that didn't work in the first tries.

Resources