Is it possible to access all resources in a terraform remote state that aren't declared as outputs


I am trying to get some references from terraform remote state, and have noticed some differences between terraform state resources / data, and using a terraform_remote_state data object.
For example, I have a terraform module that created an AWS managed directory, with no outputs. Within that module, I can see all resources in the state - e.g. terraform state show aws_directory_service_directory.ad gives me details on the directory - the directory ID, DNS server addresses, etc.
$ terraform state list
aws_directory_service_directory.ad
$ terraform state show aws_directory_service_directory.ad
# aws_directory_service_directory.ad:
resource "aws_directory_service_directory" "ad" {
access_url = "REDACTED"
alias = "REDACTED"
dns_ip_addresses = []
.... etc ....
}
If I then create a new module and add a terraform_remote_state data object, I cannot access any properties of the directory: data.terraform_remote_state.ad.outputs is empty. From within this new module, if I only have the remote state data object, apply (with no resources), and then use terraform console to show data.terraform_remote_state.ad, it looks like:
$ terraform console
> data.terraform_remote_state.ad
{
"backend" = ".."
"config" = { remote_state config shown here }
"outputs" = {}
}
So the resources are in the state, but not accessible directly. Is this expected behaviour? Is there any way to access the resources in the remote state, or would I need to add attributes into outputs and use data.terraform_remote_state.ad.outputs.whatever_attributes?

You can only access outputs. From docs:
terraform_remote_state only exposes output values
You have to modify the parent module of the other setup and add required outputs.
The other way would be to develop your own, fully custom data source to provide info that you need.
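As a rough sketch of the first option, the configuration that owns the directory could publish the attributes it wants to share, and the consuming configuration could read them through outputs. The output names, the S3 backend, and the bucket/key/region values below are assumptions for illustration, not taken from the question:

```hcl
# In the configuration that creates the directory: publish what consumers need.
output "directory_id" {
  value = aws_directory_service_directory.ad.id
}

output "directory_dns_ip_addresses" {
  value = aws_directory_service_directory.ad.dns_ip_addresses
}

# In the consuming configuration: read the published outputs.
# The backend type and config values are hypothetical placeholders.
data "terraform_remote_state" "ad" {
  backend = "s3"
  config = {
    bucket = "example-state-bucket"
    key    = "ad/terraform.tfstate"
    region = "us-east-1"
  }
}

locals {
  directory_id = data.terraform_remote_state.ad.outputs.directory_id
  dns_servers  = data.terraform_remote_state.ad.outputs.directory_dns_ip_addresses
}
```

After the owning configuration is applied again with the new outputs, data.terraform_remote_state.ad.outputs should no longer be empty.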

Related

How to manage locally generated stateful files in Terraform

I have a Terraform (1.0+) script that generates a local config file from a template based on some inputs, e.g:
locals {
config_tpl = templatefile("${path.module}/config.tpl", {
foo = "bar"
})
}
resource "local_file" "config" {
content = local.config_tpl
filename = "${path.module}/config.yaml"
}
This file is then used by a subsequent command run from a local-exec block, which in turn also generates local config files:
resource "null_resource" "my_command" {
provisioner "local-exec" {
when = create
command = "../scripts/my_command.sh"
working_dir = "${path.module}"
}
depends_on = [
local_file.config,
]
}
my_command.sh generates infrastructure for which there is no Terraform provider currently available.
All of the generated files should form part of the configuration state, as they are required later during upgrades and ultimately to destroy the environment.
I also would like to run these scripts from a CI/CD pipeline, so naturally you would expect the workspace to be clean on each run, which means the generated files won't be present.
Is there a pattern for managing files such as these? My initial thought is to create a cloud storage bucket, zip the files up, and store them there before pulling them back down whenever they're needed. However, this feels even dirtier than what is already happening, and it seems like there is the possibility of running into dependency issues.
Or, am I missing something completely different to solve issues such as this?

The problem you've encountered here is what the warning in the hashicorp/local provider's documentation is discussing:
Terraform primarily deals with remote resources which are able to outlive a single Terraform run, and so local resources can sometimes violate its assumptions. The resources here are best used with care, since depending on local state can make it hard to apply the same Terraform configuration on many different local systems where the local resources may not be universally available. See specific notes in each resource for more information.
The short and unfortunate answer is that what you are trying to do here is not a problem Terraform is designed to address: its purpose is to manage long-lived objects in remote systems, not artifacts on your local workstation where you are running Terraform.
In the case of your config.yaml file you may find it a suitable alternative to use a cloud storage object resource type instead of local_file, so that Terraform will just write the file directly to that remote storage and not affect the local system at all. Of course, that will help only if whatever you intend to have read this file is also able to read from the same cloud storage, or if you can write a separate glue script to fetch the object after terraform apply is finished.
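A minimal sketch of that alternative, assuming AWS S3 as the cloud storage and an AWS provider version that includes aws_s3_object (4.0 or later). The bucket name is a hypothetical placeholder and the bucket is assumed to exist already:

```hcl
locals {
  config_tpl = templatefile("${path.module}/config.tpl", {
    foo = "bar"
  })
}

# Write the rendered config straight to remote storage instead of local disk.
resource "aws_s3_object" "config" {
  bucket  = "example-config-bucket" # hypothetical, must already exist
  key     = "config.yaml"
  content = local.config_tpl
}
```

Whatever runs my_command.sh would then need to fetch config.yaml from the bucket before it starts, which is the glue step mentioned above.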
There is no straightforward path to treating the result of a provisioner as persistent data in the state. If you use provisioners then they are always, by definition, one-shot actions taken only during creation of a resource.

How to point in Terraform which Cloudfront to use?

As an example:
I am deploying a Terraform module in us-east-1 that builds the infrastructure plus a CloudFront distribution. The same module will then be deployed in us-west-1 as the disaster recovery region. Since CloudFront is a global service, how can I tell the module deployed in us-west-1 to use the existing CloudFront distribution?

If these are two modules within the same configuration (i.e. both modules are applied with the same terraform apply command), then you simply pass the aws_cloudfront_distribution resource out of the module where it's created as an output, and pass it into the other module as an input parameter. E.g.:
module1/main.tf
resource "aws_cloudfront_distribution" "mydist" {
...
}
output "mydist" {
value = aws_cloudfront_distribution.mydist
}
module2/main.tf
variable "mydist" {}
main.tf
module "mod1" {
...
}
module "mod2" {
...
mydist = module.mod1.mydist
}
And now you can access the CloudFront distribution resource from within module2 by using var.mydist.
If these are two modules in entirely separate Terraform configurations, you can either:
Use the aws_cloudfront_distribution data source to get the details of a distribution that was created in a separate configuration.
Output the distribution from the configuration where it's created, then use the terraform_remote_state data source to retrieve that output from the remote state file.
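The data source option might look like the following sketch. The variable name distribution_id and the output name are made up for illustration; the distribution ID itself has to be supplied from outside, for example as an input variable:

```hcl
# Hypothetical input: the ID of the distribution created by the us-east-1 deployment.
variable "distribution_id" {
  type = string
}

# Look up the existing distribution instead of creating a new one.
data "aws_cloudfront_distribution" "existing" {
  id = var.distribution_id
}

# Example of consuming an attribute of the existing distribution.
output "cdn_domain_name" {
  value = data.aws_cloudfront_distribution.existing.domain_name
}
```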

terraform lifecycle prevent destroy

I am working with Terraform v0.11 and the AWS provider. I am looking for a way to prevent a few resources from being destroyed during the destroy phase, so I used the following approach.
lifecycle {
prevent_destroy = true
}
When I run a "terraform plan" I get the following error.
the plan would destroy this resource, but it currently has
lifecycle.prevent_destroy set to true. to avoid this error and continue with the plan,
either disable or adjust the scope.
All that I am looking for is a way to avoid destroying one of the resources and its dependencies during the destroy command.

AFAIK, this feature is not yet supported.
You need to remove that resource from the state file, run the destroy, and then re-import it:
terraform plan | grep <resource> | grep id
terraform state rm <resource>
terraform destroy
terraform import <resource> <ID>

The easiest way to do this would be to comment out all of the resources that you want to destroy and then do a terraform apply.

I've found the most practical way to manage this is a combination of a variable that lets the resource in question be conditionally created via count, and having all other resources depend on the associated data source instead of the conditionally created resource.
A good example of this is a Route 53 Hosted Zone which can be a pain to destroy and recreate if you manage your domain outside of AWS and need to update your nameservers, waiting for DNS propagation each time you spin it up.
1. By specifying some variable
variable "should_create_r53_hosted_zone" {
type = bool
description = "Determines whether or not a new hosted zone should be created on apply."
}
2. you can use it alongside count on the resource to conditionally create it.
resource "aws_route53_zone" "new" {
count = var.should_create_r53_hosted_zone ? 1 : 0
name = "my.domain.com"
}
3. Then, by following up with a call to the associated Data Source
data "aws_route53_zone" "existing" {
name = "my.domain.com"
depends_on = [
aws_route53_zone.new
]
}
4. you can give all other resources consistent access to the resource's attributes regardless of whether or not your flag has been set.
resource "aws_route53_record" "rds_reader_endpoint" {
zone_id = data.aws_route53_zone.existing.zone_id
# ...
}
This approach is only slightly better than commenting / uncommenting resources during apply, but at least gives some consistent, documented way of working around it.

Can I use variables in the TerraForm main.tf file?

Ok, so I have three .tf files: main.tf, where I declare Azure as the provider; resources.tf, where all my resources are declared; and variables.tf.
I use variables.tf to store keys used by resources.tf.
However, I want to use variables stored in my variable file to fill in the fields in the backend scope like this:
main.tf:
provider "azurerm" {
version = "=1.5.0"
}
terraform {
backend "azurerm" {
storage_account_name = "${var.sa_name}"
container_name = "${var.c_name}"
key = "${var.key}"
access_key = "${var.access_key}"
}
}
Variables stored in variables.tf like this:
variable "sa_name" {
default = "myStorageAccount"
}
variable "c_name" {
default = "tfstate"
}
variable "key" {
default = "codelab.microsoft.tfstate"
}
variable "access_key" {
default = "weoghwoep489ug40gu ... "
}
I got this when running terraform init:
terraform.backend: configuration cannot contain interpolations
The backend configuration is loaded by Terraform extremely early,
before the core of Terraform can be initialized. This is necessary
because the backend dictates the behavior of that core. The core is
what handles interpolation processing. Because of this, interpolations
cannot be used in backend configuration.
If you'd like to parameterize backend configuration, we recommend
using partial configuration with the "-backend-config" flag to
"terraform init".
Is there a way of solving this? I really want all my keys/secrets in the same file... and not one key in the main which I preferably want to push to git.

Terraform doesn't care much about filenames: it just loads all .tf files in the current directory and processes them. Names like main.tf, variables.tf, and outputs.tf are useful conventions to make it easier for developers to navigate the code, but they won't have much impact on Terraform's behavior.
The reason you're seeing the error is that you're trying to use variables in a backend configuration. Unfortunately, Terraform does not allow any interpolation (any ${...}) in backends. Quoting from the documentation:
Only one backend may be specified and the configuration may not contain interpolations. Terraform will validate this.
So, you have to either hard-code all the values in your backend, or provide a partial configuration and fill in the rest of the configuration via CLI params using an outside tool (e.g., Terragrunt).
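A sketch of the partial configuration route, reusing the values from the question. The file name backend.hcl is an arbitrary choice, and the idea is to keep that file out of git:

```hcl
# main.tf: only the non-secret settings are hard-coded.
terraform {
  backend "azurerm" {
    container_name = "tfstate"
    key            = "codelab.microsoft.tfstate"
  }
}

# backend.hcl (hypothetical file name, not committed): the remaining settings.
# These lines live in a separate file, not inside main.tf.
# storage_account_name = "myStorageAccount"
# access_key           = "weoghwoep489ug40gu ... "
```

The remaining values are then supplied at init time with terraform init -backend-config=backend.hcl, which is the flag the error message itself recommends.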

There are some important limitations on backend configuration:
A configuration can only provide one backend block.
A backend block cannot refer to named values (like input variables, locals, or data source attributes).
The backend configuration documentation is available at:
https://www.terraform.io/docs/configuration/backend.html

How to manage terraform for multiple repos

I have 2 repos for my project. A Static website and server. I want the website to be hosted by cloudfront and s3 and the server on elasticbeanstalk. I know these resources will need to know about a route53 resource at least to be under the same domain name for cors to work. Among other things such as vpcs and stuff.
So my question is how do I manage terraform with multiple repos.
I'm thinking I could have a separate infrastructure repo that builds for all repos.
I could also keep them separate and pass in the ARNs/names/IDs as variables (annoying).

You can use the terraform_remote_state data source for this. It lets you read the output variables from another Terraform state file.
Let's assume you save your state files remotely on S3 and you have a website.tfstate and a server.tfstate file. You could output the ID of your Route 53 hosted zone as hosted_zone_id in your website configuration and then reference that output variable directly in your server Terraform code.
# Terraform 0.12+ syntax: config is an argument, and outputs are read via .outputs
data "terraform_remote_state" "website" {
  backend = "s3"
  config = {
    bucket = "<website_state_bucket>"
    region = "<website_bucket_region>"
    key    = "website.tfstate"
  }
}
resource "aws_route53_record" "www" {
  zone_id = data.terraform_remote_state.website.outputs.hosted_zone_id
  name    = "www.example.com"
  type    = "A"
  ttl     = 300
  records = [aws_eip.lb.public_ip]
}
Note that you can only read output variables from remote states. You cannot access resources directly, as Terraform treats other states/modules as black boxes.
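For the server configuration to read hosted_zone_id, the website configuration has to publish it. A minimal sketch, assuming the zone is managed there as a resource named aws_route53_zone.main (the resource name is hypothetical):

```hcl
# In the website configuration: publish the zone ID so other states can read it.
output "hosted_zone_id" {
  value = aws_route53_zone.main.zone_id
}
```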
Update
As mentioned in the comments, terraform_remote_state is a simple way to share explicitly published variables across multiple states. However, it comes with 2 issues:
Close coupling between code components, i.e., producer of the variable cannot change easily.
It can only be used by terraform, i.e., you cannot easily share those variables across different layers. Configuration tools such as Ansible cannot use .tfstate natively without some additional custom plugin/wrapper.
The recommended HashiCorp way is to use a central config store such as Consul. It comes with more benefits:
Consumer is decoupled from the variable producer.
Explicit publishing of variables (like in terraform_remote_state).
Can be used by other tools.
A more detailed explanation can be found here.
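One possible shape of the Consul approach, using the consul_keys resource and data source from the hashicorp/consul provider. The key path, the resource names, and the aws_route53_zone.main reference are assumptions for illustration, and a configured consul provider is assumed in both configurations:

```hcl
# Producer configuration: publish the value to Consul's key/value store.
resource "consul_keys" "website" {
  key {
    path  = "terraform/website/hosted_zone_id" # hypothetical key path
    value = aws_route53_zone.main.zone_id      # hypothetical resource name
  }
}

# Consumer configuration (a separate state): read the value back.
data "consul_keys" "website" {
  key {
    name = "hosted_zone_id"
    path = "terraform/website/hosted_zone_id"
  }
}

locals {
  hosted_zone_id = data.consul_keys.website.var.hosted_zone_id
}
```

Because the value lives in Consul rather than in a .tfstate file, other tools can read the same key without parsing Terraform state.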

An approach I've used in the past is to have a single repo for all of the infrastructure.
An alternative is to have two separate Terraform configurations, each using remote state. Config 1 can use output variables to publish any ARNs/IDs as necessary.
Config 2 can then use a terraform_remote_state data source to query for the relevant ARNs/IDs.
E.g.
# Declare remote state (Terraform 0.12+ syntax)
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "network/terraform.tfstate"
    region = "us-east-1"
  }
}
You can then reference output values through the outputs attribute:
data.terraform_remote_state.network.outputs.some_id
