Terraform: conditionally create a resource based on external data?

As part of a setup, I create TLS certs and store them in S3. Creating the certs is done via an external data source that runs the command to generate the certs. I then use those outputs to create S3 bucket object resources.
This works very well the first time I run terraform apply. However, if I change any other (non-cert) variable, resource, etc. and rerun, it reruns the external command, which generates a new key/cert pair, uploads them to S3, and breaks everything that already works.
Is there any way to create the resource conditionally? What pattern could I use to make the certs created only if they don't exist?
I did look at storing the generated keys/certs locally, but this is sensitive key material; I do not want it stored on local disk (and there are keys per environment).
Key/cert generation and storage:
data "external" "ca" {
program = ["sh","-c","jq '.root|fromjson' | cfssl gencert -initca -"]
#
query = {root = "${ data.template_file.etcd-ca-csr.rendered }"}
# the result will be saved in
# data.external.etcd-ca.result.key
# data.external.etcd-ca.result.csr
# data.external.etcd-ca.result.cert
}
resource "aws_s3_bucket_object" "ca_cert" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca.pem"
content = "${data.external.ca.result.cert}"
}
resource "aws_s3_bucket_object" "ca_key" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca-key.pem"
content = "${data.external.ca.result.key}"
}
Happy to look at using some form of conditional or entirely different generation pattern.

The reason for this behavior is that external is a data source, and thus Terraform expects it to be read-only and side-effect-free. Data sources are re-read on every plan.
To do this via an external script, it would be necessary to use a resource provisioner to run the script and upload the result to S3, since there is currently no external equivalent for resources (which are allowed to have side effects); provisioners are side-effect-only, meaning they can't produce results for use elsewhere in the configuration.
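For illustration only, a minimal sketch of that provisioner approach might look like the following. It assumes cfssl, cfssljson, and the AWS CLI are available on the machine running Terraform; the CSR file and bucket names are placeholders, and note that the generated key material would touch local disk, which you said you want to avoid:
resource "null_resource" "ca" {
  provisioner "local-exec" {
    # Generate the CA locally, then copy the resulting certificate to S3.
    command = "cfssl gencert -initca ca-csr.json | cfssljson -bare ca && aws s3 cp ca.pem s3://my-bucket/ca.pem"
  }
}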
Another approach, though, would be to use Terraform's built-in TLS provider, which allows creation of certificates within Terraform itself. In this case it looks like you're trying to create a new CA cert and key, which could be done with tls_self_signed_cert like this:
resource "tls_private_key" "ca" {
algorithm = "RSA"
rsa_bits = 2048
}
resource "tls_self_signed_cert" "ca" {
key_algorithm = "RSA"
private_key_pem = "${tls_private_key.ca.private_key_pem}"
# ... subject and validity settings, as appropriate
is_ca_certificate = true
allowed_uses = ["cert_signing"]
}
resource "aws_s3_bucket_object" "ca_cert" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca.pem"
content = "${resource.tls_self_signed_cert.ca.cert_pem}"
}
resource "aws_s3_bucket_object" "ca_key" {
bucket = "${aws_s3_bucket.my_bucket.id}"
key = "ca-key.pem"
content = "${resource.tls_self_signed_cert.ca.private_key_pem}"
}
The generated private key will be included in the state for use on future runs, so it's important to ensure that the state is stored securely. Note that this would also be true using the external data source, since data source results are also stored in state. Thus this approach is equivalent from the standpoint of where the secrets get stored.
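For illustration, one common way to keep that state off local disk and encrypted at rest is a remote backend with encryption enabled; this is only a sketch, and the bucket and key names are placeholders:
terraform {
  backend "s3" {
    bucket  = "example-state-bucket"
    key     = "tls/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true # server-side encryption of the state object
  }
}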
I wrote more details about using Terraform for TLS certificate management in an article on my website. Its scope is broader than your requirements here, but may be of some interest.


How to design a Terraform resource with multiple sensitive attributes

Context: I'm developing a new resource for my TF Provider.
This foo resource has a name and an associated config: a list of key-value pairs (both sensitive and non-sensitive).
There are 3 options I've identified:
resource "foo" "option1" {
name = "option1"
config = {
"name" = "option1"
"errors.length" = 3
"tasks.type" = "FOO"
}
config_sensitive = {
"jira.key" = "..."
"credentials.json" = "..."
}
}
resource "foo" "option2" {
name = "option2"
config = {
"name" = "option1"
"errors.length" = 3
"tasks.type" = "FOO"
"jira.key" = "..."
"credentials.json" = "..."
}
}
resource "foo" "option3" {
name = "option3"
config = file("config.json")
}
The advantage of option #3 is that it looks very readable, but it requires the user to store an extra JSON file (with secrets) in the same folder (I'm not sure how acceptable that setup is). Option #2 looks tempting, but foo should accept updates, and if we mark the whole block as sensitive (since it may contain secret key-value pairs), the update functionality will suffer (the user won't see the expected change). So option #1 is the winner in my eyes, since it's the most explicit one and allows us to distinguish between sensitive and non-sensitive attributes (while allowing updates for non-sensitive ones). Reading the whole config from a file is probably not ideal, since it doesn't really let an engineer see what the config looks like without opening another file.
There's also this weird duplicated name attribute but let's ignore it for now.
What configuration is the most acceptable and used by other TF Providers?
Option #3 should be struck immediately, for three reasons:
You cannot reliably use the sensitive flag in the schema struct like you can with 1 and 2.
It requires a JSON-formatted value, which is cumbersome to work with unless you are forced into it (e.g. by security policies).
Someone could inline the JSON and not store it in a file, which would completely work around your attempt to obscure the secrets.
Options 1 and 2 are honestly no different from a secrets management perspective. You could apply the sensitive flag to either in the nested schema struct on a per-attribute basis, and use e.g. Vault to pass in values on a KV basis for either.
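As a hedged sketch of what that could look like on the configuration side, here is option #1 fed from Vault; the foo resource and its attribute names come from your question, while the Vault path and field names are made up for illustration:
data "vault_generic_secret" "foo" {
  path = "secret/foo"
}

resource "foo" "option1" {
  name = "option1"

  config = {
    "errors.length" = 3
    "tasks.type"    = "FOO"
  }

  config_sensitive = {
    "jira.key"         = data.vault_generic_secret.foo.data["jira_key"]
    "credentials.json" = data.vault_generic_secret.foo.data["credentials_json"]
  }
}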
I would opt for 1 over 2 simply because it appears to me from your question that the arguments and values in the two blocks have no relationship with each other. Therefore, it makes more sense to organize your schema into two separate blocks for code cleanliness purposes.
I will also mention that if it is possible to refactor credentials.json into your provider, and to leverage the JIRA provider for jira.key, then that would be best practice for both code architecture and security. It is also how the major providers handle this situation.
Terraform providers should handle the credential/auth implementation and the resource handles the resource configuration.
e.g.
resource "jira_issue" "some_story" {
title = "My story"
type = "story"
labels = ["someexampleonstackoverflow","jakewashere"]
}
Notice there's no config that doesn't relate to the thing I'm creating inside the Terraform resource.
It's very acceptable to have some documented convention in your provider that reads credentials from somewhere, whether that's an environment variable, a file on disk, etc.
For example, the Google Cloud provider will read an environment variable if it's populated; if not, it attempts to read either a configuration file that sits inside a hidden directory within $HOME or a localhost HTTP metadata server for the credentials.
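As a hedged illustration of that convention (the project and region values are placeholders):
provider "google" {
  project = "my-example-project"
  region  = "us-central1"

  # No credentials argument here: the provider falls back to the
  # GOOGLE_APPLICATION_CREDENTIALS environment variable, the gcloud
  # application-default credentials file under $HOME, or the instance
  # metadata server, roughly in that order.
}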

How to use the sops provider with Terraform using an array instead of a single value

I'm pretty new to Terraform. I'm trying to use the sops provider plugin for encrypting secrets from a yaml file:
Sops Provider
I need to create a Terraform user object for a later provisioning stage like this example:
users = [{
  name     = "user123"
  password = "password12"
}]
I've prepared a secrets.values.enc.yaml file for storing my secret data:
yaml_users:
  - name: user123
    password: password12
I've encrypted the file using the sops command. I can decrypt the file successfully for testing purposes.
Now I try to use the encrypted file in Terraform for creating the user object:
data "sops_file" "test-secret" {
source_file = "secrets.values.enc.yaml"
}
# user data decryption
users = yamldecode(data.sops_file.test-secret.raw).yaml_users
Unfortunately I cannot debug the data or the structure of users, as Terraform doesn't display sensitive data. When I try to use that users variable for the later provisioning stage, it doesn't seem to be what is needed:
Cannot use a set of map of string value in for_each. An iterable
collection is required.
When I do the same thing with the unencrypted yaml file everything seems to be working fine:
users = yamldecode(file("secrets.values.dec.yaml")).yaml_users
It looks like the sops provider decryption doesn't create an array or that "iterable collection" that I need.
Does anyone know how to use the terraform sops provider for decrypting an array of key-value pairs? A single value like "adminpassword" is working fine.
I think the "set of map of string" part of this error message is the important part: for_each requires either a map directly (in which case the map keys become the instance identifiers) or a set of individual strings (in which case those strings become the instance identifiers).
Your example YAML file shows yaml_users being defined as a YAML sequence of maps, which corresponds to a tuple of objects on conversion with yamldecode.
To use that data structure with for_each you'll need to first project it into a map whose keys will serve as the unique identifier for each instance of the resource. Assuming that the name values are suitably unique, you could project it so that those values are the keys:
data "sops_file" "test-secret" {
source_file = "secrets.values.enc.yaml"
}
locals {
users = tomap({
for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
u.name => u
})
}
The result being a sensitive value adds an extra wrinkle here, because Terraform won't allow using a sensitive value as the identifier for an instance of a resource -- to do so would make it impossible to show the resource instance address in the UI, and impossible to describe the instance on the command line for commands that need that.
However, this does seem like exactly the use-case shown in the example of the nonsensitive function at the time I'm writing this: you have a collection that is currently wholly marked as sensitive, but you know that only parts of it are actually sensitive and so you can use nonsensitive to explain to Terraform how to separate the nonsensitive parts from the sensitive parts. Here's an updated version of the locals block in my previous example using that function:
locals {
  users = tomap({
    for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
    nonsensitive(u.name) => u
  })
}
If I'm making a correct assumption that it's only the passwords that are sensitive and that the usernames are okay to disclose, the above will produce a suitable data structure where the usernames are visible in the keys but the individual element values will still be marked as sensitive.
local.users then meets all of the expectations of resource for_each, and so you should be able to use it with whichever other resources you need to repeat systematically for each user.
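For example, a hypothetical consuming resource might look like this (the some_provider_user resource type and its arguments are made up for illustration; substitute whatever resource you are actually repeating per user):
resource "some_provider_user" "example" {
  for_each = local.users

  name     = each.key
  password = each.value.password
}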
Please note that Terraform's tracking of sensitive values is for UI purposes only and will not prevent these passwords from being saved in the state as part of whichever resources make use of them. If you use Terraform to manage sensitive data then you should treat the resulting state snapshots as sensitive artifacts in their own right, being careful about where and how you store them.

Terraform: append consul_keys values in JSON

I have a project in which I have to use Terraform, and at the end of the run I need to append values to a Consul key stored at a given path. I have the following:
resource "consul_keys" "write" {
datacenter = "dc1"
token = "xxxx-x-x---xxxxxx--xx-x-x-x"
key {
path = "path/to/name"
value = jsonencode([
{
cluster_name = "test", "region" : "us-east1"
},
{
cluster_name = "test2", "region" : "us-central1"
}
])
}
}
But if I run Terraform again with new values, it deletes all the previous values and writes the new ones.
Is there any way I can keep appending values while keeping the previous values as they are?
The consul_keys resource type in the hashicorp/consul provider only supports situations where it is responsible for managing the entirety of the value of each of the given keys. This is because the underlying Consul API itself treats each key as a single atomic unit, and doesn't support partial updates of the sort you want to achieve here.
If you are able to change the design of the system that is consuming these values, a different way to get a comparable result would be to set aside a particular key prefix as a collection of values that the consumer will merge together after reading them. Consul's Read Key API includes a mode recurse=true which allows you to provide a prefix to read all of the entries with a given prefix in a single request.
By designing your key structure this way, you can use separate keys for the data that Terraform will provide and the data provided by each other system that will generate data under this shared prefix. These different systems can therefore each maintain their own designated sub-key and not need to take any special extra steps to preserve existing data already stored at that location.
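For example, a sketch of Terraform writing only its own sub-key under the shared prefix might look like this (the prefix layout and sub-key name are just illustrative choices):
resource "consul_keys" "write" {
  datacenter = "dc1"

  key {
    # Terraform owns only this sub-key; other systems write their own
    # sub-keys under path/to/name/ and are left untouched.
    path = "path/to/name/terraform"
    value = jsonencode([
      {
        cluster_name = "test"
        region       = "us-east1"
      }
    ])
  }
}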
If you are using consul-template then consul-template's ls function wraps the multi-key lookup I described above.
If you are reading the data from Consul in some other Terraform configuration, the consul_key_prefix data source similarly implements the operation of fetching all key/value pairs under a given prefix.
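If the reader is Terraform, a hedged sketch of that prefix-wide read might look like this; the subkeys attribute and the merging step are assumptions about how you would combine the JSON fragments written by each system:
data "consul_key_prefix" "clusters" {
  datacenter  = "dc1"
  path_prefix = "path/to/name/"
}

locals {
  # Merge every writer's JSON list back into a single list of clusters.
  clusters = flatten([
    for name, raw in data.consul_key_prefix.clusters.subkeys : jsondecode(raw)
  ])
}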

Terraform plan prints sensitive information

When performing terraform plan, if an azurerm_kubernetes_cluster (Azure) resource exists in the state, Terraform will print some information from kube_config which seems sensitive.
Example printout: (all ... values get printed)
kube_config = [
  {
    client_certificate     = (...)
    client_key             = (...)
    cluster_ca_certificate = (...)
    host                   = (...)
    password               = (...)
  },
]
I'm not exactly sure WHICH of those values are sensitive, but password probably is...right?
On the other hand, terraform does seem to have some knowledge of which values are sensitive, as it does print the client_secret this way:
service_principal {
  client_id     = "(...)"
  client_secret = (sensitive value)
}
So, my questions would be:
Are those values actually sensitive?
If so, is there a way to instruct terraform to mask those values in the plan?
Versions we are using:
provider "azurerm" {
version = "~>1.37.0"
}
The reason this is problematic is that we pipe the plan into a GitHub PR comment.
Thanks
Are those values actually sensitive?
Yes, those are sensitive data. They are the configuration you need in order to control the AKS cluster; in effect, they are the AKS credentials. Outputting them is necessary: suppose you only have Terraform and use it to create an AKS cluster; if Terraform did not output the credentials, you could not control your AKS cluster.
If so, is there a way to instruct terraform to mask those values in the plan?
Given the explanation above, you should not worry about the sensitive data being in the Terraform state file; what you need to care about is how to protect the state file itself. I suggest you store the Terraform state file in Azure Storage, where it can be encrypted at rest. Follow the steps in Store Terraform state in Azure Storage.
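For example, a minimal azurerm backend block along those lines might look like this (all names are placeholders; the storage account should be created and locked down beforehand):
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstateexample"
    container_name       = "tfstate"
    key                  = "prod.terraform.tfstate"
  }
}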
Terraform now offers the ability to set variables as sensitive, and outputs as sensitive.
variable example:
variable "user_information" {
type = object({
name = string
address = string
})
sensitive = true
}
output example:
output "db_password" {
value = aws_db_instance.db.password
description = "The password for logging in to the database."
sensitive = true
}
However, as of July 1, 2021 there is no option to hide plan output for something that isn't derived from a sensitive input.
References:
https://www.hashicorp.com/blog/terraform-0-14-adds-the-ability-to-redact-sensitive-values-in-console-output
https://www.terraform.io/docs/language/values/outputs.html

resource output value from one plan into another plan

I have two plans, in which I am creating two different servers (just for example; otherwise it's really complex). In one plan, I am outputting the value of the security group like this:
output "security_group_id" {
value = "${aws_security_group.security_group.id}"
}
I have a second plan in which I want to use that value. How can I achieve it? I have tried a couple of things but nothing worked for me.
I know how to use the output value returned by a module, but I don't know how to use the output of one plan in another.
When an output is used in the top-level module of a configuration (the directory where you run terraform plan) its value is recorded in the Terraform state.
In order to use this value from another configuration, the state must be published to a location where it can be read by the other configuration. The usual way to achieve this is to use Remote State.
With remote state enabled for the first configuration, it becomes possible to read the resulting values from the second configuration using the terraform_remote_state data source.
For example, it's possible to keep the state for the first configuration in Amazon S3 by using a backend configuration like the following:
terraform {
  backend "s3" {
    bucket = "example-s3-bucket"
    key    = "example-bucket-key"
    region = "us-east-1"
  }
}
After adding this to the first configuration, Terraform will prompt you to run terraform init to initialize the new backend, which includes migrating the existing state to be stored on S3.
Then in the second configuration this can be retrieved by providing the same configuration to the terraform_remote_state data source:
data "terraform_remote_state" "example" {
backend = "s3"
config {
bucket = "example-s3-bucket"
key = "example-bucket-key"
region = "us-east-1"
}
}
resource "aws_instance" "foo" {
# ...
vpc_security_group_ids = "${data.terraform_remote_state.example.security_group_id}"
}
Note that since the second configuration is reading the state from the first it is necessary to terraform apply the first configuration so that this value will actually be recorded in the state. The second config must be re-applied any time the outputs are changed in the first.
For the local backend the process is the same. In the first step, we need to declare the following code snippet to publish the state.
terraform {
  backend "local" {
    path = "./terraform.tfstate"
  }
}
When you execute the terraform init and terraform apply commands, observe that a new terraform.tfstate file is created in the .terraform directory; it contains the backend information and tells Terraform which tfstate file to use.
Now, in the second configuration, we need to use a data source to import the outputs, using this code snippet:
data "terraform_remote_state" "test" {
backend = "local"
config {
path = "${path.module}/../regionalvpc/terraform.tfstate"
}
}
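The outputs from the first configuration can then be referenced in the second, for example (assuming the security_group_id output from the question and the same pre-0.12 attribute syntax used above):
resource "aws_instance" "bar" {
  # ...
  vpc_security_group_ids = ["${data.terraform_remote_state.test.security_group_id}"]
}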
