Generate secure random bytes in Terraform? - terraform

I know that I can generate random bytes in Terraform easily enough:
resource "random_id" "foo" {
byte_length = 32
}
resource "something_else" "foo" {
secret = sensitive(random_id.foo.b64std)
}
However, the output of the random_id resource is not marked sensitive, and the value is exposed in logs as the resource id, even though its use in something_else is redacted by the sensitive() function.
I know that random_password is treated as secure, but it doesn't provide the ability to generate raw random bytes.
Is there a good way to generate a secure bunch of random bytes as a Terraform-managed resource?
(I'm aware that the value will always be visible in the state file, but we manage that already. I'm worried about output log files that will be much more widely visible.)
EDIT: I found a request to mark random_id secure but the idea was rejected as outside the intended use.

Why don't you use the base64encode function instead? Unfortunately Terraform doesn't support this yet (see the link I posted below). Something like this might give you some ideas you could use:
resource "random_string" "this" {
length = 32
}
output "random_string_output" {
value = "'${base64encode(random_string.this.result)}'"
}
This GitHub issue talks about this limitation/workaround quite a bit

There is no such resource or data source in terraform. But you could develop your own custom data source that would produce those "truly random bytes" to your satisfaction.
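For readers on newer releases of the hashicorp/random provider: as far as I know, later versions ship a random_bytes resource meant for exactly this case, with its base64 and hex outputs marked sensitive. A minimal sketch, assuming your provider version includes it (check the provider documentation for your version); something_else is the placeholder resource from the question:

```hcl
resource "random_bytes" "foo" {
  # Number of random bytes to generate.
  length = 32
}

resource "something_else" "foo" {
  # Assumption: random_bytes is available and marks base64 as sensitive,
  # so no sensitive() wrapper should be needed here.
  secret = random_bytes.foo.base64
}
```

As with random_id, the value still ends up in the state file, so the state needs the same protection as before.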

Related

Terraform json/map conversion in dynamic section

I am trying to configure a specific list of users with Terraform in a dynamic section.
First, I have all my users and passwords stored as JSON in Vault, like this:
{
"user1": "longPassword1",
"user2amq": "longPassword2",
"user3": "longPassword3"
}
then I declare the Vault data with
data "vault_kv_secret_v2" "all_clients" {
provider = my.vault.provider
mount = "credentials/aws/amq"
name = "dev/clients"
}
and in a locals section:
locals {
all_clients = tomap(jsondecode(data.vault_kv_secret_v2.all_clients.data.client_list))
}
in my tf file, I declare the dynamic section like this:
dynamic "user" {
for_each = local.all_clients
content {
username = user.key
password = user.value
console_access = "false"
groups = ["users"]
}
}
When I apply my Terraform configuration I get an error:
│ on modules/amq/amq.tf line 66, in resource "aws_mq_broker" "myproject":
│ 66: for_each = local.all_clients
│
│ Cannot use a map of string value in for_each. An iterable collection is
│ required.
I tried many ways to handle such a map, but each attempt ended with an error
(for example, using = instead of : and bypassing jsondecode, or using a map or a list of objects with
"username": "user1"
"password": "pass1"
and so on). I am open to adjusting the JSON to make it work.
Nothing worked, and I am running out of ideas for how to map such a simple thing into Terraform. I have already checked plenty of questions and answers on SO and none of them work for me.
Terraform version 1.3.5
UPDATE:
By just putting an output on the local variable outside my module:
locals {
all_clients = jsondecode(data.vault_kv_secret_v2.all_clients.data.client_list)
}
output all_clients {
value = local.all_clients
}
After I applied the code, the command terraform output -json all_clients showed my JSON structure properly (the same happens if I use = instead of : and skip jsondecode, in which case it is simply displayed as a map).
As the answer says, the issue is related to sensitivity when declaring the loop.
Separately, I had to change my usernames so that they are not email addresses, because those are not supported by Amazon MQ (ActiveMQ), and the password field must be greater than 12 chars (max 250 chars).
I think the problem here is something Terraform doesn't support but isn't explaining well: you can't use a map that is marked as sensitive directly as the for_each expression, because doing so would disclose some information about that sensitive value in a way that Terraform can't hide in the UI. At the very least, it would expose the number of elements.
It seems like in this particular case it's overly conservative to consider the entire map to be sensitive, but neither Vault nor Terraform understand the meaning of your data structure and so are treating the whole thing as sensitive just to make sure nothing gets disclosed accidentally.
Assuming that only the passwords in this result are actually sensitive, I think the best answer would be to specify explicitly what is and is not sensitive using the sensitive and nonsensitive functions to override the very coarse sensitivity the hashicorp/vault provider is generating:
locals {
all_clients = tomap({
for user, password in jsondecode(nonsensitive(data.vault_kv_secret_v2.all_clients.data.client_list)) :
user => sensitive(password)
})
}
Using nonsensitive always requires care because it's overriding Terraform's automatic inference of sensitive values and so if you use it incorrectly you might show sensitive information in the Terraform UI.
In this case I first used nonsensitive on the whole JSON string returned by the vault_kv_secret_v2 data source, which therefore avoids making the jsondecode result wholly sensitive. Then I used the for expression to carefully mark just the passwords as sensitive, so that the final value would -- if assigned to somewhere that causes it to appear in the UI -- appear like this:
tomap({
"user1" = (sensitive value)
"user2amq" = (sensitive value)
"user3" = (sensitive value)
})
Now that the number of elements in the map and the map's keys are no longer sensitive, this value should be compatible with the for_each in a dynamic block just like the one you showed. Because the value of each element is sensitive, Terraform will treat the password argument value in particular as sensitive, while disclosing the usernames that serve as the map keys.
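For reference, a sketch of how the dynamic block might then look. Note that inside a dynamic block the iterator is named after the block label (user here) rather than each, which belongs to resource-level for_each. Only the arguments from the question are shown; the rest of the aws_mq_broker configuration is elided.

```hcl
resource "aws_mq_broker" "myproject" {
  # ... other aws_mq_broker arguments ...

  dynamic "user" {
    for_each = local.all_clients
    content {
      username       = user.key   # map key: not sensitive
      password       = user.value # map value: marked sensitive
      console_access = false
      groups         = ["users"]
    }
  }
}
```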
If possible I would suggest testing this with fake data that isn't really sensitive first, just to make sure that you don't accidentally expose real sensitive data if anything here isn't quite correct.
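One way to try this with fake data, as a sketch: replace the Vault data source with a local value that is deliberately marked sensitive, so the expression sees the same kind of input. The name fake_client_list and the passwords below are made up for illustration.

```hcl
locals {
  # Stand-in for data.vault_kv_secret_v2.all_clients.data.client_list:
  # a JSON string that is sensitive as a whole. All values are fake.
  fake_client_list = sensitive(jsonencode({
    user1 = "not-a-real-password-1"
    user2 = "not-a-real-password-2"
  }))

  # Same projection as in the answer, applied to the fake input.
  all_clients = tomap({
    for user, password in jsondecode(nonsensitive(local.fake_client_list)) :
    user => sensitive(password)
  })
}
```

Evaluating local.all_clients in terraform console should then show whether the keys appear in clear text while each value is redacted, without any real secret being involved.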

How to design a Terraform resource with multiple sensitive attributes

Context: I'm developing a new resource for my TF Provider.
This foo resource has a name and associated config: a list of key value pairs (both sensitive and non-sensitive).
There are three options I've identified:
resource "foo" "option1" {
name = "option1"
config = {
"name" = "option1"
"errors.length" = 3
"tasks.type" = "FOO"
}
config_sensitive = {
"jira.key" = "..."
"credentials.json" = "..."
}
}
resource "foo" "option2" {
name = "option2"
config = {
"name" = "option1"
"errors.length" = 3
"tasks.type" = "FOO"
"jira.key" = "..."
"credentials.json" = "..."
}
}
resource "foo" "option3" {
name = "option3"
config = file("config.json")
}
The advantage of option #3 is that it looks very readable, but it requires the user to store an extra JSON file (with secrets) in the same folder, and I'm not sure how acceptable that setup is. Reading the whole config from a file is probably not ideal either, since an engineer can't see what the config looks like without opening another file. Option #2 looks tempting, but foo should accept updates, and if we mark the whole block as sensitive (since it may contain secret key-value pairs), the update experience will suffer: the user won't see the expected change. So option #1 is the winner in my eyes, since it's the most explicit one and lets us distinguish between sensitive and non-sensitive attributes while still allowing visible updates to the non-sensitive ones.
There's also this weird duplicated name attribute but let's ignore it for now.
What configuration is the most acceptable and used by other TF Providers?
Option #3 should be struck immediately for three reasons:
You cannot reliably use the sensitive flag in the schema struct like you can with 1 and 2.
It requires a JSON format value which is cumbersome to work with unless you are forced into it (e.g. security policies).
Someone could inline the JSON and not store it in a file, which would completely work around your attempt to obscure the secrets.
Options 1 and 2 are honestly no different from a secrets management perspective. You could apply the sensitive flag to either in the nested schema struct on a per-attribute basis, and use e.g. Vault to pass in values on a KV basis for either.
I would opt for 1 over 2 simply because it appears to me from your question that the arguments and values in the two blocks have no relationship with each other. Therefore, it makes more sense to organize your schema into two separate blocks for code cleanliness purposes.
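As a sketch of what option 1 might look like from the user's side, assuming the provider marks config_sensitive as sensitive in its schema. The foo resource and its attributes come from the question; the variable name jira_key is made up for illustration.

```hcl
variable "jira_key" {
  type      = string
  sensitive = true # keeps the value out of plan and apply output
}

resource "foo" "example" {
  name = "example"

  # Non-sensitive settings: changes show up in the plan as usual.
  config = {
    "errors.length" = 3
    "tasks.type"    = "FOO"
  }

  # Sensitive settings: supplied from a sensitive variable rather than inline.
  config_sensitive = {
    "jira.key" = var.jira_key
  }
}
```

The value for var.jira_key could then come from Vault, an environment variable, or whatever secret store is already in use.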
I will also mention that if it is possible to move the credentials.json handling into your provider configuration, and to leverage the JIRA provider for the jira.key, then that would be best practice for both code architecture and security. It is also how the major providers handle this situation.
Terraform providers should handle the credential/auth implementation and the resource handles the resource configuration.
e.g.
resource "jira_issue" "some_story" {
title = "My story"
type = "story"
labels = ["someexampleonstackoverflow","jakewashere"]
}
Notice there's no config that doesn't relate to the thing I'm creating inside the Terraform resource.
It's very acceptable to have some documented convention in your provider that reads credentials from somewhere, whether that's an OS variable, file on disk etc.
For example, the Google Cloud provider will read an environment variable if it's populated; if not, it will attempt to read either a configuration file that sits inside a hidden directory within $HOME or a localhost HTTP metadata server for the credentials.

How to use the sops provider with Terraform using an array instead of a single value

I'm pretty new to Terraform. I'm trying to use the sops provider plugin for encrypting secrets from a yaml file:
Sops Provider
I need to create a Terraform user object for a later provisioning stage like this example:
users = [{
name = "user123"
password = "password12"
}]
I've prepared a secrets.values.enc.yaml file for storing my secret data:
yaml_users:
- name: user123
password: password12
I've encrypted the file using "sops" command. I can decrypt the file successfully for testing purposes.
Now I try to use the encrypted file in Terraform for creating the user object:
data "sops_file" "test-secret" {
source_file = "secrets.values.enc.yaml"
}
# user data decryption
users = yamldecode(data.sops_file.test-secret.raw).yaml_users
Unfortunately I cannot debug the data or the structure of "users" as Terraform doesn't display sensitive data. When I try to use that users variable for the later provisioning stage, it doesn't seem to be what is needed:
Cannot use a set of map of string value in for_each. An iterable
collection is required.
When I do the same thing with the unencrypted yaml file everything seems to be working fine:
users = yamldecode(file("secrets.values.dec.yaml")).yaml_users
It looks like the sops provider decryption doesn't create an array or that "iterable collection" that I need.
Does anyone know how to use the terraform sops provider for decrypting an array of key-value pairs? A single value like "adminpassword" is working fine.
I think the "set of map of string" part of this error message is the important part: for_each requires either a map directly (in which case the map keys become the instance identifiers) or a set of individual strings (in which case those strings become the instance identifiers).
Your example YAML file shows yaml_users being defined as a YAML sequence of maps, which corresponds to a tuple of objects on conversion with yamldecode.
To use that data structure with for_each you'll need to first project it into a map whose keys will serve as the unique identifier for each instance of the resource. Assuming that the name values are suitably unique, you could project it so that those values are the keys:
data "sops_file" "test-secret" {
source_file = "secrets.values.enc.yaml"
}
locals {
users = tomap({
for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
u.name => u
})
}
The result being a sensitive value adds an extra wrinkle here, because Terraform won't allow using a sensitive value as the identifier for an instance of a resource -- to do so would make it impossible to show the resource instance address in the UI, and impossible to describe the instance on the command line for commands that need that.
However, this does seem like exactly the use-case shown in the example of the nonsensitive function at the time I'm writing this: you have a collection that is currently wholly marked as sensitive, but you know that only parts of it are actually sensitive and so you can use nonsensitive to explain to Terraform how to separate the nonsensitive parts from the sensitive parts. Here's an updated version of the locals block in my previous example using that function:
locals {
users = tomap({
for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
nonsensitive(u.name) => u
})
}
If I'm making a correct assumption that it's only the passwords that are sensitive and that the usernames are okay to disclose, the above will produce a suitable data structure where the usernames are visible in the keys but the individual element values will still be marked as sensitive.
local.users then meets all of the expectations of resource for_each, and so you should be able to use it with whichever other resources you need to repeat systematically for each user.
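A sketch of how this might be consumed, with one variation: if the for expression above still yields a value that Terraform treats as wholly sensitive, an alternative is to unwrap the whole decrypted document first and then re-mark only the passwords. This assumes the user names are safe to disclose. The resource type example_user is hypothetical and stands in for whatever resource needs one instance per user.

```hcl
locals {
  # Unwrap the decrypted YAML as a whole, then mark only the passwords
  # as sensitive again. Assumption: names are not secret, passwords are.
  users = tomap({
    for u in yamldecode(nonsensitive(data.sops_file.test-secret.raw)).yaml_users :
    u.name => {
      name     = u.name
      password = sensitive(u.password)
    }
  })
}

# Hypothetical resource type, for illustration only.
resource "example_user" "this" {
  for_each = local.users

  name     = each.key            # visible in resource addresses and plans
  password = each.value.password # redacted in plans
}
```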
Please note that Terraform's tracking of sensitive values is for UI purposes only and will not prevent these passwords from being saved in the state as a part of whichever resources make use of them. If you use Terraform to manage sensitive data then you should treat the resulting state snapshots as sensitive artifacts in their own right, being careful about where and how you store them.

What problem does the keepers map for the random provider solve?

I am trying to understand the use case for the keepers feature of the Terraform random provider. I read the docs but it's not clicking for me. What is a concrete example of a situation where the keepers map would be used, and why? The example from the docs is reproduced below.
resource "random_id" "server" {
keepers = {
# Generate a new id each time we switch to a new AMI id
ami_id = "${var.ami_id}"
}
byte_length = 8
}
resource "aws_instance" "server" {
tags = {
Name = "web-server ${random_id.server.hex}"
}
# Read the AMI id "through" the random_id resource to ensure that
# both will change together.
ami = "${random_id.server.keepers.ami_id}"
# ... (other aws_instance arguments) ...
}
The keepers are not seeds: they have no influence on which value is generated. They are arbitrary values that Terraform stores alongside the random result, and they let you keep that result stable until something happens that means it should change.
A random resource generates its value once, when it is created, and saves it in the state. If you used it in your server's Name tag as in this example, later runs of terraform plan/terraform apply would reuse the stored value and would not plan a new Name.
That stability is desirable, because while you might want randomness when you first create the server, you probably don't want so much randomness that it constantly changes. That is, once you apply your plan, your infrastructure should remain stable and subsequent plans should generate no changes as long as everything else remains the same.
When it comes time to make changes to this server, such as changing the image it's built from, you may well want the server name to automatically change to a new random value to represent that this is no longer the same server as before. Without keepers that would not happen on its own. Using the AMI ID in the keepers for the random ID means that when your AMI ID changes, the random_id resource is replaced and a new random ID is generated for the server's Name as well.
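The same mechanism applies to the other resources in the random provider. As a sketch, here is a password that is regenerated only when a rotation variable is changed (var.password_rotation is an assumption for illustration, not part of the docs example):

```hcl
variable "password_rotation" {
  type        = string
  description = "Change this value to force a new password."
  default     = "1"
}

resource "random_password" "db" {
  length = 32

  keepers = {
    # A new password is generated only when this value changes.
    rotation = var.password_rotation
  }
}
```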

Terraform plan prints sensitive information

When performing terraform plan, if an azurerm_kubernetes_cluster (Azure) resource exists in the state, Terraform prints some information from kube_config that seems sensitive.
Example printout: (all ... values get printed)
kube_config = [
{
client_certificate = (...)
client_key = (...)
cluster_ca_certificate = (...)
host = (...)
password = (...)
}
I'm not exactly sure WHICH of those values are sensitive, but password probably is...right?
On the other hand, terraform does seem to have some knowledge of which values are sensitive, as it does print the client_secret this way:
service_principal {
client_id = "(...)"
client_secret = (sensitive value)
}
So, my questions would be:
Are those values actually sensitive?
If so, is there a way to instruct terraform to mask those values in the plan?
Versions we are using:
provider "azurerm" {
version = "~>1.37.0"
}
The reason why this is problematic is that we pipe the plan in a Github PR comment.
Thanks
Are those values actually sensitive?
Yes, they are sensitive data. They are the configuration you need in order to control the AKS cluster, that is, the AKS credentials. I think it's necessary for Terraform to expose this data: suppose you only have Terraform and use it to create an AKS cluster. If Terraform did not output the credentials, you could not control your AKS cluster.
If so, is there a way to instruct terraform to mask those values in the plan?
Given the explanation above, you should not worry about the sensitive data in the Terraform state file. What you need to care about is how to protect the state file. I suggest you store the Terraform state file in Azure Storage, where you can encrypt it. Follow the steps in Store Terraform state in Azure Storage.
Terraform now offers the ability to set variables as sensitive, and outputs as sensitive.
variable example:
variable "user_information" {
type = object({
name = string
address = string
})
sensitive = true
}
output example:
output "db_password" {
value = aws_db_instance.db.password
description = "The password for logging in to the database."
sensitive = true
}
However, as of July 1, 2021 there is no option to hide plan output for something that isn't derived from a sensitive input.
References:
https://www.hashicorp.com/blog/terraform-0-14-adds-the-ability-to-redact-sensitive-values-in-console-output
https://www.terraform.io/docs/language/values/outputs.html
