Retrieve superior hash-key name in Hiera - puppet

Hello, I am building a data structure in Hiera / Puppet for creating MySQL config files. My goal is to have some default values which can be overridden with a merge. It works up to this point.
Because we have different MySQL instances on many hosts, I want to automatically configure some paths to be unique for every instance. I have the instance name as a key in a hash of hashes in the namespace our_mysql::configure_db::dbs:
In my case I want to look up the instance names like 'sales_db' or 'hr_db' in paths like datadir, but I cannot find a way to look up the superior key name.
Hiera data from the "our_mysql" module represents some default values:
our_mysql::configure_db::dbs:
  'defaults':
    datadir: /var/lib/mysql/"%{lookup('lookup to superior hash-key name')}"
    log_error: /var/log/mysql/"%{lookup('lookup to superior hash-key name')}".log
    logbindir: /var/lib/mysql/"%{lookup('lookup to superior hash-key name')}"
    db_port: 3306
    ...: ...
    KEY_N: VALUE_N
Hiera data from the node definition:
our_mysql::configure_db::dbs:
  'sales_db':
    db_port: "3317"
    innodb_buffer_pool_size: "1"
    innodb_log_file_size: 1GB
    innodb_log_files_in_group: "2"
    server_id: "1"
  'hr_db':
    db_port: "3307"
I know how to do simple lookups or to iterate with
.each | String $key, Hash $value | { ... }
but I have no clue how to reference a key from a certain hierarchy level. Searching all related topics on Puppet and Hiera didn't help.
Is it possible in any way, and if so, how?

As I understand the question, I think what you hope to achieve is that, for example, when you look up our_mysql::configure_db::dbs.sales_db key, you get a merge of the data for that (sub)key and those for the our_mysql::configure_db::dbs.defaults subkey, AND that the various %{lookup ...} tokens in the latter somehow resolve to the string sales_db.
I'm afraid that's not going to happen. The interpolation tokens don't even factor in here -- Hiera simply won't perform such a merge at all. I guess you have a hash-merge lookup in mind, but that merges only identical keys and subkeys, so not our_mysql::configure_db::dbs.sales_db and our_mysql::configure_db::dbs.defaults. Hiera provides for defaults for particular keys in the form of data recorded for those specific keys at a low-priority level of the data hierarchy. The "defaults" subkey you present, on the other hand, has no special meaning to the standard Hiera data providers.
You can still address this problem, just not entirely within the data. For example, consider this:
$dbs = lookup('our_mysql::configure_db::dbs', Hash, 'deep')
$dbs.filter |$dbname, $dbparams| { $dbname != 'defaults' }.each |$dbname, $dbparams| {
  # Declare a database using a suitable resource type. "my_mysql::database" is
  # a dummy resource name for the purposes of this example only
  my_mysql::database {
    $dbname:
      * => $dbparams;
    default:
      datadir   => "/var/lib/mysql/${dbname}",
      log_error => "/var/log/mysql/${dbname}.log",
      logbindir => "/var/lib/mysql/${dbname}",
      *         => $dbs['defaults'];
  }
}
That supposes data of the form presented in the question, and it uses the data from the defaults subkey where those do not require knowledge of the specific DB name, but it puts the patterns for the various directory names into the resource declaration instead of into the data. The most important things to recognize are the use of the splat (*) attribute for obtaining multiple parameters from a hash, and the use of per-expression resource attribute defaults via the default keyword in a resource declaration.
If you wanted to do so, you could push more details of the directory names back into the data with a little more effort (and one or more new keys).

Related

Is DiffSuppressFunc or being more restrictive when saving to TF state preferable in Terraform SDKv2?

context: I'm adding a new resource to TF Provider (using SDKv2) with roughly the following schema:
resource "player" "football" {
  type = "FOOTBALL"
  ...
  config = {
    "dribbling" = "50"
    "speed" = "90"
    "position" = "GOALKEEPER"
  }
}
that I represent as:
"config": {
    Type: schema.TypeMap,
    Elem: &schema.Schema{
        Type: schema.TypeString,
    },
    Required: true,
    ForceNew: true,
},
The important detail here is that different player types require different sets of attributes (dribbling, speed, position for football; height, can_dunk, arm_span for basketball). All players share the same API endpoint, so I introduced just one resource to cover them all.
I'd like to support importing players, but the READ response includes a bunch of fields that are optional on create (and I suspect most users won't have them in their Terraform configuration file). As a result, I get a state difference when saving the whole config like this:
d.Set("config", player.GetConfig()) // GetConfig includes a bunch of new attributes (optional on create, or even computed)
So I've got a question: which of the following two options is preferable?
1. Implement DiffSuppressFunc for the config attribute and ignore these optional fields there (the downside is an implicit drift between main.tf and the TF state file).
2. Be more restrictive when writing config to the TF state file, that is, instead of
d.Set("config", player.GetConfig())
write
// filtered config will match config in main.tf
filteredConfig := ...
d.Set("config", filteredConfig)
In some other Terraform providers that deal with similar situations (where a particular argument has a mixture of configuration-provided and remote-system-provided nested values), the resource type implementation takes a compromise position of effectively exposing the same data in two different attributes, where one of them represents what the user configured and the other represents the full data returned by the remote system. For example, you might have config to be set in the configuration, and expanded_config representing the full set of elements the server decided.
There is a challenge with that approach in that you'll probably need a special rule in your Read function to somehow decide if a change you detect in the remote system constitutes "drift" relative to the configuration or if it's just an additional element added by the server.
From what you described it seems like the rule could be that any key that's present in config in the prior state (that is, the values visible to d.Get inside Read before you call d.Set) would have its value overwritten by what the server returned, but any keys that were not present before are ignored entirely. This would create the effect then that any key the author specified in the configuration is considered "managed by Terraform" while any other key is only read by Terraform and not directly managed.
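The rule described above can be sketched in plain Go, independent of the SDK. This is only an illustration under stated assumptions: mergeManagedKeys is a hypothetical helper name, and both maps are assumed to have already been converted to map[string]string (in a real Read function the prior value from d.Get("config") would arrive as map[string]interface{} and need converting first).

```go
package main

import "fmt"

// mergeManagedKeys (hypothetical helper) builds the map to store in state:
// every key that was present in the prior state takes the value reported by
// the server, and keys the server added on its own are dropped.
func mergeManagedKeys(prior, remote map[string]string) map[string]string {
	result := make(map[string]string, len(prior))
	for key := range prior {
		if value, ok := remote[key]; ok {
			result[key] = value
		}
		// A key the server no longer reports is left out, so the next
		// plan will show it as needing to be added again.
	}
	return result
}

func main() {
	// prior: what the previous apply stored; remote: what the API returned.
	prior := map[string]string{"dribbling": "50", "speed": "90"}
	remote := map[string]string{"dribbling": "55", "speed": "90", "stamina": "70"}
	fmt.Println(mergeManagedKeys(prior, remote))
}
```

Here the server-side change to dribbling is picked up as drift, while the server-added stamina key never reaches the state. The full server response could still be written to a second, computed attribute if users need to see it.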
If you adopt that strategy then it's worth keeping in mind what will happen in a situation where the user has changed the configuration to include a new key or to remove a previously-present key. The Read operation is in terms of the previous state rather than the configuration, so that function will see the keys that were present at the end of the last apply, not the keys currently present in the configuration. In particular this means that if an author adds a new key that the server was already tracking then it will appear in the subsequent plan as being added, even though it might technically be more appropriate to show it as an in-place update ~ or a no-op. This is an example of the compromises we sometimes need to make in order to adapt remote APIs to fit within Terraform's model of resource instances.

How to design a Terraform resource with multiple sensitive attributes

Context: I'm developing a new resource for my TF Provider.
This foo resource has a name and associated config: a list of key value pairs (both sensitive and non-sensitive).
There're 3 options I've identified:
resource "foo" "option1" {
  name = "option1"
  config = {
    "name" = "option1"
    "errors.length" = 3
    "tasks.type" = "FOO"
  }
  config_sensitive = {
    "jira.key" = "..."
    "credentials.json" = "..."
  }
}
resource "foo" "option2" {
  name = "option2"
  config = {
    "name" = "option1"
    "errors.length" = 3
    "tasks.type" = "FOO"
    "jira.key" = "..."
    "credentials.json" = "..."
  }
}
resource "foo" "option3" {
  name = "option3"
  config = file("config.json")
}
The advantage of option #3 is that it looks very readable, but it requires the user to store an extra JSON file (with secrets) in the same folder (I'm not sure how acceptable that setup is). Option #2 looks tempting, but foo should accept updates, and if we mark the whole block as sensitive (since it may contain secret key-value pairs), the update functionality will suffer (the user won't see the expected change). So option #1 is the winner in my eyes, since it's the most explicit one and allows us to distinguish between sensitive and non-sensitive attributes (while allowing updates for non-sensitive ones). Reading the whole config from a file is probably not ideal, since it doesn't let an engineer see what the config looks like without opening another file.
There's also this weird duplicated name attribute but let's ignore it for now.
What configuration is the most acceptable and used by other TF Providers?
Option #3 should be struck immediately for three reasons:
1. You cannot reliably use the sensitive flag in the schema struct like you can with 1 and 2.
2. It requires a JSON format value, which is cumbersome to work with unless you are forced into it (e.g. by security policies).
3. Someone could inline the JSON and not store it in a file, which would completely work around your attempt to obscure the secrets.
Options 1 and 2 are honestly no different from a secrets management perspective. You could apply the sensitive flag to either in the nested schema struct on a per-attribute basis, and use e.g. Vault to pass in values on a KV basis for either.
I would opt for 1 over 2 simply because it appears to me from your question that the arguments and values in the two blocks have no relationship with each other. Therefore, it makes more sense to organize your schema into two separate blocks for code cleanliness purposes.
I will also mention that if it is possible to refactor the credentials.json into your provider, and leverage the JIRA provider for the jira.key, then that would be best practice from both a code architecture and a security standpoint. It is also how the major providers handle this situation.
Terraform providers should handle the credential/auth implementation and the resource handles the resource configuration.
e.g.
resource "jira_issue" "some_story" {
  title = "My story"
  type = "story"
  labels = ["someexampleonstackoverflow","jakewashere"]
}
Notice there's no config that doesn't relate to the thing I'm creating inside the Terraform resource.
It's very acceptable to have some documented convention in your provider that reads credentials from somewhere, whether that's an OS variable, file on disk etc.
For example, the Google Cloud provider will read an environment variable if it's populated; if not, it'll attempt to read either a configuration file that sits inside a hidden directory within $HOME or a localhost HTTP metadata server for the credentials.
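A documented lookup order of that kind might be sketched in Go roughly as follows. Everything named here is an assumption made for illustration: the FOO_CREDENTIALS variable, the ~/.foo/credentials.json path, and the resolveCredentials helper are hypothetical and do not describe any real provider.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// resolveCredentials returns the first credentials it finds: the value of the
// (hypothetical) FOO_CREDENTIALS environment variable, or else the contents of
// the file at configPath. getenv and readFile are parameters so the lookup
// order can be exercised without touching the real environment.
func resolveCredentials(
	getenv func(string) string,
	readFile func(string) ([]byte, error),
	configPath string,
) (string, error) {
	if v := getenv("FOO_CREDENTIALS"); v != "" {
		return v, nil
	}
	data, err := readFile(configPath)
	if err != nil {
		return "", errors.New("no credentials in FOO_CREDENTIALS or " + configPath)
	}
	return string(data), nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	// Hypothetical location; a real provider would document its own.
	path := filepath.Join(home, ".foo", "credentials.json")
	creds, err := resolveCredentials(os.Getenv, os.ReadFile, path)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("loaded credentials of length", len(creds))
}
```

With credentials resolved once at the provider level like this, individual resources carry no secret arguments at all.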

How to use the sops provider with Terraform using an array instead of a single value

I'm pretty new to Terraform. I'm trying to use the sops provider plugin for encrypting secrets from a yaml file:
Sops Provider
I need to create a Terraform user object for a later provisioning stage like this example:
users = [{
  name = "user123"
  password = "password12"
}]
I've prepared a secrets.values.enc.yaml file for storing my secret data:
yaml_users:
  - name: user123
    password: password12
I've encrypted the file using "sops" command. I can decrypt the file successfully for testing purposes.
Now I try to use the encrypted file in Terraform for creating the user object:
data "sops_file" "test-secret" {
  source_file = "secrets.values.enc.yaml"
}
# user data decryption
users = yamldecode(data.sops_file.test-secret.raw).yaml_users
Unfortunately I cannot debug the data or the structure of "users", as Terraform doesn't display sensitive data. When I try to use that users variable for the later provisioning stage, it doesn't seem to be what is needed:
Cannot use a set of map of string value in for_each. An iterable
collection is required.
When I do the same thing with the unencrypted yaml file everything seems to be working fine:
users = yamldecode(file("secrets.values.dec.yaml")).yaml_users
It looks like the sops provider decryption doesn't create an array or that "iterable collection" that I need.
Does anyone know how to use the terraform sops provider for decrypting an array of key-value pairs? A single value like "adminpassword" is working fine.
I think the "set of map of string" part of this error message is the important part: for_each requires either a map directly (in which case the map keys become the instance identifiers) or a set of individual strings (in which case those strings become the instance identifiers).
Your example YAML file shows yaml_users being defined as a YAML sequence of maps, which corresponds to a tuple of objects on conversion with yamldecode.
To use that data structure with for_each you'll need to first project it into a map whose keys will serve as the unique identifier for each instance of the resource. Assuming that the name values are suitably unique, you could project it so that those values are the keys:
data "sops_file" "test-secret" {
  source_file = "secrets.values.enc.yaml"
}
locals {
  users = tomap({
    for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
    u.name => u
  })
}
The result being a sensitive value adds an extra wrinkle here, because Terraform won't allow using a sensitive value as the identifier for an instance of a resource -- to do so would make it impossible to show the resource instance address in the UI, and impossible to describe the instance on the command line for commands that need that.
However, this does seem like exactly the use-case shown in the example of the nonsensitive function at the time I'm writing this: you have a collection that is currently wholly marked as sensitive, but you know that only parts of it are actually sensitive and so you can use nonsensitive to explain to Terraform how to separate the nonsensitive parts from the sensitive parts. Here's an updated version of the locals block in my previous example using that function:
locals {
  users = tomap({
    for u in yamldecode(data.sops_file.test-secret.raw).yaml_users :
    nonsensitive(u.name) => u
  })
}
If I'm making a correct assumption that it's only the passwords that are sensitive and that the usernames are okay to disclose, the above will produce a suitable data structure where the usernames are visible in the keys but the individual element values will still be marked as sensitive.
local.users then meets all of the expectations of resource for_each, and so you should be able to use it with whichever other resources you need to repeat systematically for each user.
Please note that Terraform's tracking of sensitive values is for UI purposes only and will not prevent these passwords from being saved in the state as a part of whichever resources make use of them. If you use Terraform to manage sensitive data then you should treat the resulting state snapshots as sensitive artifacts in their own right, being careful about where and how you store them.

terraform to append consul_key values in json

I have a project in which I have to use Terraform, and at the end of the Terraform run I need to append values to a Consul key stored at a given path. I have the following:
resource "consul_keys" "write" {
  datacenter = "dc1"
  token = "xxxx-x-x---xxxxxx--xx-x-x-x"
  key {
    path = "path/to/name"
    value = jsonencode([
      {
        cluster_name = "test", "region" : "us-east1"
      },
      {
        cluster_name = "test2", "region" : "us-central1"
      }
    ])
  }
}
But if I run Terraform again with new values, it deletes all the previous values and writes only the new ones.
Is there any way to keep appending values while leaving the previous ones as they are?
The consul_keys resource type in the hashicorp/consul provider only supports situations where it is responsible for managing the entirety of the value of each of the given keys. This is because the underlying Consul API itself treats each key as a single atomic unit, and doesn't support partial updates of the sort you want to achieve here.
If you are able to change the design of the system that is consuming these values, a different way to get a comparable result would be to set aside a particular key prefix as a collection of values that the consumer will merge together after reading them. Consul's Read Key API includes a mode recurse=true which allows you to provide a prefix to read all of the entries with a given prefix in a single request.
By designing your key structure this way, you can use separate keys for the data that Terraform will provide and for the data provided by each other system that generates data under this shared prefix. Each of these systems can therefore maintain its own designated sub-key without taking any special extra steps to preserve existing data already stored at that location.
If you are using consul-template then consul-template's ls function wraps the multi-key lookup I described above.
If you are reading the data from Consul in some other Terraform configuration, the consul_key_prefix data source similarly implements the operation of fetching all key/value pairs under a given prefix.
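For a consumer written in Go, the merge step suggested above might look roughly like the following sketch. It deliberately leaves out the Consul client: the entries map is an assumption standing in for whatever a recursive read of the prefix returned, the sub-key names path/to/name/terraform and path/to/name/other are hypothetical, and mergeClusters is an illustrative helper rather than part of any library.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// Cluster mirrors the objects in the JSON value shown in the question.
type Cluster struct {
	ClusterName string `json:"cluster_name"`
	Region      string `json:"region"`
}

// mergeClusters (hypothetical helper) combines the JSON arrays stored under
// each sub-key of a shared prefix into a single list. entries maps the full
// key path to the raw JSON value stored at that key.
func mergeClusters(entries map[string]string) ([]Cluster, error) {
	keys := make([]string, 0, len(entries))
	for k := range entries {
		keys = append(keys, k)
	}
	sort.Strings(keys) // iterate in a stable order so the result is deterministic

	var merged []Cluster
	for _, k := range keys {
		var part []Cluster
		if err := json.Unmarshal([]byte(entries[k]), &part); err != nil {
			return nil, fmt.Errorf("decoding %s: %w", k, err)
		}
		merged = append(merged, part...)
	}
	return merged, nil
}

func main() {
	// One sub-key written by Terraform, one by some other system.
	entries := map[string]string{
		"path/to/name/terraform": `[{"cluster_name":"test","region":"us-east1"}]`,
		"path/to/name/other":     `[{"cluster_name":"test2","region":"us-central1"}]`,
	}
	merged, err := mergeClusters(entries)
	if err != nil {
		panic(err)
	}
	for _, c := range merged {
		fmt.Println(c.ClusterName, c.Region)
	}
}
```

Because each writer owns exactly one sub-key, Terraform can keep replacing its own value wholesale without disturbing what the other systems stored.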

Overwrite Puppet Class Variables in Manifest

I'm currently using hiera to set all my class parameters for the Puppet forge Gitlab module.
cat hieradata/nodes/example.yaml
---
gitlab::backup_cron_enable: true
gitlab::gitlab_rails:
  backup_keep_time: 604800
  backup_path: /opt/gitlab_backup
  gitlab_default_can_create_group: false
  initial_root_password: foobar
...
cat site/profiles/manifests/gitlab.rb
class profile::gitlab {
  include gitlab
}
This code works as intended but I'd like to redact the password values in the log output and reports.
I tried to use lookup_options to convert the sensitive values, but Puppet still displays the unredacted values.
cat hieradata/nodes/example.yaml
---
lookup_options:
  gitlab::gitlab_rails::initial_root_password:
    convert_to: "Sensitive"
gitlab::backup_cron_enable: true
gitlab::gitlab_rails:
  backup_keep_time: 604800
  backup_path: /opt/gitlab_backup
  gitlab_default_can_create_group: false
  initial_root_password: foobar
...
What is the best way to redact all sensitive values whilst using hiera to define the class parameters?
You need to have the password as a separate key in order for the auto conversion to take effect. The key that is looked up is bound to a hash, and it is not possible to address individual values in a hash with lookup_options (it is the entire hash that is looked up).
You can make an individual value Sensitive by using an alias and binding the password in clear text to a separate key - like this:
cat hieradata/nodes/example.yaml
---
lookup_options:
  gitlab::gitlab_rails::initial_root_password:
    convert_to: "Sensitive"
gitlab::backup_cron_enable: true
gitlab::gitlab_rails:
  backup_keep_time: 604800
  backup_path: /opt/gitlab_backup
  gitlab_default_can_create_group: false
  initial_root_password: '%{alias("gitlab::gitlab_rails::initial_root_password")}'
gitlab::gitlab_rails::initial_root_password: 'foobar'
...
With this approach you could also use EYAML or some other secure hiera backend to store the password in encrypted form. Such a backend may already return decrypted values wrapped in Sensitive - this is for example done by the Vault backend.
However, even if you get past the first hurdle, the result depends on what the gitlab module does with the hash now containing a Sensitive value. If it just passes the value for initial_root_password on it may work, but if it is doing any operation on this value (like checking if it is an empty string for example) it may fail. If you are unlucky it may seem to work but you may end up with the password "redacted" :-). Contact the maintainers of the module if it does not work and request that they support having the password as a Sensitive value instead of a String.
