how to extract data from terraform state show - terraform

I would like to extract the data shown by terraform state show. According to the documentation we should use terraform show -json: https://www.terraform.io/docs/cli/commands/state/show.html
The output of terraform state show is intended for human consumption,
not programmatic consumption. To extract state data for use in other
software, use terraform show -json and decode the result using the
documented structure.
I'm not sure how to use terraform state show in conjunction with terraform show:
$ terraform state show 'packet_device.worker'
# packet_device.worker:
resource "packet_device" "worker" {
    billing_cycle = "hourly"
    created       = "2015-12-17T00:06:56Z"
    facility      = "ewr1"
    hostname      = "prod-xyz01"
    id            = "6015bg2b-b8c4-4925-aad2-f0671d5d3b13"
    locked        = false
}

The terraform state show command displays information on a single Terraform resource and does not support the -json flag. The command terraform show dumps the entire state, and does support the -json flag. Unlike the output from terraform state show, the output of terraform show -json is documented and intended for programmatic consumption.
If you want to obtain the info on a particular resource as displayed by terraform state show, you can extract it from the full-state JSON, for example by using jq:
terraform show -json | \
jq '.values.root_module.resources[] | select(.address == "packet_device.worker") | .values'
Whether this makes sense depends on what it is you're trying to do.
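For instance, to pull out a single attribute rather than the whole values map, extend the jq filter with the attribute name. The sketch below mocks the `terraform show -json` structure with a small, abbreviated sample document so that only the jq part is exercised; the real command would be `terraform show -json | jq -r '…'`:

```shell
# Abbreviated stand-in for the JSON that `terraform show -json` emits.
json='{"values":{"root_module":{"resources":[
  {"address":"packet_device.worker",
   "values":{"id":"6015bg2b-b8c4-4925-aad2-f0671d5d3b13","facility":"ewr1"}}]}}}'

# Same filter as above, narrowed to one attribute; -r strips the JSON quotes.
printf '%s' "$json" |
  jq -r '.values.root_module.resources[]
         | select(.address == "packet_device.worker")
         | .values.id'
# → 6015bg2b-b8c4-4925-aad2-f0671d5d3b13
```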

Related

How to assign certain Terraform Variables via TFVARS file, while others via Terraform Cloud Variable sets

I currently have a dev.auto.tfvars file with a few dozen variables and their values for my application, such as:
DB_NUM_RECORDS_PER_EXECUTION = "Something here"
QUEUE_API_KEY                = "Something here"
application                  = "My Application"
Application_Secret_Key       = "Some Secret Key here"
All variables are defined in a variables.tf file, with the sensitive ones assigned a blank value.
I run Terraform in this manner:
terraform apply -var-file=env/dev.auto.tfvars
(I have also qa.auto.tfvars and prod.auto.tfvars for the other environments)
I want to inject certain sensitive values for certain keys, such as Application_Secret_Key, via the Terraform Cloud variable sets so developers don't have to.
I have added Application_Secret_Key in the TF Cloud Variable set.
But when I run the above terraform plan, Terraform Cloud does not inject the value stored in the variable set; instead it assigns the blank value defined in my configuration.
It is my understanding that the *.auto.tfvars files take precedence and overwrite the terraform.tfvars file in Terraform Cloud, hence the sensitive values are blank.
I do not want to add dozens of variables to the TF Cloud variable set, only certain sensitive ones.
Is this possible?
Thanks in advance

I am getting Error: Cycle when running a terraform plan

I was getting the following error while running terraform plan:
Error: Cycle: aws_sagemaker_notebook_instance.mlops_datapipeline_notebookinstance_main, aws_sagemaker_notebook_instance.mlops_datapipeline_notebookinstance_demo, data.aws_iam_policy_document.sagemaker_neptune-access, aws_iam_policy.sagemaker_execution_policy, aws_neptune_cluster.neptune_for_demo, aws_neptune_cluster.neptune_for_main, data.aws_iam_policy_document.neptune-access, aws_iam_policy.neptune_access_policy, aws_iam_role.Neptune_execution_role
I assume you are using AWS because your filename contains "ec2", even though you don't show enough code in your question or provide enough details.
The AWS Terraform provider expects tags to be a map, not a single string. You have enclosed the entire thing in double quotes, converting it into a string. Try this:
tags = merge(var.tags, { Name = format("%s-%d", var.name, count.index + 1) })

Is there any way for external data sources to find out if this is a plan, apply or destroy operation

I have a Python program that is being called by an "External Data Source" explained here:
https://registry.terraform.io/providers/hashicorp/external/latest/docs/data-sources/data_source
The code looks something like this:
data "external" "example" {
  program = ["python", "${path.module}/example-data-source.py"]

  query = {
    # arbitrary map from strings to strings, passed
    # to the external program as the data query.
    id = "abc123"
  }
}
Is there a technique or workaround for my external Python program to know which Terraform operation (plan, apply or destroy) is calling the external data source?
Is there any way for HCL code to find out which operation it is being called under?
Update #1
I understand that we can pass parameters via the query argument.
The question is: is there any way for a Terraform script to know which Terraform operation type it is being called under, so that I can store the operation (apply/destroy) in a Terraform variable and pass it as a query argument to the Python program?
The external data source isn't able to infer what the Terraform operation type is as it is only sent the data in the query argument, as mentioned and shown in the data source's documentation.
As an example, if we look at the example in the documentation:
#!/bin/bash
# Exit if any of the intermediate steps fail
set -e
# Extract "foo" and "baz" arguments from the input into
# FOO and BAZ shell variables.
# jq will ensure that the values are properly quoted
# and escaped for consumption by the shell.
eval "$(jq -r '@sh "FOO=\(.foo) BAZ=\(.baz)"')"
# Placeholder for whatever data-fetching logic your script implements
FOOBAZ="$FOO $BAZ"
# Safely produce a JSON object containing the result value.
# jq will ensure that the value is properly quoted
# and escaped to produce a valid JSON string.
jq -n --arg foobaz "$FOOBAZ" '{"foobaz":$foobaz}'
We can see that it's reading the foo and baz keys from whatever is sent to the program via stdin.
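The jq @sh trick can be seen in isolation by piping a sample query payload straight into the same expression the script uses (a standalone sketch, separate from the example script):

```shell
# Simulate the JSON that Terraform writes to the external program's stdin.
eval "$(printf '%s' '{"foo":"abc1234","baz":"123abc"}' |
        jq -r '@sh "FOO=\(.foo) BAZ=\(.baz)"')"

# The two keys are now ordinary shell variables.
echo "$FOO $BAZ"
# → abc1234 123abc
```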
If we run this in the simplest way, we can see that it does what we expect:
data "external" "example" {
  program = ["sh", "${path.module}/example-data-source.sh"]

  query = {
    foo = "abc1234"
    baz = "123abc"
  }
}

output "example" {
  value = data.external.example
}
Gives the following output:
Outputs:
example = {
  "id" = "-"
  "program" = tolist([
    "sh",
    "./example-data-source.sh",
  ])
  "query" = tomap({
    "baz" = "123abc"
    "foo" = "abc1234"
  })
  "result" = tomap({
    "foobaz" = "abc1234 123abc"
  })
  "working_dir" = tostring(null)
}
Just to check that Terraform isn't sneaking anything extra over stdin, we can modify the script to log what it receives on stdin to a file (note that this will break the example script, since jq will no longer see the input from Terraform):
#!/bin/bash
# Exit if any of the intermediate steps fail
set -e
# Log stdin inputs to an output file so we only output the formatted
# JSON on stdout for the external data source to work correctly.
cat - >> log.out
# Extract "foo" and "baz" arguments from the input into
# FOO and BAZ shell variables.
# jq will ensure that the values are properly quoted
# and escaped for consumption by the shell.
eval "$(jq -r '@sh "FOO=\(.foo) BAZ=\(.baz)"')"
# Placeholder for whatever data-fetching logic your script implements
FOOBAZ="$FOO $BAZ"
# Safely produce a JSON object containing the result value.
# jq will ensure that the value is properly quoted
# and escaped to produce a valid JSON string.
jq -n --arg foobaz "$FOOBAZ" '{"foobaz":$foobaz}'
And then if we run a plan we can see just the expected JSON in the log.out file:
{"baz":"123abc","foo":"abc1234"}
If you could access the operation type in your Terraform code instead, you could pass it as a parameter to your external data source, but unfortunately that isn't exposed anywhere you can access either.
It's also worth mentioning that, as a data source, running the script is expected to be free of side effects. This means it shouldn't really matter which operation Terraform is running when it invokes your data source. If you need to do something that isn't side-effect free, you might need to consider another option such as a provisioner, potentially attached to a null_resource, but all of these options are meant to be last resorts for when you need an escape hatch. Most of the time it's better to either raise a feature request in the appropriate provider, or to call these things separately and orchestrate them together via a wrapper script or some orchestration software that runs them in succession.
Haven't tested this, but you should be able to get Terraform's PID with os.getppid(). Then you can get more information with psutil (available via pip), /proc (Linux only), or by shelling out to a ps command. Finding the verb on the command line may be brittle, but depending on context it can probably work.
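The verb-scraping part of that idea can be sketched as a small shell helper (extract_verb is a hypothetical name, and as noted above the approach is brittle). In the real external program you would feed it the parent's command line, e.g. "$(tr '\0' ' ' < /proc/$PPID/cmdline)" on Linux:

```shell
# extract_verb: pull the first Terraform verb out of a command-line string.
extract_verb() {
  printf '%s\n' "$1" |
    grep -oE '(^| )(plan|apply|destroy)( |$)' |
    head -n 1 | tr -d ' '
}

extract_verb "terraform apply -var-file=env/dev.auto.tfvars"
# → apply
```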
That said, if you need to do this, you should reconsider how your code is organized. A pure data source shouldn't care.

How can I assign the value of a sensitive output variable to an environment variable?

I am automating my terraform script in a GitHub Workflow
In my terraform script, I have a sensitive output variable like this:
output "db_password" {
  value       = aws_db_instance.db.password
  description = "The password for logging in to the database."
  sensitive   = true
}
I am deploying (terraform apply) the script in a GitHub action workflow.
After a successful deployment, I need to store the password in secure storage (Azure Key Vault); I have a bash command to do that.
I need to have the value of the db_password in an environment variable.
How can I assign the value of a sensitive output variable to an environment variable?
Is there a better way of doing this?
I suggest using terraform output after terraform apply. Then you can store the output in a shell variable or a file without it being printed.
e.g.
terraform apply # as before
MY_SECRET=$(terraform output db_password)
azureInterface keyvault store "$MY_SECRET" # a totally made-up line, no clue about Azure
The drawbacks are that the secret might:
show up in the console output of the last command
be visible in ps as a command-line argument
So a revised solution is to store it in a temporary file:
CREDENTIALS=$(mktemp -t tmp.XXXXXXXXXX)
terraform output db_password > "$CREDENTIALS"
# and now use the $CREDENTIALS file as input to Azure
rm -f "$CREDENTIALS"
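One detail in the temp-file variant's favour: mktemp creates the file with owner-only permissions (mode 0600), so the secret is not readable by other users while it sits on disk. A minimal sketch of the pattern, with a printf standing in for the terraform output call so it runs anywhere:

```shell
CREDENTIALS=$(mktemp -t tmp.XXXXXXXXXX)   # mktemp creates the file with mode 0600
printf '%s' 'hunter2' > "$CREDENTIALS"    # stand-in for: terraform output db_password
# ... hand the $CREDENTIALS file to the Azure CLI here ...
rm -f "$CREDENTIALS"                      # remove it as soon as it has been consumed
```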

Get specific value out of the Terraform state file

I've deployed my infra using Terraform and I noticed that the Terraform state file (terraform.tfstate) contains some interesting information which I would like to extract. For example:
$ terraform state show 'packet_device.worker'
id            = 6015bg2b-b8c4-4925-aad2-f0671d5d3b13
billing_cycle = hourly
created       = 2015-12-17T00:06:56Z
facility      = ewr1
...
which I would like to transform somehow to
$ terraform state show 'packet_device.worker.id'
6015bg2b-b8c4-4925-aad2-f0671d5d3b13
But adding the id at the end doesn't seem to work. Any suggestions on how I can achieve this behaviour?
The terraform state show command is used to retrieve all the attributes of a given resource; you won't be able to fetch a single attribute from it, because the argument is a resource ADDRESS, which refers to the resource as a whole. Documented at https://www.terraform.io/docs/internals/resource-addressing.html
What you can do is expose the resource attribute as an output value and use the command:
terraform output {output-value-name}
Refer: https://www.terraform.io/docs/configuration/outputs.html
You can utilize terraform show -json and jq to get a specific value out of a Terraform state file.
terraform show -json <state_file> | jq '.values.root_module.resources[] | select(.address=="<terraform_resource_name>") | .values.<property_name>'
Say you have a state file named terraform.tfstate and a Terraform resource packet_device.worker, and you want to get its id. Then it would be as follows:
terraform show -json terraform.tfstate | jq '.values.root_module.resources[] | select(.address=="packet_device.worker") | .values.id'
terraform.tfstate can also be omitted, since it is the default name for a state file.
The primary way to export information from a Terraform configuration is to declare Output Values in your root module. You can then access them using terraform output once the apply has completed. If you need that information in a machine-readable way, you can alternatively run terraform output -json from the consuming program and parse the output as JSON.
If you are in an unusual situation where you need programmatic access to all values in the state (for example, if you were implementing some sort of generic Terraform state visualization tool) then you can instead use terraform show -json, which will print out all of the data from the state in a JSON format.
If you are accessing only specific values, perhaps to integrate with some other system in an automation solution, I'd recommend using explicit Output Values because then it's explicit to future maintainers what the interface with the caller is, and so they are less likely to accidentally break the caller by e.g. refactoring the packet_device.worker resource into a child module, which would cause it to appear in a different place in the state. The usual assumption is that the resources inside a module are an implementation detail of that module and thus that you can safely refactor them as needed as long as the output values remain unchanged.
If you want to get the exact value and are willing to install jq, the other answers here are great!
If you're looking for a quick answer to manually copy/paste, etc., piping to grep does the trick.
ex:
terraform state show 'packet_device.worker' | grep "id"
which would show the relevant line(s), like:
id = 6015bg2b-b8c4-4925-aad2-f0671d5d3b13
