Terraform Data Source Meaning

I am new to Terraform and trying to understand data sources. I have read the documentation and this StackOverflow post, but I'm still unclear about the use cases of data sources.
I have the following block of code:
resource "azurerm_resource_group" "rg" {
name = "example-resource-group"
location = "West US 2"
}
data "azurerm_resource_group" "test" {
name = "example-resource-group"
}
But I get a 404 error:
data.azurerm_resource_group.test: data.azurerm_resource_group.test: resources.GroupsClient#Get: Failure responding to request:
StatusCode=404 -- Original Error: autorest/azure: Service returned an
error. Status=404 Code="ResourceGroupNotFound" Message="Resource group
'example-resource-group' could not be found."
I don't understand why the resource group is not found. Also, I am unclear about the difference between data and variable, and when I should use which.
Thanks

I have provided a detailed explanation of what a data source is in this SO answer. To summarize:

- Data sources provide dynamic information about entities that are not managed by the current Terraform configuration.
- Variables provide static information.

Your block of code doesn't work because the resource group your data source refers to hasn't been created yet. During the planning phase, Terraform tries to find a resource group named example-resource-group, fails to find it, and aborts the whole run. The ordering of the blocks makes no difference, because Terraform works out the order of operations from its dependency graph, not from the position of blocks in the file.
If you remove the data block, run terraform apply, and then add the data block back in, it should work. However, data sources are meant to retrieve data about entities that are not managed by your Terraform configuration. In your case you don't need the data.azurerm_resource_group.test data source at all: you can simply use the attributes exported by the resource. In the case of azurerm_resource_group, that is a single id attribute.
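For example (a minimal sketch; the output block is purely illustrative):

resource "azurerm_resource_group" "rg" {
  name     = "example-resource-group"
  location = "West US 2"
}

# No data source needed: reference the managed resource directly.
output "rg_id" {
  value = "${azurerm_resource_group.rg.id}"
}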

Think of a data source as a value you want to read from somewhere else.
A variable is something you define when you run the code.
When you use the azurerm_resource_group data source, Terraform will search for an existing resource group with the name you defined in your data source block.
Example
data "azurerm_resource_group" "test" {
name = "example-resource-group"
}
Quoting @ydaetskcoR from the comment below about the 404 error:
It's 404ing because the data source is running before the resource creates the thing you are looking for. You would use a data source when the resource has already been created previously, not in the same run as the resource you are creating.
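For instance, once the resource group already exists (created in a previous run or outside Terraform entirely), the same lookup succeeds (a minimal sketch; the output is illustrative):

data "azurerm_resource_group" "test" {
  name = "example-resource-group"
}

# Exported attributes such as id and location are now readable.
output "rg_location" {
  value = "${data.azurerm_resource_group.test.location}"
}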

Related

Shall TF Provider delete resources from state if the resource is in "DELETING" state (similarly to 404)?

Context: I'm creating a new TF provider.
The TF official docs say that:
When you create something in Terraform but delete it manually, Terraform should gracefully handle it. If the API returns an error when the resource doesn't exist, the read function should check to see if the resource is available first. If the resource isn't available, the function should set the ID to an empty string so Terraform "destroys" the resource in state. The following code snippet is an example of how this can be implemented; you do not need to add this to your configuration for this tutorial.
if resourceDoesntExist {
    d.SetId("") // the SDK method is SetId; clearing the ID removes the resource from state
    return
}
It's pretty clear when resourceDoesntExist := response.code == 404, but what about the case where the resource is in a DELETING state (meaning the resource is going to be removed in, say, 30 minutes, at which point the GET request will start returning 404)?
Shall it be treated as 404 too? What about the corresponding data source, shall it return an error?

How to put Dashboards in the right folder dynamically using the Terraform Grafana provider

I have the following use case: I'm using a combination of Azure DevOps pipelines and Terraform to synchronize our TAP for Grafana (v7.4). The intention is that we can tweak and tune our dashboards on Test, and push the changes to Acceptance (and Production) via the pipelines.
I've got one pipeline that pulls in the state of the Test environment and writes it to a set of json files (for the dashboards) and a single json array (for the folders).
The second pipeline should use these resources to synchronize the Acceptance environment.
This works flawlessly for the dashboards, but I'm hitting a snag putting the dashboards in the right folder dynamically. Here's my latest working code:
resource "grafana_folder" "folders" {
for_each = toset(var.grafana_folders)
title = each.key
}
resource "grafana_dashboard" "dashboards" {
for_each = fileset(path.module, "../dashboards/*.json")
config_json = file("${path.module}/${each.key}")
}
The folder resource pushes the folders based on a list of names that I pass in via variables. This generates the folders correctly.
The dashboard resource pushes the dashboards correctly, based on all dashboard files in the specified folder.
But now I'd like to make sure the dashboards end up in the right folder. The provider specifies that I need to do this based on the folder UID, which is generated when the folder is created. So I'd like to take the output from the grafana_folder resource and use it in the grafana_dashboard resource. I'm trying the following:
resource "grafana_folder" "folders" {
for_each = toset(var.grafana_folders)
title = each.key
}
resource "grafana_dashboard" "dashboards" {
for_each = fileset(path.module, "../dashboards/*.json")
config_json = file("${path.module}/${each.key}")
folder = lookup(transpose(grafana_folder.folders), "Station_Details", "Station_Details")
depends_on = [grafana_folder.folders]
}
If I read the Grafana provider's GitHub correctly, the grafana_folder resource should output a map of [uid, title]. So I figured that if I transpose that map and (by way of a test) look up a folder title that I know exists, I can test the concept.
This gives the following error:
on main.tf line 38, in resource "grafana_dashboard" "dashboards":
  38: folder = lookup(transpose(grafana_folder.folders), "Station_Details", "Station_Details")

Invalid value for "default" parameter: the default value must have the same type as the map elements.
Both Uid and Title should be strings, so I'm obviously overlooking something.
Does anyone have an inkling where I'm going wrong and/or have suggestions on how I can do this (better)?
I think the problem this error is trying to report is that grafana_folder.folders is a map of objects, so passing it to transpose doesn't really make sense. It appears to succeed only because Terraform finds some clever automatic type conversion to produce a result, but that result (due to the signature of transpose) is a map of lists rather than a map of strings, and so "Station_Details" (a string, rather than a list) isn't a valid fallback value for that lookup.
My limited familiarity with folders in Grafana leaves me unsure as to what to suggest instead, but I expect the final expression will look something like the following:
folder = grafana_folder.folders[SOMETHING].id
SOMETHING here will be an expression that tells you, for a given dashboard, which folder key it ought to belong to. I'm not seeing an answer to that in what you shared in your question, but just as a placeholder to make this a complete answer, I'll suggest that one option would be to make a local map from dashboard filename to folder name:
locals {
  # A local value probably isn't actually the right answer
  # here, but I'm just showing it as a placeholder for one
  # possible way to map from dashboard filename to folder
  # name. These names should all be elements of
  # var.grafana_folders in order for this to work.
  dashboard_folders = {
    "example1.json" = "example-folder"
    "example2.json" = "example-folder"
    "example3.json" = "another-folder"
  }
}

resource "grafana_dashboard" "dashboards" {
  for_each    = fileset("${path.module}/dashboards", "*.json")
  config_json = file("${path.module}/dashboards/${each.key}")
  folder      = grafana_folder.folders[local.dashboard_folders[each.key]].id
}
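A variation on the same idea, assuming dashboards are kept in one subdirectory per folder (that layout is an assumption, not something from the question), derives the folder name from each file's path instead of a hand-maintained map:

resource "grafana_dashboard" "dashboards" {
  # "*/*.json" matches e.g. example-folder/example1.json; the
  # subdirectory name must match a title in var.grafana_folders.
  for_each    = fileset("${path.module}/dashboards", "*/*.json")
  config_json = file("${path.module}/dashboards/${each.key}")
  folder      = grafana_folder.folders[dirname(each.key)].id
}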

terraform 0.13.5 resources overwrite each other on consecutive calls

I am using Terraform 0.13.5 to create aws_iam resources.
I have two Terraform resources, as follows:
module "calls_aws_iam_policy_attachment" {
# This calls an external module to
# which among other things creates a policy attachment
# resource attaching the roles to the policy
source = ""
name = "xoyo"
roles = ["rolex", "roley"]
policy_arn = "POLICY_NAME"
}
resource "aws_iam_policy_attachment" "policies_attached" {
# This creates a policy attachment resource attaching the roles to the policy
# The roles here are a superset of the roles in the above module
roles = ["role1", "role2", "rolex", "roley"]
policy_arn = "POLICY_NAME"
name = "NAME"
# I was hoping that adding the depends on block here would mean this
# resource is always created after the above module
depends_on = [ module.calls_aws_iam_policy_attachment ]
}
The first module creates a policy and attaches some roles. I cannot edit this module.
The second resource attaches more roles to the same policy, along with other policies.
The second resource depends_on the first, so I would expect the policy attachments of the second resource to always overwrite those of the first.
In reality, the policy attachments in the two resources overwrite each other on each consecutive build: on the first build the second resource's attachments are applied, on the second build the first resource's attachments are applied, and so on.
Can someone tell me why this is happening? Does depends_on not work for resources that overwrite each other?
Is there an easy fix short of combining both my resources into the same resource?
As to why this is happening:

- During the first run, terraform deploys the first resources, then the second ones. This order is due to the depends_on relation (the steps that follow work regardless of any depends_on). The second ones overwrite the first ones.
- During the second deploy, terraform looks at what needs to be done: the first ones are missing (they were overwritten), so they need to be created; the second ones look fine, so terraform ignores them for this update. Now only the first ones are created, and they overwrite the second ones.
- During the third run the same happens, but exactly the other way around: the second ones are missing, the first ones are ignored, and the second ones overwrite the first ones.

Repeat as often as you want; you will never end up with a stable deployment.
Solution: do not specify conflicting things in terraform. Terraform is supposed to be a description of what the infrastructure should look like, and saying "this resource should only have property A" and "this resource should only have property B" is contradictory; terraform will not be able to handle this gracefully.
What you should do specifically: do not use aws_iam_policy_attachment, basically ever; look at the big red box in the docs. Use multiple aws_iam_role_policy_attachment resources instead. They are additive and will not overwrite each other, as in the sketch below.
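A hedged sketch of the additive approach (the role names and policy ARN are placeholders taken from the question, and the account ID is made up):

resource "aws_iam_role_policy_attachment" "attach" {
  # One attachment resource per role; no shared object to fight over.
  for_each   = toset(["role1", "role2", "rolex", "roley"])
  role       = each.key
  policy_arn = "arn:aws:iam::123456789012:policy/POLICY_NAME"
}

Each role/policy pair is its own resource, so another module attaching a different set of roles to the same policy no longer conflicts with this one.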

Terraform: What's the point using Both Data Source and Resource on the same type?

I'm new to Terraform, and I'm working on a project to use Docker/AWS ECR/ECS infrastructure on AWS. I see in this post that the author specifies something like:
data "aws_ecs_task_definition" "test" {
task_definition = "${aws_ecs_task_definition.test.family}"
depends_on = ["aws_ecs_task_definition.test"]
}
resource "aws_ecs_task_definition" "test" {
family = "test-family"
# ...
}
Why is he using both a data source AND a resource for aws_ecs_task_definition? I can't find an explanation or a similar example after hours of digging into the official docs as well as googling articles.
I see later on, when he's setting up the service, that he uses the following code to reference both of them (again, I'm not sure what's going on here):

task_definition = "${aws_ecs_task_definition.test.family}:${max("${aws_ecs_task_definition.test.revision}", "${data.aws_ecs_task_definition.test.revision}")}"

I am now confused about the difference between using both data and resource on the same type versus just using resource. Is there any difference in terms of lifecycle?
I'm now trying to create an AWS ECR repository for my Docker image, and I want Terraform to manage it (create/update/destroy). Should I use both a data source and a resource for the type aws_ecr_repository as well?
It makes sense. The author is using the data source to get the latest task definition revision, because he might be using some other tool (Jenkins/CircleCI) to push changes to the task definition outside of Terraform.
Hence, if he runs that code again, terraform will pick up the latest revision and update the ECS service accordingly.
Check the below code:
resource "aws_ecs_service" "test-ecs-service" {
name = "test-vz-service"
cluster = "${aws_ecs_cluster.test-ecs-cluster.id}"
task_definition = "${aws_ecs_task_definition.test.family}:${max("${aws_ecs_task_definition.test.revision}", "${data.aws_ecs_task_definition.test.revision}")}"
desired_count = 1
iam_role = "${aws_iam_role.ecs-service-role.name}"
load_balancer {
target_group_arn = "${aws_alb_target_group.test.id}"
container_name = "nginx"
container_port = "80"
}
He is updating the service with the latest revision. The max function returns the greatest of its arguments, so the service always points at the newer of the two revisions: the one in Terraform's state or the one pushed from outside. You can check the terraform interpolation syntax here.
If the task definition does not exist, will this terraform script create it?
Yes, it will create it based on the task definition in its state file. If you have created the task definition manually, it will increment the revision number.
If the task definition exists and the data source block retrieved it, will the resource block re-create another revised task definition, or will it just do nothing?
If there is a change in any part of the resource's configuration, it will create a new task definition revision and allocate that revision to the ECS service; if there is no change, it will do nothing.
I'm also unclear whether this terraform script is intended to run only once (initial infra creation) or upon every change.
It should be run at the time of infra creation, and again whenever you want to make an update to the task definition resource.
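As for the ECR part of the question: when Terraform is meant to manage the repository's whole lifecycle, a single resource block is enough and no matching data source is needed (a minimal sketch; the repository name is a placeholder):

resource "aws_ecr_repository" "app" {
  name = "my-app"
}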

Terraform: Undefined remote state handling

I have a remote state attribute called subnets, which is stored in data.terraform_remote_state.alb.subnets.
Depending on what I'm deploying, this attribute either exists or doesn't exist.
When I create an ECS cluster, it requires the subnets as input, for which I would like to use either:
data.terraform_remote_state.alb.subnets
or
var.vpc_subnets (the subnets of the VPC)
Unfortunately, because of the way the interpolation works, it needed to be hacked together:
"${split(",", length(var.vpc_subnets) == 0 ? join(",",data.terraform_remote_state.alb.subnets) : join(",",var.vpc_subnets))}"
(Referring to: https://github.com/hashicorp/terraform/issues/12453)
However, because Terraform does not seem to 'lazily' evaluate ternary operators, it throws me the following error even when var.vpc_subnets is NOT empty:
Resource 'data.terraform_remote_state.alb' does not have attribute 'subnets' for variable 'data.terraform_remote_state.alb.subnets'
How can I properly handle remote state resources that could be undefined?
EDIT: Typo: Subnet->Subnets
Managed to figure it out.
When using Terraform remote state, you have the ability to set a default: https://www.terraform.io/docs/providers/terraform/d/remote_state.html
This works in my situation where data.terraform_remote_state.alb.subnets does not return a value: I can preset the default to "" and use locals to check for that value, as sketched below.
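A hedged sketch of the defaults approach (the bucket, key, and region values are placeholders, not from the question):

data "terraform_remote_state" "alb" {
  backend = "s3"

  config {
    bucket = "my-state-bucket"
    key    = "alb/terraform.tfstate"
    region = "us-east-1"
  }

  # Fallback values used when the remote state lacks an output.
  defaults {
    subnets = ""
  }
}

With this in place, data.terraform_remote_state.alb.subnets evaluates to "" instead of erroring when the remote state has no subnets output, so the ternary above can safely select between it and var.vpc_subnets.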
Will it be subnet or subnets?
Suppose you have the below data source:
data "terraform_remote_state" "alb" {
backend = "s3"
config {
name = "alb"
}
}
You need to check whether the remote state actually has an output named subnets. Whether the key name is subnet or subnets is something you need to confirm yourself.
