Terraform Datadog Query Is Invalid

I'm trying to test out creating a monitor for Google Pub/Sub and am getting an "Invalid Query" error. This is the query text I see when viewing the source of another working monitor, so I'm confused as to why this isn't working.
Error: Error: error creating monitor: 400 Bad Request: {"errors":["The value provided for parameter 'query' is invalid"]}
Terraform:
resource "datadog_monitor" "bad_stuff_sub_monitor" {
  name               = "${var.customer_name} Bad Stuff Monitor"
  type               = "metric alert"
  message            = "${var.customer_name} Bad Stuff Topic getting too big. Notify: ${var.datadog_monitor_notify_list}"
  escalation_message = "Escalation message #pagerduty"
  query              = "avg:gcp.pubsub.subscription.num_undelivered_messages{project_id:terraform_gcp_test}"

  thresholds = {
    ok                = 0
    warning           = 1
    warning_recovery  = 0
    critical          = 2
    critical_recovery = 1
  }

  notify_no_data    = false
  renotify_interval = 1440
  notify_audit      = false
  timeout_h         = 60
  include_tags      = true

  # ignore any changes in silenced value; using silenced is deprecated in favor of downtimes
  lifecycle {
    ignore_changes = [silenced]
  }

  tags = [var.customer_name, var.project_name]
}

So I ended up looking at the tests in the Datadog Terraform provider and noticing the query format they are testing:
query = "avg(last_30m):avg:gcp.pubsub.subscription.num_undelivered_messages{project_id:terraform_gcp_test} > 2"
It seems you need to specify a time range and also add a comparison threshold that matches your critical alert threshold. That was what was missing.
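In other words, a metric-alert query has the shape aggregation(timeframe):metric{scope} comparator threshold. A minimal sketch of the corrected resource, with an illustrative 30-minute window and the trailing comparison matching the critical threshold:

```hcl
resource "datadog_monitor" "bad_stuff_sub_monitor" {
  name    = "${var.customer_name} Bad Stuff Monitor"
  type    = "metric alert"
  message = "${var.customer_name} Bad Stuff Topic getting too big."

  # aggregation(window):metric{scope} comparator threshold
  # The trailing "> 2" must agree with the critical threshold below.
  query = "avg(last_30m):avg:gcp.pubsub.subscription.num_undelivered_messages{project_id:terraform_gcp_test} > 2"

  thresholds = {
    critical = 2
  }
}
```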

Related

Setup Cloudwatch Alarm with Terraform that uses a query-expression

My goal is to set up an alarm in CloudWatch via Terraform that fires when disk_usage is above a certain threshold. The monitored metrics come from a non-AWS server and are collected via the CloudWatch Agent.
My first step was to do this manually, by setting up a metric that Selects the maximum disk_usage of all devices on a selected host:
SELECT MAX(disk_used_percent) FROM CWAgent WHERE host = 'MY_HOST'
I then successfully created an alarm based on this metric. Now I want to do the same thing with Terraform, but I can't figure out how.
If I set up the Terraform resource to use a dimension for the host, I get no results. If I try to set up a metric query, I get a conflict between Terraform and AWS: Terraform tells me my resource should not declare a "period" attribute, but AWS demands it and fails if it is not provided:
Error: Updating metric alarm failed: ValidationError: Period must not
be null
Currently, my resource looks like this:
resource "aws_cloudwatch_metric_alarm" "disk_usage_alarm" {
  alarm_name          = "Disk usage alarm on MY_HOST"
  alarm_description   = "One or more disks on MY_HOST are over 65% capacity"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = "65"
  evaluation_periods  = "2"
  datapoints_to_alarm = "1"
  treat_missing_data  = "missing"
  actions_enabled     = "false"

  insufficient_data_actions = []
  alarm_actions             = []
  ok_actions                = []

  metric_query {
    id          = "q1"
    label       = "Maximum disk_used_percentage for all disks on Host MY_HOST"
    return_data = true
    expression  = "SELECT MAX(disk_used_percent) FROM CWAgent WHERE host = 'MY_HOST'"
  }
}
Does anyone know what's wrong here and how to correctly set up this alarm via Terraform?
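One likely fix, assuming a reasonably recent AWS provider: for a Metrics Insights SELECT expression, the period belongs inside the metric_query block rather than at the top level of the alarm. A hedged sketch, reusing the question's own names:

```hcl
resource "aws_cloudwatch_metric_alarm" "disk_usage_alarm" {
  alarm_name          = "Disk usage alarm on MY_HOST"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = "65"
  evaluation_periods  = "2"

  metric_query {
    id          = "q1"
    period      = 300 # granularity of the query results goes here, not on the alarm itself
    return_data = true
    expression  = "SELECT MAX(disk_used_percent) FROM CWAgent WHERE host = 'MY_HOST'"
  }
}
```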

Is there a way to have division when writing terraform code for a log alert in Datadog?

I want to have Terraform code that creates a Datadog monitor for the percentage of errors in logs compared with all of them.
This is what I've tried
resource "datadog_monitor" "log_errors_count" {
  count = local.memory_usage_threshold.critical > 0 ? 1 : 0
  name  = "[${module.label.id}] ${length(var.description) > 0 ? var.description : "Log Errors Percentage"}"
  type  = "log alert"
  query = "logs(\"service:api-member status:error\").index(\"*\").rollup(\"count\").by(\"service\").last(\"${var.period}\") / logs(\"service:api-member\").index(\"*\").rollup(\"count\").by(\"service\").last(\"${var.period}\") > ${local.logged_errors_threshold.critical}"

  monitor_thresholds {
    ok       = local.logged_errors_threshold.ok
    warning  = local.logged_errors_threshold.warning
    critical = local.logged_errors_threshold.critical
  }
}
But it returns:
400 Bad Request: {"errors":["The value provided for parameter 'query' is invalid: invalid operator specified: "]}
I have done this kind of division for a metric alert and it worked fine. Using Datadog dashboard I can create a log monitor the way I want, but it looks like I am missing something when I try to do it using terraform.
Try escaping the internal quotes with backslashes:
query = "logs(\"service:api-member status:error\").index(\"*\").rollup(\"count\").by(\"service\").last(\"${var.period}\") / logs(\"service:api-member\").index(\"*\").rollup(\"count\").by(\"service\").last(\"${var.period}\") > ${local.logged_errors_threshold.critical}"
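An alternative sketch that sidesteps the escaping entirely (Terraform 0.12+) is a heredoc string, which lets the inner quotes appear literally; trimspace() strips the trailing newline the heredoc would otherwise add to the query:

```hcl
resource "datadog_monitor" "log_errors_count" {
  name = "Log Errors Percentage"
  type = "log alert"

  # Heredoc avoids backslash-escaping the inner quotes.
  query = trimspace(<<-EOQ
    logs("service:api-member status:error").index("*").rollup("count").by("service").last("${var.period}") / logs("service:api-member").index("*").rollup("count").by("service").last("${var.period}") > ${local.logged_errors_threshold.critical}
  EOQ
  )
}
```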

Terraform - How to use conditionally created resource's output in conditional operator?

I have a case where I have to create an aws_vpc resource if the user does not provide vpc id. After that I am supposed to create resources with that VPC.
Now, I am applying conditionals while creating an aws_vpc resource. For example, only create VPC if existing_vpc is false:
count = "${var.existing_vpc ? 0 : 1}"
Next, for example, I have to create nodes in the VPC. If the existing_vpc is true, use the var.vpc_id, else use the computed VPC ID from aws_vpc resource.
But, the issue is, if existing_vpc is true, aws_vpc will not create a new resource and the ternary condition is anyways trying to check if the aws_vpc resource is being created or not. If it doesn't get created, terraform errors out.
An example of the error when using conditional operator on aws_subnet:
Resource 'aws_subnet.xyz-subnet' not found for variable 'aws_subnet.xyz-subnet.id'
The code resulting in the error is:
subnet_id = "${var.existing_vpc ? var.subnet_id : aws_subnet.xyz-subnet.id}"
If both things are dependent on each other, how can we create conditional resources and assign values to other configuration based on them?
You can access dynamically created modules and resources as follows:
output "vpc_id" {
  value = length(module.vpc) > 0 ? module.vpc[*].id : null
}
If count = 0, the output is null.
If count > 0, the output is a list of VPC IDs.
If count = 1 and you want to receive a single VPC ID, you can specify:
output "vpc_id" {
  value = length(module.vpc) > 0 ? one(module.vpc).id : null
}
The following example shows how to optionally specify whether a resource is created (using the conditional operator), and shows how to handle returning output when a resource is not created. This happens to be done using a module, and uses an object variable's element as a flag to indicate whether the resource should be created or not.
But to specifically answer your question, you can use the conditional operator as follows:
output "module_id" {
  value = var.module_config.skip == true ? null : format("%v", null_resource.null.*.id)
}
And access the output in the calling main.tf:
module "use_conditionals" {
  source = "../../scratch/conditionals-modules/m2" # << Change to your directory
  a = module.skipped_module.module_id # Doesn't exist, so might need to handle that.
  b = module.notskipped_module.module_id
  c = module.default_module.module_id
}
Full example follows. NOTE: this is using terraform v0.14.2
# root/main.tf
provider "null" {}

module "skipped_module" {
  source = "../../scratch/conditionals-modules/m1" # << Change to your directory
  module_config = {
    skip = true # explicitly skip this module.
    name = "skipped"
  }
}

module "notskipped_module" {
  source = "../../scratch/conditionals-modules/m1" # << Change to your directory
  module_config = {
    skip = false # explicitly don't skip this module.
    name = "notskipped"
  }
}

module "default_module" {
  source = "../../scratch/conditionals-modules/m1" # << Change to your directory
  # The default position is, don't skip. See m1/variables.tf
}

module "use_conditionals" {
  source = "../../scratch/conditionals-modules/m2" # << Change to your directory
  a = module.skipped_module.module_id
  b = module.notskipped_module.module_id
  c = module.default_module.module_id
}

# root/outputs.tf
output "skipped_module_name_and_id" {
  value = module.skipped_module.module_name_and_id
}
output "notskipped_module_name_and_id" {
  value = module.notskipped_module.module_name_and_id
}
output "default_module_name_and_id" {
  value = module.default_module.module_name_and_id
}
The module:
# m1/main.tf
resource "null_resource" "null" {
  count = var.module_config.skip ? 0 : 1 # If skip == true, then don't create the resource.

  provisioner "local-exec" {
    command = <<EOT
#!/usr/bin/env bash
echo "null resource, var.module_config.name: ${var.module_config.name}"
EOT
  }
}

# m1/variables.tf
variable "module_config" {
  type = object({
    skip = bool,
    name = string
  })
  default = {
    skip = false
    name = "<NAME>"
  }
}

# m1/outputs.tf
output "module_name_and_id" {
  value = var.module_config.skip == true ? "SKIPPED" : format(
    "%s id:%v",
    var.module_config.name,
    null_resource.null.*.id
  )
}
output "module_id" {
  value = var.module_config.skip == true ? null : format("%v", null_resource.null.*.id)
}
The current answers here are helpful when you are working with more modern versions of Terraform, but as the OP noted, they do not work with Terraform < 0.12. (If you're like me and still dealing with these older versions, I am sorry, I feel your pain.)
See the relevant issue from the Terraform project for more info on why the approach below is necessary with the older versions.
To avoid link rot, here is the OP's example subnet_id argument rewritten using the answers from the GitHub issue:
subnet_id = "${element(compact(concat(aws_subnet.xyz-subnet.*.id, list(var.subnet_id))), 0)}"
From the inside out:
concat joins the splat output list to list(var.subnet_id) -- per the background link, 'When count = 0, the "splat syntax" expands to an empty list'.
compact removes the empty item.
element returns your var.subnet_id only when compact receives the empty splat output.
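For comparison, on Terraform 0.12.20+ the same fallback can be written far more directly with try(), which catches the failed index when the resource has count = 0 (a sketch using the question's own names):

```hcl
# If aws_subnet.xyz-subnet was created (count = 1), use its id;
# if count = 0, indexing [0] fails and try() falls back to var.subnet_id.
subnet_id = try(aws_subnet.xyz-subnet[0].id, var.subnet_id)
```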

What is the terraform syntax to create an AWS Route53 TXT record that has a map as JSON as payload?

My intention is to create an AWS Route53 TXT record, that contains a JSON representation of a terraform map as payload.
I would expect the following to do the trick:
variable "payload" {
  type = "map"
  default = {
    foo = "bar"
    baz = "qux"
  }
}

resource "aws_route53_record" "TXT-json" {
  zone_id = "${module.domain.I-zone_id}"
  name    = "test.${module.domain.I-fqdn}"
  type    = "TXT"
  ttl     = "${var.ttl}"
  records = "${list(jsonencode(var.payload))}"
}
terraform validate and terraform plan are ok with that. terraform apply starts happily, but AWS reports an error:
* aws_route53_record.TXT-json: [ERR]: Error building changeset: InvalidChangeBatch: Invalid Resource Record: FATAL problem: InvalidCharacterString (Value should be enclosed in quotation marks) encountered with '"{"baz":"qux","foo":"bar"}"'
status code: 400, request id: 062d4536-3ad3-11e7-af24-0fbcd067fb9e
Terraform version is
Terraform v0.9.4
String handling is very difficult in HCL. I found many references surrounding this issue on the 'net, but I can't seem to find the actual solution. A solution based on the workaround noted in terraform#10048 doesn't work. "${list(substr(jsonencode(var.payload), 1, -1))}" removes the starting curly brace {, not the first quote. That seems to be added later.
Adding quotes (as the error message from AWS suggests) doesn't help; it just adds more quotes, and there already are (the AWS error message is misleading).
The message you're getting is not generated by Terraform. It is a validation error raised by Route53. You'd get the same error if you added e.g. {"a":2,"foo":"bar"} as the value via the AWS console.
On the other hand, escaping the JSON works, i.e. I was able to add "{\"a\":2,\"foo\":\"bar\"}" as a TXT value through the AWS console.
If you're OK with that, you can perform a double jsonencode, meaning that you jsonencode the JSON string generated by jsonencode, such as:
variable "payload" {
  type = "map"
  default = {
    foo = "bar"
    baz = "qux"
  }
}

output "test" {
  value = "${jsonencode(jsonencode(var.payload))}"
}
which resolves to:
➜ ~ terraform apply
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
test = "{\"baz\":\"qux\",\"foo\":\"bar\"}"
(you would of course have to use the aws_route53_record resource instead of output)
So basically this works:
resource "aws_route53_record" "record_txt" {
  zone_id = "${data.aws_route53_zone.primary.zone_id}"
  name    = "${var.my_domain}"
  type    = "TXT"
  ttl     = "300"
  records = ["{\\\"my_value\\\": \\\"${var.my_value}\\\"}"]
}
You're welcome.

Terraform still trying to resolve interpolation in resource with count of zero

I am trying to create PTR records for a server deployment. The server(s) below need to be deployed after a dependent set of servers is applied, so we currently run one apply to deploy those server modules, then a second in which we change these resources' counts from 0 to however many we're looking to deploy. I've added a new resource to create a PTR record for these servers, and even with the count set to 0, Terraform attempts to resolve the interpolation. It doesn't do this for the A record resource, just the PTR record resource.
Here is the code; I even hard-coded the count to 0 to see if there was an issue with the variable. The list is expected to be empty while the count is 0, so I expect that Terraform wouldn't try to resolve the interpolation.
resource "aws_route53_record" "ds_sync_A_records" {
  // same number of records as instances
  provider = "aws.dns"
  count    = 0
  // count = "${var.ping_sync_cluster_count}"
  zone_id  = "${data.aws_route53_zone.zone_company_io.zone_id}"
  name     = "ping-sync-0${count.index}.${var.domain_name}"
  type     = "A"
  ttl      = "10"
  // matches up record N to instance N
  records  = ["${element(module.ping_sync_hot_server.private_server_ips, count.index)}"]
}

resource "aws_route53_record" "ds_sync_PTR_records" {
  // same number of records as instances
  provider = "aws.dns"
  count    = 0
  // count = "${var.ping_sync_cluster_count}"
  zone_id  = "${data.aws_route53_zone.zone_company_io.zone_id}"
  name = "${format(
    "%s.%s.%s.%s.in-addr.arpa",
    element(split(".", element(module.ping_sync_hot_server.private_server_ips, count.index)), 3),
    element(split(".", element(module.ping_sync_hot_server.private_server_ips, count.index)), 2),
    element(split(".", element(module.ping_sync_hot_server.private_server_ips, count.index)), 1),
    element(split(".", element(module.ping_sync_hot_server.private_server_ips, count.index)), 0)
  )}"
  type = "PTR"
  ttl  = "10"
  // matches up record N to instance N
  records = ["${element(module.ping_sync_hot_server.private_server_ips, count.index)}"]
}
Error message on apply:
Error running plan: 3 error(s) occurred:
* element: element() may not be used with an empty list in:
${element(module.ping_sync_hot_server.private_server_ips, count.index)}
* element: element() may not be used with an empty list in:
${element(module.ping_sync_hot_server.private_server_ips, count.index)}
* element: element() may not be used with an empty list in:
${element(module.ping_sync_hot_server.private_server_ips, count.index)}
Use splat syntax (*) when the returned record is a list:
records = [
  "${element(module.ping_sync_hot_server.*.private_server_ips, count.index)}",
]
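Since element() on an empty list is what fails here, another hedged workaround from the Terraform 0.11 era is to pad the list so it is never empty; the dummy entry is never actually used, because no record instances exist to read it while count = 0:

```hcl
// Pad the module output with a placeholder so element()
// never receives an empty list during plan when count = 0.
records = ["${element(concat(module.ping_sync_hot_server.private_server_ips, list("0.0.0.0")), count.index)}"]
```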
