I'm in the process of configuring IOPS alerting on EBS volumes as we move them to GP3. The plan is to define the alarms in Terraform but shift the setting of the target to a Lambda that keeps each alarm up to date based on lifecycle changes to the ASG. For GP2 volumes I was able to get this configured cleanly, with ignore_changes on the dimensions block of each alarm, but now that I have moved to several metric_query blocks I cannot find a way to address the nested dimension config.
resource "aws_cloudwatch_metric_alarm" "foobar" {
count = length(data.aws_availability_zones.available.names)
alarm_name = "${local.env_short}_app_volume_IOPS_${data.aws_availability_zones.available.names[count.index]}"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = "5"
threshold = "2700"
alarm_description = "IOPS in breach of 90% of provisioned"
insufficient_data_actions = []
actions_enabled = "true"
datapoints_to_alarm = "5"
alarm_actions = [aws_sns_topic.app_alert.arn]
ok_actions = [aws_sns_topic.app_alert.arn]
metric_query {
id = "e1"
expression = "(m1+m2)/PERIOD(m1)"
label = "IOPSCalc"
return_data = "true"
}
metric_query {
id = "m1"
metric {
metric_name = "VolumeWriteOps"
namespace = "AWS/EBS"
period = "60"
stat = "Sum"
dimensions = {}
}
}
metric_query {
id = "m2"
metric {
metric_name = "VolumeReadOps"
namespace = "AWS/EBS"
period = "60"
stat = "Sum"
dimensions = {}
}
}
lifecycle {
ignore_changes = [metric_query.1.metric.dimensions]
}
}
I have tried various iterations of the ignore_changes block and so far have only succeeded with [metric_query], but that ignores the whole set of blocks, whereas I want to target just the metric_query.metric.dimensions piece. Anyone have any clever ideas for addressing this block?
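For reference, the coarse-grained form that does apply cleanly, at the cost of ignoring every change under the repeated block:

lifecycle {
  # Ignores drift on all metric_query blocks, not just the nested dimensions map.
  ignore_changes = [metric_query]
}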
I'm using the AWS provider to create CloudWatch metric alarms. I created a module that takes in a variable that is a list of instance IDs, and its resource uses the "count" functionality to create one alarm per instance ID from that variable.
The "aws_cloudwatch_metric_alarm" resource can take multiple "metric_query" blocks, and my plan was to write this as a dynamic block so the root module can define as many as needed.
The issue I'm experiencing is with accessing the "for_each" iterator values.
The high-level end solution should be something along these lines: use three metric_query blocks, two for available metrics and a third for an expression on top of the other two, and create this alarm for every instance provided in the instance list.
Resource definition, module code:
resource "aws_cloudwatch_metric_alarm" "alarm" {
count = length(var.dimension_values)
alarm_name = "${var.alarm_name}_${var.dimension_values[count.index]}"
comparison_operator = var.comparison_operator
evaluation_periods = var.evaluation_periods
threshold = var.threshold
actions_enabled = var.actions_enabled
alarm_actions = var.alarm_actions
dynamic "metric_query" {
for_each = var.metric_queries
content {
id = metric_queries.value.id
return_data = metric_queries.value.return_data
expression = metric_queries.value.expression
label = metric_queries.value.label
metric {
namespace = metric_queries.value.namespace
metric_name = metric_queries.value.metric_name
period = metric_queries.value.period
stat = metric_queries.value.stat
dimensions = {
"${metric_queries.value.dimension_name}" = var.dimension_values[count.index]
}
}
}
}
tags = merge(
var.common_tags,
{
Name = "${var.alarm_name}_${var.dimension_values[count.index]}"
}
)
}
Module variables (only metric_queries pasted):
variable "metric_queries" {
type = list(object({
id = string
return_data = bool
expression = string
label = string
namespace = string
metric_name = string
period = number
stat = string
dimension_name = string
}))
description = "Metric query for the CloudWatch alarm"
default = []
}
And finally, the root module:
module "cpu_alarms" {
source = "../../Modules/cloudwatch_metric_alarm/"
common_tags = local.common_tags
# Metrics
alarm_name = "EC2_CPU_80_PERCENT"
comparison_operator = "GreaterThanOrEqualToThreshold"
evaluation_periods = 3
threshold = 80
actions_enabled = true
alarm_actions = ["redacted"]
dimension_values = local.all_ec2_instance_ids
metric_queries = [
{
id = "m1"
return_data = true
expression = null
label = "CPU utilization"
namespace = "AWS/EC2"
metric_name = "CPUUtilization"
period = 60
stat = "Average"
dimension_name = "InstanceId"
}
]
}
I'm getting two separate errors with this approach, depending on how I refer to the "for_each" iterator object.
When using "each" as the reference to the iterator, the error is:
A reference to "each.value" has been used in a context in which it unavailable, such as when the configuration no longer contains the value in its "for_each" expression. Remove this reference to each.value in your configuration to work │ around this error.
When using "metric_queries" as reference to the iterator the error is:
A managed resource "metric_queries" "value" has not been declared in module.cpu_alarms.
What could be the root cause of this?
Please see the documentation on dynamic blocks. You are trying to use the syntax for the resource-level for_each meta-argument, not the syntax for dynamic blocks. It's confusing that they have different syntax, but since a dynamic block can exist inside a resource that itself uses for_each, the syntax has to differ to prevent name clashes.
For dynamic blocks, the iterator's name is what you put after the dynamic keyword, in your case "metric_query". So your code should look like this:
dynamic "metric_query" {
for_each = var.metric_queries
content {
id = metric_query.value.id
return_data = metric_query.value.return_data
expression = metric_query.value.expression
label = metric_query.value.label
metric {
namespace = metric_query.value.namespace
metric_name = metric_query.value.metric_name
period = metric_query.value.period
stat = metric_query.value.stat
dimensions = {
"${metric_query.value.dimension_name}" = var.dimension_values[count.index]
}
}
}
}
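As an aside, if the default name is confusing, dynamic blocks also accept an optional iterator argument to rename it; a minimal sketch, where "mq" is an arbitrary name:

dynamic "metric_query" {
  for_each = var.metric_queries
  iterator = mq  # optional; defaults to the block label, "metric_query"
  content {
    id          = mq.value.id
    return_data = mq.value.return_data
    expression  = mq.value.expression
    label       = mq.value.label
    # ... metric block as above, using mq.value.* references ...
  }
}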
I am adding autoscale settings to Azure Cosmos DB databases. My problem is that not all of our databases require autoscale; only a selection do, and the rest are manual. I cannot specify both the autoscale_settings block and throughput in the same resource, since the two conflict, so I thought of using count, but then I am not able to run the resource block for only some of the databases. See the example below.
variable
variable "databases" {
description = "The list of Cosmos DB SQL Databases."
type = list(object({
name = string
throughput = number
autoscale = bool
max_throughput = number
}))
default = [
{
name = "testcoll1"
throughput = 400
autoscale = false
max_throughput = 0
},
{
name = "testcoll2"
throughput = 400
autoscale = true
max_throughput = 1000
}
]
}
For the first I don't need autoscale; for the second I do. My main.tf code:
resource "azurerm_cosmosdb_mongo_database" "database_manual" {
count = length(var.databases)
name = var.databases[count.index].name
resource_group_name = azurerm_cosmosdb_account.cosmosdb.resource_group_name
account_name = local.account_name
throughput = var.databases[count.index].throughput
}
resource "azurerm_cosmosdb_mongo_database" "database_autoscale" {
count = length(var.databases)
name = var.databases[count.index].name
resource_group_name = azurerm_cosmosdb_account.cosmosdb.resource_group_name
account_name = local.account_name
autoscale_settings {
max_throughput = var.databases[count.index].max_throughput
}
}
First I thought of running two blocks, one with autoscale and one without, but I could not proceed because count requires the number up front:
count = var.autoscale_required == true ? length(var.databases) : 0
whereas in my case I only know it at the time of iteration. I have tried to use a dynamic block inside the resource but it errored out.
Update
I have switched to for_each and am able to apply the condition, but it still requires two blocks:
resource "azurerm_cosmosdb_mongo_database" "database_autoscale"
resource "azurerm_cosmosdb_mongo_database" "database_manual"
resource "azurerm_cosmosdb_mongo_database" "database_autoscale" {
for_each = {
for key, value in var.databases : key => value
if value.autoscale_required == true }
name = each.value.name
resource_group_name = azurerm_cosmosdb_account.cosmosdb.resource_group_name
account_name = local.account_name
autoscale_settings {
max_throughput = each.value.max_throughput
}
}
If I understand correctly, I think you could do what you want with a single resource: set throughput only when autoscale is false, and use a dynamic block so that autoscale_settings is rendered only when it is true:
resource "azurerm_cosmosdb_mongo_database" "database_autoscale" {
count = length(var.databases)
name = var.databases[count.index].name
resource_group_name = azurerm_cosmosdb_account.cosmosdb.resource_group_name
account_name = local.account_name
throughput = var.databases[count.index].autoscale == false ? var.databases[count.index].throughput : null
dynamic "autoscale_settings" {
for_each = var.databases[count.index].autoscale == false ? [] : [1]
content {
max_throughput = var.databases[count.index].max_throughput
}
}
}
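If you end up on for_each as in your update, the same idea carries over; a sketch keyed by database name (assuming names are unique, and using the autoscale flag from the variable above):

resource "azurerm_cosmosdb_mongo_database" "database" {
  for_each            = { for db in var.databases : db.name => db }
  name                = each.value.name
  resource_group_name = azurerm_cosmosdb_account.cosmosdb.resource_group_name
  account_name        = local.account_name
  throughput          = each.value.autoscale ? null : each.value.throughput

  dynamic "autoscale_settings" {
    for_each = each.value.autoscale ? [1] : []
    content {
      max_throughput = each.value.max_throughput
    }
  }
}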
Below is the example of a target tracking ASG policy from the Terraform docs:
resource "aws_autoscaling_policy" "example" {
# ... other configuration ...
target_tracking_configuration {
predefined_metric_specification {
predefined_metric_type = "ASGAverageCPUUtilization"
}
target_value = 40.0
}
target_tracking_configuration {
customized_metric_specification {
metric_dimension {
name = "fuga"
value = "fuga"
}
metric_name = "hoge"
namespace = "hoge"
statistic = "Average"
}
target_value = 40.0
}
}
I want to create a step scaling policy that also defines a custom metric, like in that block. I am using the code below, but I get an error saying this block does not exist.
resource "aws_autoscaling_policy" "contentworker_inbound_step_scaling_policy" {
name = "${var.host_group}-${var.stack}-step-scaling-policy"
policy_type = "StepScaling"
autoscaling_group_name = aws_autoscaling_group.contentworker_inbound_asg.name
estimated_instance_warmup = 300
step_configuration {
customized_metric_specification {
metric_dimension {
name = "test"
value = “Size”
}
metric_name = "anything"
namespace = "test"
statistic = "Average"
unit = "None"
}
step_adjustment {
adjustment_type = "PercentChangeInCapacity"
scaling_adjustment = 10
metric_interval_lower_bound = 10
metric_interval_upper_bound = 25
}
}
}
I have the custom metric working fine with the target tracking policy, but not with step scaling.
Any suggestions on how I can set up a step scaling policy for my custom metric?
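For reference, aws_autoscaling_policy has no step_configuration or customized_metric_specification block for step scaling; a step scaling policy is triggered by a CloudWatch alarm, and the custom metric lives on that alarm. A sketch of that split, with placeholder names, reusing the values from the question:

resource "aws_autoscaling_policy" "step" {
  name                      = "example-step-scaling-policy"  # placeholder name
  policy_type               = "StepScaling"
  adjustment_type           = "PercentChangeInCapacity"
  metric_aggregation_type   = "Average"
  autoscaling_group_name    = aws_autoscaling_group.contentworker_inbound_asg.name
  estimated_instance_warmup = 300

  step_adjustment {
    scaling_adjustment          = 10
    metric_interval_lower_bound = 10
    metric_interval_upper_bound = 25
  }
}

# The custom metric goes on the alarm that invokes the policy.
resource "aws_cloudwatch_metric_alarm" "step_trigger" {
  alarm_name          = "example-step-scaling-trigger"  # placeholder name
  namespace           = "test"
  metric_name         = "anything"
  statistic           = "Average"
  dimensions          = { test = "Size" }
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 10  # placeholder threshold
  period              = 60
  evaluation_periods  = 1
  alarm_actions       = [aws_autoscaling_policy.step.arn]
}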
How to use local value in variables.tf?
I need to assign a dynamic value to the threshold for two of the NetApp volume metric alerts, and I get an error: Error: Variables not allowed. Each NetApp volume has a different storage quota in GB, which is why the threshold needs to be dynamic.
NetApp Volume code:
main.tf
locals {
  iops_80 = format("%.0f", var.storage_quota_in_gb * 1.6)
}

resource "azurerm_netapp_volume" "netapp_volume" {
  name                = var.netapp_vol_name
  resource_group_name = var.resource_group_name
  location            = var.location
  account_name        = var.account_name
  pool_name           = var.pool_name
  volume_path         = var.volume_path
  service_level       = var.service_level
  subnet_id           = var.subnet_id
  storage_quota_in_gb = var.storage_quota_in_gb
  protocols           = var.protocols

  dynamic "export_policy_rule" {
    for_each = var.export_policy_rules
    content {
      rule_index        = export_policy_rule.value.rule_index
      allowed_clients   = export_policy_rule.value.allowed_clients
      protocols_enabled = export_policy_rule.value.protocols_enabled
      unix_read_only    = export_policy_rule.value.unix_read_only
      unix_read_write   = export_policy_rule.value.unix_read_write
    }
  }

  tags = var.tags
}
resource "azurerm_monitor_metric_alert" "alert" {
depends_on = [azurerm_netapp_volume.netapp_volume]
count = length(var.criteria)
name = "HPG-ALRT-${var.netapp_vol_name}-001-${element(keys(var.criteria), count.index)}"
resource_group_name = var.resource_group_name
scopes = [azurerm_netapp_volume.netapp_volume.id]
enabled = var.enabled
auto_mitigate = var.auto_mitigate
description = lookup(var.criteria, element(keys(var.criteria), count.index), null)["description"]
frequency = var.frequency
severity = lookup(var.criteria, element(keys(var.criteria), count.index), null)["severity"]
window_size = var.window_size
criteria {
metric_namespace = lookup(var.criteria, element(keys(var.criteria), count.index), null)["metric_namespace"]
metric_name = lookup(var.criteria, element(keys(var.criteria), count.index), null)["metric_name"]
aggregation = lookup(var.criteria, element(keys(var.criteria), count.index), null)["aggregation"]
operator = lookup(var.criteria, element(keys(var.criteria), count.index), null)["operator"]
threshold = lookup(var.criteria, element(keys(var.criteria), count.index), null)["threshold"]
}
action {
action_group_id = var.action_group_id
}
}
variables.tf
variable "criteria" {
type = map
default = {
"ReadLATENCY5" = {
metric_namespace = "Microsoft.NetApp/netAppAccounts/capacityPools/volumes"
metric_name = "AverageReadLatency"
aggregation = "Average"
operator = "GreaterThan"
threshold = 5
description = "NetApp: Volume Read Latency over 5ms"
severity = 2
},
"ReadIOPS80" = {
metric_namespace = "Microsoft.NetApp/netAppAccounts/capacityPools/volumes"
metric_name = "ReadIops"
aggregation = "Average"
operator = "GreaterThan"
threshold = local.iops_80
description = "NetApp: Volume Read IOPS over TBD"
severity = 2
},
"WriteIops80" = {
metric_namespace = "Microsoft.NetApp/netAppAccounts/capacityPools/volumes"
metric_name = "WriteIops"
aggregation = "Average"
operator = "GreaterThan"
threshold = local.iops_80
description = "NetApp: Volume Write IOPS over TBD"
severity = 2
},
}
}
One way would be to create another criteria map that defines only the alerts using the iops_80 value and assign it in main.tf, but is there any other way to do it?
It seems you cannot use local values in the variables file: a variable's default must be a constant. What you can do is use variables inside locals, and then use both the locals and the variables in the resource block. You can also reference one local value from another local.
So use variables to set the raw inputs, and derive the pieces that change in locals: reference the variables in a local, or reference one local in another. For example, you could use a local to set the criteria instead of a variable.
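A minimal sketch of that suggestion (trimmed to two of the three entries), moving the criteria map into locals so it can reference the derived threshold:

locals {
  iops_80 = format("%.0f", var.storage_quota_in_gb * 1.6)

  criteria = {
    "ReadLATENCY5" = {
      metric_namespace = "Microsoft.NetApp/netAppAccounts/capacityPools/volumes"
      metric_name      = "AverageReadLatency"
      aggregation      = "Average"
      operator         = "GreaterThan"
      threshold        = 5
      description      = "NetApp: Volume Read Latency over 5ms"
      severity         = 2
    }
    "ReadIOPS80" = {
      metric_namespace = "Microsoft.NetApp/netAppAccounts/capacityPools/volumes"
      metric_name      = "ReadIops"
      aggregation      = "Average"
      operator         = "GreaterThan"
      threshold        = local.iops_80  # locals may reference other locals
      description      = "NetApp: Volume Read IOPS over TBD"
      severity         = 2
    }
  }
}

The alert resource would then read local.criteria everywhere it currently reads var.criteria, for example count = length(local.criteria).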
I have this configuration:
variable "sub_list" {
  type = "map"
  default = {
    "data.dev"  = ["data1", "data2", "data3", "data4"]
    "data.dev2" = ["data1", "data2", "data3", "data4"]
  }
}

resource "random_shuffle" "az" {
  input        = "${var.sub_list[local.data]}"
  result_count = "${length(var.VM_count)}"
}

data "vsphere_sub" "sub" {
  count = "${length(var.VM_count)}"
  name  = "${random_shuffle.az.result[count.index]}"
}

resource "vsphere_virtual_machine" "VM" {
  name     = "${var.VM_name}"
  folder   = "${var.folder}"
  count    = "${length(var.VM_count)}"
  sub_id   = "${element(data.vsphere_sub.sub.*.id, (count.index) % length(data.vsphere_sub.sub.id))}"
  num_cpus = "${var.VM_vcpu}"
  memory   = "${var.VM_memory}"
}
When I launch with VM_count=2, for example, I expect a subnet for every VM, but it creates the 2 VMs in the same subnet and shuffles only once, not twice. How can we randomly select an item from a map based on the number of VMs to be created?
Thank you for your help.
A couple of issues. You cannot take the length of a number, so this
count = length(var.VM_count)
should be
count = var.VM_count
I'm unsure what this line's intention is, but this
sub_id = element(data.vsphere_sub.sub.*.id, (count.index) % length(data.vsphere_sub.sub.id))
should be this if we want a different subnet per VM:
sub_id = element(data.vsphere_sub.sub.*.id, count.index)
so the final result would be:
resource "random_shuffle" "az" {
input = "${var.sub_list[local.data]}"
result_count = "${var.VM_count}"
}
data "vsphere_sub" "sub" {
count = "${var.VM_count}"
name = "${random_shuffle.az.result[count.index]}"
}
resource "vsphere_virtual_machine" "VM" {
name = "${var.VM_name}"
folder = "${var.folder}"
count = "${var.VM_count}"
sub_id = "${element(data.vsphere_sub.sub.*.id, count.index)}"
num_cpus = "${var.VM_vcpu}"
memory = "${var.VM_memory}"
}
Now when you apply with VM_count=2, it should grab 2 random subnets from sub_list and create 2 VMs, each in a different subnet.
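This assumes VM_count is declared as a plain number rather than a list, e.g.:

variable "VM_count" {
  default = 2  # a number: length() on it fails, but count = var.VM_count works
}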