GKE w/ Terraform - set autoscaling_profile - terraform

I have a scale-down issue on my GKE cluster and found out that the right configuration can solve it.
According to the Terraform documentation I can use the argument autoscaling_profile and set it to OPTIMIZE_UTILIZATION, like so:
resource "google_container_cluster" "k8s_cluster" {
[...]
cluster_autoscaling {
enabled = true
autoscaling_profile = "OPTIMIZE_UTILIZATION"
resource_limits {
resource_type = "cpu"
minimum = 1
maximum = 4
}
resource_limits {
resource_type = "memory"
minimum = 4
maximum = 16
}
}
}
But I got this error:
Error: Unsupported argument

  on modules/gke/main.tf line 70, in resource "google_container_cluster" "k8s_cluster":
  70: autoscaling_profile = "OPTIMIZE_UTILIZATION"

An argument named "autoscaling_profile" is not expected here.
I don't get it.

TL;DR
Add the parameter below to the definition of your resource (at the top):
provider = google-beta
More explanation:
autoscaling_profile, as shown in the documentation, is a beta feature. This means it needs to use a different provider: google-beta.
You can read more about it in the official documentation:
Terraform.io: Using the google-beta provider
Focusing on the most important parts of the docs above:
How to use it:
To use the google-beta provider, simply set the provider field on each resource where you want to use google-beta.
resource "google_compute_instance" "beta-instance" {
provider = google-beta
# ...
}
Disclaimer about usage of google and google-beta:
If the provider field is omitted, Terraform will implicitly use the google provider by default even if you have only defined a google-beta provider block.
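For example, both providers can be configured side by side, and resources then opt in to the beta provider individually. The project and region values below are placeholders for illustration, not from the question:
# The google provider is still used for every resource without "provider" set.
provider "google" {
  project = "my-project-id" # placeholder
  region  = "europe-west1"  # placeholder
}

# The google-beta provider is used only by resources that set provider = google-beta.
provider "google-beta" {
  project = "my-project-id" # placeholder
  region  = "europe-west1"  # placeholder
}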
Putting it all together, your GKE cluster definition should look like this:
resource "google_container_cluster" "k8s_cluster" {
[...]
provider = google-beta # <- HERE IT IS
cluster_autoscaling {
enabled = true
autoscaling_profile = "OPTIMIZE_UTILIZATION"
resource_limits {
resource_type = "cpu"
minimum = 1
maximum = 4
}
resource_limits {
resource_type = "memory"
minimum = 4
maximum = 16
}
}
}
You will also need to run the following so Terraform can install the google-beta provider plugin:
$ terraform init

Related

Why am I getting 'Unsupported argument' errors in my main.tf file?

I have a main.tf file with the following code block:
module "login_service" {
source = "/path/to/module"
name = var.name
image = "python:${var.image_version}"
port = var.port
command = var.command
}
# Other stuff below
I've defined a variables.tf file as follows:
variable "name" {
type = string
default = "login-service"
description = "Name of the login service module"
}
variable "command" {
type = list(string)
default = ["python", "-m", "LoginService"]
description = "Command to run the LoginService module"
}
variable "port" {
type = number
default = 8000
description = "Port number used by the LoginService module"
}
variable "image" {
type = string
default = "python:3.10-alpine"
description = "Image used to run the LoginService module"
}
Unfortunately, I keep getting this error when running terraform plan.
Error: Unsupported argument
│
│ on main.tf line 4, in module "login_service":
│ 4: name = var.name
│
│ An argument named "name" is not expected here.
This error repeats for the other variables. I've done a bit of research and read the terraform documentation on variables, and read other stack overflow answers but I haven't really found a good answer to the problem.
Any help appreciated.
A Terraform module block is only for referring to a Terraform module; it doesn't support any other kind of module. Terraform modules are a means for reusing Terraform declarations across many configurations, but they must themselves be written in the Terraform language.
Therefore in order for this to be valid you must have at least one .tf file in /path/to/module that declares the variables that you are trying to pass into the module.
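For example, the module directory would need variable declarations matching the arguments you pass in. The file name and declarations below are assumptions inferred from the module block in the question, not actual module code:
# /path/to/module/variables.tf (sketch)
variable "name" {
  type = string
}

variable "image" {
  type = string
}

variable "port" {
  type = number
}

variable "command" {
  type = list(string)
}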
From what you've said it seems like there's a missing step in your design: you are trying to declare something in Kubernetes using Terraform, but the configuration you've shown here doesn't include anything which would tell Terraform to interact with Kubernetes.
A typical way to manage Kubernetes objects with Terraform is using the hashicorp/kubernetes provider. A Terraform configuration using that provider would include a declaration of the dependency on that provider, the configuration for that provider, and at least one resource block declaring something that should exist in your Kubernetes cluster:
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

provider "kubernetes" {
  host = "https://example.com/" # URL of your Kubernetes API
  # ...
}

# For example only, a kubernetes_deployment resource
# that declares one Kubernetes deployment.
# In practice you can use any resource type from this
# provider, depending on what you want to declare.
resource "kubernetes_deployment" "example" {
  metadata {
    name = "terraform-example"
    labels = {
      test = "MyExampleApp"
    }
  }

  spec {
    replicas = 3

    selector {
      match_labels = {
        test = "MyExampleApp"
      }
    }

    template {
      metadata {
        labels = {
          test = "MyExampleApp"
        }
      }

      spec {
        container {
          image = "nginx:1.21.6"
          name  = "example"

          resources {
            limits = {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests = {
              cpu    = "250m"
              memory = "50Mi"
            }
          }

          liveness_probe {
            http_get {
              path = "/"
              port = 80

              http_header {
                name  = "X-Custom-Header"
                value = "Awesome"
              }
            }

            initial_delay_seconds = 3
            period_seconds        = 3
          }
        }
      }
    }
  }
}
Although you can arrange resources into separate modules in Terraform if you wish, I would suggest focusing on learning to describe resources directly in Terraform first, and then, once you are confident with that, learning about techniques for code reuse with Terraform modules.

Configure High Availability conditionally based on variable for `azurerm_postgresql_flexible_server`

I'm configuring my servers with Terraform. For non-prod environments our SKU doesn't allow for high availability, but in prod our SKU does.
For some reason high_availability.mode only accepts the value "ZoneRedundant" and no other value (according to the documentation). Depending on whether var.isProd is true, I want to turn high availability on or off, but how would I do that?
resource "azurerm_postgresql_flexible_server" "default" {
name = "example-${var.env}-postgresql-server"
location = azurerm_resource_group.default.location
resource_group_name = azurerm_resource_group.default.name
version = "14"
administrator_login = "sqladmin"
administrator_password = random_password.postgresql_server.result
geo_redundant_backup_enabled = var.isProd
backup_retention_days = var.isProd ? 60 : 7
storage_mb = 32768
high_availability {
mode = "ZoneRedundant"
}
sku_name = var.isProd ? "B_Standard_B2s" : "B_Standard_B1ms"
}
I believe the default for this resource is HA disabled, so it is not the mode argument that manages HA but rather the presence of the high_availability block. Therefore you can manage HA by omitting the block to accept the default (disabled), or by including the block to enable HA with a value of ZoneRedundant:
dynamic "high_availability" {
for_each = var.isProd ? ["this"] : []
content {
mode = "ZoneRedundant"
}
}
I am hypothesizing somewhat about the API endpoint parameter defaults, so this would need to be acceptance tested with an Azure account. However, the documentation for Azure Postgres Flexible Server in general claims HA is in fact disabled by default, so this should function as desired.
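For completeness, here is a sketch of how the dynamic block would sit inside the resource from the question (everything outside the block is unchanged from the question, and this is untested against Azure):
resource "azurerm_postgresql_flexible_server" "default" {
  name                         = "example-${var.env}-postgresql-server"
  location                     = azurerm_resource_group.default.location
  resource_group_name          = azurerm_resource_group.default.name
  version                      = "14"
  administrator_login          = "sqladmin"
  administrator_password       = random_password.postgresql_server.result
  geo_redundant_backup_enabled = var.isProd
  backup_retention_days        = var.isProd ? 60 : 7
  storage_mb                   = 32768
  sku_name                     = var.isProd ? "B_Standard_B2s" : "B_Standard_B1ms"

  # The HA block is only generated when isProd is true;
  # otherwise the provider default (disabled) applies.
  dynamic "high_availability" {
    for_each = var.isProd ? ["this"] : []
    content {
      mode = "ZoneRedundant"
    }
  }
}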
If you deploy a flexible server using the azurerm provider in Terraform, it will accept only the ZoneRedundant value for high_availability.mode, and with this provider you can deploy only PostgreSQL server versions 11, 12, and 13.
Using the AzAPI provider in Terraform, you can set highAvailability.mode to any of Disabled, ZoneRedundant, or SameZone.
Based on your requirement, I have created the sample Terraform script below. It has an environment variable that accepts only prod or non-prod values; based on this value, the flexible server is deployed with the respective properties.
If the environment value is prod, the script deploys the flexible server with high availability set to zone redundant, backup retention of 35 days, and geo-redundant backup enabled.
If the environment value is non-prod, the script deploys the flexible server with high availability disabled, backup retention of 7 days, and geo-redundant backup disabled.
Here is the Terraform Script:
terraform {
  required_providers {
    azapi = {
      source = "azure/azapi"
    }
  }
}

provider "azapi" {
}

variable "environment" {
  type = string
  validation {
    condition     = anytrue([var.environment == "prod", var.environment == "non-prod"])
    error_message = "You haven't defined any of the allowed values."
  }
}

resource "azapi_resource" "rg" {
  type      = "Microsoft.Resources/resourceGroups@2021-04-01"
  name      = "teststackhub"
  location  = "eastus"
  parent_id = "/subscriptions/<subscriptionId>"
}

resource "azapi_resource" "test" {
  type      = "Microsoft.DBforPostgreSQL/flexibleServers@2022-01-20-preview"
  name      = "example-${var.environment}-postgresql-server"
  location  = azapi_resource.rg.location
  parent_id = azapi_resource.rg.id

  body = jsonencode({
    properties = {
      administratorLogin         = "azureuser"
      administratorLoginPassword = "<password>"
      backup = {
        backupRetentionDays = var.environment == "prod" ? 35 : 7
        geoRedundantBackup  = var.environment == "prod" ? "Enabled" : "Disabled"
      }
      storage = {
        storageSizeGB = 32
      }
      highAvailability = {
        mode = var.environment == "prod" ? "ZoneRedundant" : "Disabled"
      }
      version = "14"
    }
    sku = {
      name = var.environment == "prod" ? "Standard_B2s" : "Standard_B1ms"
      tier = "Burstable" # B-series SKUs belong to the Burstable tier
    }
  })
}
NOTE: The above Terraform sample script is for your reference; please adjust it to your business requirements.
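To pick the environment at deploy time, you can pass the value on the command line, for example:
$ terraform init
$ terraform plan -var="environment=prod"
$ terraform apply -var="environment=prod"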

Terraform apply making changes to imported resource in state when resource has not changed

I have the following config:
# Configure the Azure provider
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.25.0"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "1.4.0"
    }
  }
}

provider "azurerm" {
  alias           = "uat-sub"
  features {}
  subscription_id = "sfsdf"
}

provider "databricks" {
  host  = "https://abd-1234.azuredatabricks.net"
  token = "sdflkjsdf"
  alias = "dev-dbx-provider"
}

resource "databricks_cluster" "dev_cluster" {
  cluster_name  = "xyz"
  spark_version = "10.4.x-scala2.12"
}
I am able to successfully import databricks_cluster.dev_cluster. Once imported, I update my config to output a value from the cluster in state. The updated config looks like this:
# Configure the Azure provider
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.25.0"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "1.4.0"
    }
  }
}

provider "azurerm" {
  alias           = "uat-sub"
  features {}
  subscription_id = "sfsdf"
}

provider "databricks" {
  host  = "https://abd-1234.azuredatabricks.net"
  token = "sdflkjsdf"
  alias = "dev-dbx-provider"
}

resource "databricks_cluster" "dev_cluster" {
  cluster_name  = "xyz"
  spark_version = "10.4.x-scala2.12"
}

output "atm" {
  value = databricks_cluster.dev_cluster.autotermination_minutes
}
When I run terraform apply on the updated config, Terraform proceeds to refresh my imported cluster, detects changes, and does an 'update in-place' where some of the values on my cluster are set to null (autoscale/pyspark_env etc.). All this happens when no changes are actually being made on the cluster. Why is this happening? Why is Terraform resetting some values when no changes have been made?
EDIT- 'terraform plan' output:
C:\Users\>terraform plan
databricks_cluster.dev_cluster: Refreshing state... [id=gyht]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # databricks_cluster.dev_cluster will be updated in-place
  ~ resource "databricks_cluster" "dev_cluster" {
      ~ autotermination_minutes = 10 -> 60
      - data_security_mode      = "NONE" -> null
        id                      = "gyht"
      ~ spark_env_vars          = {
          - "PYSPARK_PYTHON" = "/databricks/python3/bin/python3" -> null
        }
        # (13 unchanged attributes hidden)

      - autoscale {
          - max_workers = 8 -> null
          - min_workers = 2 -> null
        }
      - cluster_log_conf {
          - dbfs {
              - destination = "dbfs:/cluster-logs" -> null
            }
        }
        # (2 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.
EDIT - Workaround with hard-coded tags:
resource "databricks_cluster" "dev_cluster" {
cluster_name = "xyz"
spark_version = "10.4.x-scala2.12"
autotermination_minutes = 10
data_security_mode = "NONE"
autoscale {
max_workers = 8
min_workers = 2
}
cluster_log_conf {
dbfs {
destination = "dbfs:/cluster-logs"
}
}
spark_env_vars = {
PYSPARK_PYTHON = "/databricks/python3/bin/python3"
}
}
The workaround partially works, as I no longer see Terraform trying to reset the tags on every apply. But if I were to change any of the tags on the cluster, let's say I change max workers to 5, Terraform will not update state to reflect 5 workers. TF will override 5 with the hard-coded 8, which is an issue.
To answer the first part of your question: Terraform has imported the actual values of your cluster into the state file, but it cannot import those values into your config file (.tf) for you, so you need to specify them manually (as you have done).
By not setting the optional fields, you are effectively saying "set those fields to the default value", which in most cases is null (with the exception of the autotermination_minutes field, which has a default of 60). That is why Terraform detects a drift between your state and your config (the actual values from the import vs. the default values of the unspecified fields).
For reference: https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster
For the second part of your question, you say:
let's say I change max workers to 5, Terraform will not update state to reflect 5 workers.
If you mean you change the max workers from outside of Terraform, then Terraform is designed to override that field when you run terraform apply. When working with Terraform, if you want to make a change to your infrastructure, you always make the change in your Terraform config and run terraform apply to apply it for you.
So in your case if you wanted to change the max_workers to 5, you would set that value in the terraform config and run terraform apply. You would not do it from within Databricks. If that behaviour is problematic I would question whether you want to manage that resource with Terraform, as that is always how Terraform will work.
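If that behaviour is a problem but you still want Terraform to manage the cluster, one option is to tell Terraform to ignore drift on specific attributes with the ignore_changes lifecycle argument. This is just a sketch, not something from your original config:
resource "databricks_cluster" "dev_cluster" {
  cluster_name  = "xyz"
  spark_version = "10.4.x-scala2.12"

  # Terraform will ignore drift on these attributes/blocks
  # instead of trying to reconcile them on every apply.
  lifecycle {
    ignore_changes = [
      autoscale,
      spark_env_vars,
    ]
  }
}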
Hope that helps!
Regarding the max_workers changes: assuming you have a var.tf file and have declared variable "max" { default = 8 } in it, you can override that value explicitly by providing the required value when planning, such as terraform plan -var="max=5", and check it in the plan output.
:)
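For illustration (the file layout and the way the variable is wired into the cluster are assumptions, not taken from the original config):
# var.tf
variable "max" {
  type    = number
  default = 8
}

# main.tf
resource "databricks_cluster" "dev_cluster" {
  cluster_name  = "xyz"
  spark_version = "10.4.x-scala2.12"

  autoscale {
    min_workers = 2
    max_workers = var.max # override at plan/apply time with -var="max=5"
  }
}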

AWS Backup cross-region copy support in Terraform

Does Terraform support the AWS Backup cross-region copy feature (https://www.terraform.io/docs/providers/aws/r/backup_plan.html)?
Reading the documentation, it looks like it does.
But I get the following error:
Error: Unsupported argument
on backup_plan.tf line 11, in resource "aws_backup_plan" "example":
11: copy_action = {
An argument named "copy_action" is not expected here.
My Terraform file, for your reference:
resource "aws_backup_plan" "example" {
name = "example-plan"
rule {
rule_name = "MainRule"
target_vault_name = "primary"
schedule = "cron(5 8 * * ? *)"
start_window = 480
completion_window = 10080
lifecycle {
delete_after = 30
}
copy_action {
destination_vault_arn = "arn:aws:backup:us-west-2:123456789:backup-vault:secondary"
}
}
}
But when I remove the block
copy_action {
  destination_vault_arn = "arn:aws:backup:us-west-2:123456789:backup-vault:secondary"
}
it works just fine.
Thanks
I assume you are running a version of the Terraform AWS provider that is 2.57.0 or older.
Version 2.58.0 (released 3 days ago) brought support for copy_action:
resource/aws_backup_plan: Add rule configuration block copy_action configuration block (support cross region copy)
You can require at least this version in your code as follows:
provider "aws" {
version = "~> 2.58.0"
}
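On Terraform 0.13 and later, the provider version constraint is normally declared in a required_providers block instead of inside the provider block; a sketch (adjust the constraint as needed):
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 2.58.0"
    }
  }
}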

Specify "dotted" k8s labels for node pools?

Kubernetes supports dots in metadata label keys (for example app.role), and indeed this seems to be a common convention.
The Terraform configuration language (0.12) doesn't support dots in argument names, so labels of this form cannot be specified. For example, in a google_container_node_pool configuration I want to specify this:
resource "google_container_node_pool" "my-node-pool" {
...
labels = {
app.role = web
}
}
Is there a workaround?
note: slashes (/) are quite common in k8s labels as well.
UPDATE: in case anyone stumbles on this same issue down the road, I figured out the root of my issue. I had incorrectly specified the labels argument as a block by omitting the =. So it looked like this:
labels {
  "app.role" = "web"
}
This yielded the following error, which pointed me in the wrong direction:
Error: Invalid argument name
on main.tf line 45, in resource "google_container_node_pool" "primary_preemptible_nodes":
45: "app.role" = "web"
Argument names must not be quoted.
I noticed and fixed the missing =, but I didn't put it together that map keys follow different syntax rules than argument names.
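In short, the two syntaxes behave differently. A sketch for illustration (the "team/name" label is an assumed example, not from the original config):
# Block syntax (no "="): argument names must be bare identifiers,
# so quoted or dotted keys are rejected.
# labels {
#   "app.role" = "web"
# }

# Argument syntax (with "="): the value is a map, so keys are ordinary
# strings and may contain dots and slashes.
labels = {
  "app.role"  = "web"
  "team/name" = "platform"
}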
I verified the suggestion from @ydaetskcoR that wrapping the label in quotes works. Here is the snippet defining the node pool that I created (using Terraform v0.11.13):
resource "google_container_node_pool" "node_pool" {
cluster = "${google_container_cluster.cluster.name}"
zone = "${var.cluster_location}"
initial_node_count = "${var.node_count}"
autoscaling {
min_node_count = 1
max_node_count = 5
}
management {
auto_repair = true
auto_upgrade = true
}
node_config {
machine_type = "${var.machine_type}"
oauth_scopes = [
"https://www.googleapis.com/auth/logging.write",
"https://www.googleapis.com/auth/monitoring",
"https://www.googleapis.com/auth/devstorage.read_only",
]
metadata {
disable-legacy-endpoints = "true"
}
labels = {
"app.role" = "web"
}
}
}
edit: I also verified that the same works with terraform 0.12.3.
