Terraform import failing because Kubernetes provider relies on data sources

Using Terraform, I'm creating a Kubernetes cluster, installing the nginx-ingress-controller Helm chart, and then adding a Route53 hosted zone for my domain (including a wildcard record pointing to the load balancer created by the ingress Helm chart).
To do this I use two separate Terraform files, and my process should be as follows -
Use Terraform file 1 to apply a VPC, EKS cluster and node group.
Use the Helm CLI to install the nginx-ingress-controller chart (there is an additional requirement not related to this issue that means the Helm chart cannot be installed by Terraform).
Import the namespace to which the nginx-ingress-controller chart was deployed into the state for Terraform file 2 (a sketch of the import target is shown after this list).
Use Terraform file 2 to apply the Route53 hosted zone and record required for ingress.
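For step 3, a minimal sketch of the import target in Terraform file 2 might look like this (the resource name and the "ingress-nginx" namespace name are assumptions; use whatever namespace the chart was actually installed into):
resource "kubernetes_namespace" "ingress" {
  metadata {
    name = "ingress-nginx" # assumed namespace name
  }
}
# The namespace would then be imported with something like:
#   terraform import kubernetes_namespace.ingress ingress-nginx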
I thought this was going to work, but the Terraform import command has a severe limitation -
The only limitation Terraform has when reading the configuration files is that the import provider configurations must not depend on non-variable inputs. For example, a provider configuration cannot depend on a data source.
Because I'm using a Kubernetes provider that relies on data sources, I'm falling foul of this limitation.
data "aws_eks_cluster" "cluster" {
name = var.cluster.name
}
data "aws_eks_cluster_auth" "cluster" {
name = var.cluster.name
}
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
}
Taking the following into account, is there a way of using Terraform that would work?
I need to output the value of the NS record for the Route53 hosted zone created by the apply of Terraform file 2 (see the sketch after this list). Because of this, I can't include those resources in Terraform file 1, as the apply would fail if an output exists for a module/resource that won't yet exist.
The namespace must be imported so that it is destroyed when destroy is run for Terraform file 2. If it is not, when destroy is run for Terraform file 1 it will fail because the VPC won't be able to delete due to the network interface and security group created by the nginx-ingress-controller Helm chart.
The token provided by the aws_eks_cluster_auth data source only lasts for 15 minutes (the aws-iam-authenticator cannot provide a longer token), so it is inappropriate to output the token from Terraform file 1 because it's likely to have expired by the time it is used by Terraform file 2.
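To illustrate the first point, the relevant part of Terraform file 2 would look roughly like this (the resource and variable names are my own):
resource "aws_route53_zone" "this" {
  name = var.domain_name # hypothetical variable holding the domain
}
output "name_servers" {
  # The NS records for the hosted zone, needed outside this configuration.
  value = aws_route53_zone.this.name_servers
}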
Update
I've tried to use an exec-based credential plugin, as that means no data source would be required, but this causes Terraform to fail right out of the gate. It seems that in this case Terraform tries to build the Kubernetes client config before module.kubernetes-cluster has been created, so the cluster doesn't exist yet.
This provider configuration -
provider "kubernetes" {
host = module.kubernetes-cluster.endpoint
insecure = true
exec {
api_version = "client.authentication.k8s.io/v1alpha1"
args = ["eks", "get-token", "--region", var.cluster.region, "--cluster-name", var.cluster.name]
command = "aws"
}
}
Produces this error -
╷
│ Error: Provider configuration: cannot load Kubernetes client config
│
│ with provider["registry.terraform.io/hashicorp/kubernetes"],
│ on main.tf line 73, in provider "kubernetes":
│ 73: provider "kubernetes" {
│
│ invalid configuration: default cluster has no server defined
╵

I got this error when updating my google_container_cluster in GKE with a new network configuration. To resolve it, I had to revert to the old network configuration, remove the provider "kubernetes" block, add the new network configuration, and then add the provider "kubernetes" block back.

Related

Terraform resource with the ID already exists error when running Azure DevOps pipeline with Github CICD

I was following the tutorial in this video: https://www.youtube.com/watch?v=Ff0DoAmpv6w&t=5905s (Azure DevOps: Provision API Infrastructure using Terraform).
This is his github code, and mine was very similar: https://github.com/binarythistle/S03E03---Azure-Devops-and-Terraform
The problem is that when the resource group doesn't exist on Azure (say I deleted it manually), running the pipeline creates it, as well as my container instance. But when I execute the pipeline again, after committing a code change and pushing to GitHub, it shows:
azurerm_resource_group.rg: Creating...
╷
│ Error: A resource with the ID "/subscriptions/xxxxx/resourceGroups/xxx" already exists
│
│ with azurerm_resource_group.rg,
│ on main.tf line 30, in resource "azurerm_resource_group" "rg":
│ 30: resource "azurerm_resource_group" "rg" {
Shouldn't it remember that a resource HAS been created before and skip this step - or perform some other action?
My observations
It looked like on the first run the log showed some extra steps being executed:
azurerm_resource_group.rg: Refreshing state... [id=/subscriptions/xxxxx/resourceGroups/xxx]
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the
last "terraform apply" which may have affected this plan:
# azurerm_resource_group.rg has been deleted
- resource "azurerm_resource_group" "rg" {
id = "/subscriptions/xxxxx/resourceGroups/xxx"
- location = "australiaeast" -> null
- name = "myTFResourceGroup" -> null
}
Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.
The second time the pipeline ran, the log showed:
Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.
but then the error mentioned above was shown.
Delete the local .terraform folder to clear the cache, then run terraform init again and retry running the pipeline.
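In other words, from the Terraform working directory (a minimal sketch, assuming a Linux shell on the agent):
rm -rf .terraform
terraform init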

shares.Client#GetProperties: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded

I've deployed a new file share on a storage account I have in Azure, and ever since I did that I am no longer able to run terraform plan; instead I get the following error:
azurerm_storage_account_customer_managed_key.this[0]: Refreshing state... [id=/subscriptions/**********/resourceGroups/myrg/providers/Microsoft.Storage/storageAccounts/myaccount]
╷
│ Error: shares.Client#GetProperties: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded
│
│ with azurerm_storage_share.this["share1"],
│ on main.tf line 155, in resource "azurerm_storage_share" "this":
│ 155: resource "azurerm_storage_share" "this" {
│
╵
Destroy False detailedExitCode: 1
Error detected by Terraform
##[error]Script failed with exit code: 1
I've tried setting the storage account networking to public (Enable from all networks) and still the same.
I've tried different Terraform versions (1.2.6, 1.0.4, 1.2.7, 1.2.0) - same outcome.
I've looked it up online and came up with these two tickets that seem similar but have yet to receive an answer (though they are not from Stack Overflow):
https://github.com/hashicorp/terraform-provider-azurerm/issues/17851
https://github.com/hashicorp/terraform-provider-azurerm/issues/2977
I have run out of leads to investigate at the moment, and I'd appreciate it if someone has new ideas as to what's causing the issue.
Let me know if I can share more information.
In my case I got a similar kind of error when I had not cleaned up the state file, which still contained a resource that is not present in the Azure portal (I had manually deleted it in the portal, but it was still present in the Terraform state file). I removed from the state the resources that are not present in the Azure portal and then tried to execute the same again.
Or, if some resources are present in Azure but not in the Terraform state file, make sure to import them using terraform import <terraform_id> <resource_id> (see the azurerm_resource_group page on the Terraform Registry, or this SO ref), checking for mismatches between the resources in the Azure portal and the terraform.tfstate file.
For example Resource Groups can be imported as:
terraform import azurerm_resource_group.example /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/example
After making sure the state file matches the resources present in the portal, try executing again.
I tried this in my environment and was able to execute terraform plan and terraform apply with the following configuration:
provider "azurerm" {
features {}
}
resource "azurerm_resource_group" "example" {
name = "xxxxxxx"
location = "westus2"
}
resource "azurerm_storage_account" "test" {
name = "acctestacc1234"
resource_group_name = azurerm_resource_group.example.name
location = "westus2"
account_tier = "Standard"
account_replication_type = "LRS"
}
resource "azurerm_storage_share" "test" {
name = "testkaaccount"
storage_account_name = azurerm_storage_account.test.name
quota = 5
access_tier = "TransactionOptimized"
}
Also please check that the location is given correctly, referring to the resource group location, and if the resources are created for a VM, make sure the VM and the created resources are in the same location.
References:
azurerm_storage_share_file | Resources | hashicorp/azurerm | Terraform Registry
Terraform Azure Configure VM Backup Policy Fails - Stack Overflow
So Terraform uses private URLs for management of the file share. In our case, DNS resolution of these endpoints was not working correctly. You can get the URL of the private endpoint by running terraform console and then inspecting the resource azurerm_storage_share.file_share; it will show the private URL. Subsequently, use the nslookup or dig command to determine whether you can resolve the URL to an IP address. If you are not able to resolve the URL, there are several options. For example, you could add the entries to your /etc/hosts file, which solved it in our case. Another option is to add the private link to a private DNS zone and forward your local DNS to this private zone.
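For example, a quick check (using the storage account name from the question's output and the standard <account>.file.core.windows.net file endpoint):
nslookup myaccount.file.core.windows.net
If that does not resolve to an IP address, DNS for the private endpoint is the likely culprit.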

How to fix Provider configuration not present error in Terraform plan

I have moved my Terraform configuration from one Git repo to another.
Then I ran terraform init and it completed successfully.
When I run terraform plan, I get the issue below.
terraform plan
╷
│ Error: Provider configuration not present
│
│ To work with data.aws_acm_certificate.cloudfront_wildcard_product_env_certificate its original provider
│ configuration at provider["registry.terraform.io/hashicorp/aws"].cloudfront-acm-us-east-1 is required, but it
│ has been removed. This occurs when a provider configuration is removed while objects created by that provider
│ still exist in the state. Re-add the provider configuration to destroy
│ data.aws_acm_certificate.cloudfront_wildcard_product_env_certificate, after which you can remove the provider
│ configuration again.
The data source looks like this:
data "aws_acm_certificate" "cloudfront_wildcard_product_env_certificate" {
provider = aws.cloudfront-acm-us-east-1
domain = "*.${var.product}.${var.environment}.xyz.com"
statuses = ["ISSUED"]
}
After further research I found that by removing the line below, it works as expected.
provider = aws.cloudfront-acm-us-east-1
I'm not sure what the reason is.
It appears that you were using a multi-provider configuration in the former repo. I.e. you were probably using one provider block like
provider "aws" {
region = "some-region"
access_key = "..."
secret_key = "..."
}
and a second like
provider "aws" {
alias = "cloudfront-acm-us-east-1"
region = "us-east-1"
access_key = "..."
secret_key = "..."
}
Such a setup can be used if you need to create or access resources in multiple regions or multiple accounts.
Terraform will use the first provider by default to create resources (or to lookup in case of data sources) if there is no provider specified in the resource block or data source block.
With the provider argument in
data "aws_acm_certificate" "cloudfront_wildcard_product_env_certificate" {
provider = aws.cloudfront-acm-us-east-1
domain = "*.${var.product}.${var.environment}.xyz.com"
statuses = ["ISSUED"]
}
you tell Terraform to use a specific provider.
I assume you did not move the second provider config to the new repo, but you still tell Terraform to use a specific provider which is not there. By removing the provider argument, Terraform will use the default provider for aws.
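In other words, with the provider argument removed, the data source from the question simply becomes:
data "aws_acm_certificate" "cloudfront_wildcard_product_env_certificate" {
  domain   = "*.${var.product}.${var.environment}.xyz.com"
  statuses = ["ISSUED"]
}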
Further possible reason for this error message
Just for completeness:
The same error message can appear also in a slightly different setting, where you have a multi-provider config with resources created via the second provider. If you now remove the resource config of these resources from the Terraform config and at the same time remove the specific provider config, then Terraform will not be able to destroy the resources via the specific provider and thus show the error message like in your post.
Strictly speaking, the error message describes this second setting, but it does not exactly fit your problem description.

Terraform Validate Error - Snowflake Automation

I am working on self-development to better see how I can implement Infrastructure as Code (Terraform) for a Snowflake Environment.
I have a GitHub repo with a GitHub Actions workflow configured that does the following:
Sets up Terraform Cloud
Sets up Terraform v1.1.2
Runs terraform fmt -check
terraform validate
terraform plan
terraform apply
Public repo: https://github.com/waynetaylor/sfguide-terraform-sample/blob/main/.github/workflows/actions.yml, which pretty much follows the GitHub Actions for Terraform Cloud steps.
I have configured Terraform Cloud, and if I run the terraform validate step it fails on the environment variables for Snowflake, whether I run it locally or remotely via Actions. However, if I run terraform plan and apply and exclude terraform validate, it works.
Example error
Error: Missing required argument
│
│ on main.tf line 27, in provider "snowflake":
│ 27: provider "snowflake" {
│
│ The argument "account" is required, but no definition was found.
The snowflake provider documentation suggests that there are three required values: username, account, and region.
Where you call your provider in your code you'll need to provide those values.
e.g.
from
provider "snowflake" {
alias = "sys_admin"
role = "SYSADMIN"
}
to
provider "snowflake" {
// required
username = "..."
account = "..."
region = "..."
alias = "sys_admin"
role = "SYSADMIN"
}
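If you'd rather not hard-code those values, a minimal sketch using input variables (the variable names here are my own) could look like this:
variable "snowflake_username" {}
variable "snowflake_account" {}
variable "snowflake_region" {}
provider "snowflake" {
  // required
  username = var.snowflake_username
  account  = var.snowflake_account
  region   = var.snowflake_region
  alias    = "sys_admin"
  role     = "SYSADMIN"
}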

Terraform partial remote backend cannot contain interpolations?

I am trying to configure a Terraform enterprise workspace in Jenkins on the fly. To do this, I need to be able to set the remote backend workspace name in my main.tf dynamically. Like this:
# Using a single workspace:
terraform {
  backend "remote" {
    hostname     = "app.xxx.xxx.com"
    organization = "YYYY"
    # new workspace variable
    workspaces {
      name = "${var.workspace_name}"
    }
  }
}
Now when I run:
terraform init -backend-config="workspace_name=testtest"
I get:
Error loading backend config: 1 error(s) occurred:
* terraform.backend: configuration cannot contain interpolations
The backend configuration is loaded by Terraform extremely early, before
the core of Terraform can be initialized. This is necessary because the backend
dictates the behavior of that core. The core is what handles interpolation
processing. Because of this, interpolations cannot be used in backend
configuration.
If you'd like to parameterize backend configuration, we recommend using
partial configuration with the "-backend-config" flag to "terraform init".
Is what I want to do possible with terraform?
You can't put variables like "${var.workspace_name}" or any other interpolations into the backend configuration.
However, you can create a separate file with your backend values. In the main.tf file it could look like this:
# Terraform backend state store
terraform {
  backend "s3" {}
}
and then, in a dev.backend.tfvars file, for instance:
bucket = "BUCKET_NAME"
encrypt = true
key = "BUCKET_KEY"
dynamodb_table = "DYNAMODB_NAME"
region = "AWS_REGION"
role_arn = "IAM_ROLE_ARN"
You can use partial configuration for the s3 backend as well.
Hope it'll help.
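With this approach, the backend values file is then passed in at init time, e.g. (assuming the dev.backend.tfvars file above):
terraform init -backend-config=dev.backend.tfvars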
Hey I found the correct way to do this:
While the syntax is a little tricky, the remote backend supports partial backend initialization. What this means is that the configuration can contain a backend block like this:
terraform {
  backend "remote" { }
}
And then Terraform can be initialized with a dynamically set backend configuration like this (replacing ORG and WORKSPACE with appropriate values):
terraform init -backend-config "organization=ORG" -backend-config 'workspaces=[{name="WORKSPACE"}]'
