Terraform Kubernetes persistent storage setup no connection made dial tcp error - azure

I am getting this error whenever I try to create a persistent volume and claim, following the kubernetes_persistent_volume_claim documentation:
Error: Post "http://localhost/api/v1/namespaces/default/persistentvolumeclaims": dial tcp [::1]:80: connectex: No connection could
be made because the target machine actively refused it.
I have also tried spooling up an Azure disk and creating a volume through that, as outlined here: Persistent Volume using Azure Managed Disk.
My terraform kubernetes provider looks like this:
provider "kubernetes" {
alias = "provider_kubernetes"
host = module.kubernetes-service.kube_config.0.host
username = module.kubernetes-service.kube_config.0.username
password = module.kubernetes-service.kube_config.0.password
client_certificate = base64decode(module.kubernetes-service.kube_config.0.client_certificate)
client_key = base64decode(module.kubernetes-service.kube_config.0.client_key)
cluster_ca_certificate = base64decode(module.kubernetes-service.kube_config.0.cluster_ca_certificate)
}
I don't believe it's even hitting the K8s cluster in my RG. Is there something I am missing, or am I not understanding how to put this together the right way? I have the RG spooled up with the K8s resource in the same Terraform, and that creates fine, but when it comes to setting up the persistent storage I can't get past the error.

The provider is aliased, so first make sure that all kubernetes resources use the correct provider. You have to specify the aliased provider for each resource.
resource "kubernetes_cluster_role_binding" "current" {
provider = kubernetes.provider_kubernetes
# [...]
}
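The same applies to the persistent volume claim from the question. A minimal sketch, assuming the aliased provider above and placeholder metadata/storage values (the exact resources syntax can differ slightly between provider versions):
resource "kubernetes_persistent_volume_claim" "example" {
  provider = kubernetes.provider_kubernetes

  metadata {
    name = "example-claim"
  }

  spec {
    access_modes = ["ReadWriteOnce"]

    resources {
      # placeholder size; requests is a map of resource name to quantity
      requests = {
        storage = "5Gi"
      }
    }
  }
}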
Another possibility is that the localhost connection error occurs because there is a pending change to the Kubernetes cluster resource, which leaves its returned attributes in a known-after-apply state.
Try terraform plan --target module.kubernetes-service.kube_config to see if that shows any pending changes to the K8s resource (it presumably depends on). Better, target the Kubernetes cluster resource directly.
If it does, first apply those changes alone: terraform apply --target module.kubernetes-service.kube_config, then run a second apply without --target like this: terraform apply.
If there is no pending change to the cluster resource, check that the module returns correct credentials. Also double-check that the use of base64decode is correct.
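If the module simply wraps an azurerm_kubernetes_cluster resource, one way to rule out a bad module output is to wire the provider straight to the cluster resource's attributes. A rough sketch, with azurerm_kubernetes_cluster.main standing in for whatever your module actually creates:
provider "kubernetes" {
  alias                  = "provider_kubernetes"
  # kube_config is only known once the cluster exists, hence the targeted apply above
  host                   = azurerm_kubernetes_cluster.main.kube_config.0.host
  client_certificate     = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.main.kube_config.0.cluster_ca_certificate)
}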

> Try terraform plan --target module.kubernetes-service.kube_config to see if that shows any pending changes to the K8s resource (it presumably depends on). Better, target the Kubernetes cluster resource directly.
> If it does, first apply those changes alone: terraform apply --target module.kubernetes-service.kube_config, then run a second apply without --target like this: terraform apply.
In my case it was a conflict in the IAM role definition and assignment which caused the problem. Executing terraform plan --target module.eks (module.eks being the module name used in the terraform code) followed by terraform apply --target module.eks removed the conflicting role definitions. From the terraform output I could see which role policy and role was causing the issue.
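As a concrete sequence (module.eks standing in for whatever module wraps your cluster):
# inspect pending changes to the cluster module only
terraform plan --target module.eks
# apply the cluster changes in isolation
terraform apply --target module.eks
# then run a normal, untargeted apply for everything else
terraform apply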

Related

Does terraform guarantee that if no changes were reported by plan, it will be able to recreate resources the same way they currently are?

I have a lot of resources in my Azure subscription. I want to manage them with terraform, so I want to import them using terraform import. I import every resource manually one-by-one. Then I run terraform plan and check that there are no changes to be made reported i.e. that the current infrastructure matches the configuration.
Does this mean that if I were to manually delete some of the resources via the Azure portal or CLI, I would be able to recreate them with terraform apply perfectly, so that they would have exactly the same configuration as before and would operate in exactly the same way?
In general Terraform cannot guarantee that destroying an object and recreating it will produce an exactly equivalent object.
It is possible for that to work, but it requires a number of things to be true, including:
1. Your configuration specifies the values for resource arguments exactly as they are in the remote API. For example, if a particular resource type has a case-insensitive (but case-preserving) name then a provider will typically ignore differences in case when planning changes but it will use exactly the case you wrote in the configuration, potentially selecting a different name.
2. The resource type does not include any "write-only" arguments. Some resource types have arguments that are used only by the provider itself and so they don't get saved as part of the object in the remote API even though they are saved in the Terraform state. terraform import therefore cannot re-populate those into the state, because there is nowhere else to read them from except the Terraform state.
3. The provider doesn't have any situations where it treats an omitted argument as "ignore the value in the remote system" instead of "unset the value in the remote system". Some providers make special exceptions for certain arguments where leaving them unset allows them to "drift" in the remote API without Terraform repairing them, but if you are using any resource types which behave in that way then the value stored in the remote system will be lost when you delete the remote object and Terraform won't be able to restore that value because it's not represented in your Terraform configuration.
The hashicorp/azurerm provider in particular has many examples of situation 3 in the above list. For example, if you have an azurerm_virtual_network resource which does not include any subnet blocks then the provider will not propose to delete any existing subnets, even though the configuration says that there should be no subnets. However, if you delete the virtual network and then ask Terraform to recreate it then the Terraform configuration has no record of what subnets were supposed to exist and so it will propose to create a network with no subnets at all.
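To make that concrete, a configuration like the sketch below (names invented for illustration) never mentions subnets; the provider leaves externally created subnets alone while the network exists, but a recreated network comes back with no subnets:
resource "azurerm_virtual_network" "example" {
  name                = "example-network"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  address_space       = ["10.0.0.0/16"]

  # No subnet blocks here: existing subnets are ignored on update,
  # but are not restored if this network is destroyed and rebuilt.
}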

How to change the Terraform provider?

Currently, I am using "Mongey/kafka" provider and now I have to switch to "confluentinc/confluent" provider with my existing terraform pipeline.
How can I do this ?
Steps I am currently following to switch the provider:
Changing the provider in the main.tf file and running the following command to replace the provider:
terraform state replace-provider Mongey/kafka confluentinc/confluent
and after that I run
terraform init command to install the new provider
But after that when I am running
terraform plan
it is giving "no schema available for module.iddn_news_cms_kafka_topics.kafka_acl.topic_writer[13] while reading state; this is a bug in terraform and should be reported" error.
Is there any way, I will change the terraform provider without disturbing the existing resources created using terraform pipeline ?
The terraform state replace-provider command is intended for switching between providers that are in some way equivalent to one another, such as the hashicorp/google and hashicorp/google-beta providers, or when someone forks a provider into their own namespace but remains compatible with the original provider.
Mongey/kafka and confluentinc/confluent do both have resource types that seem to represent the same concepts in the remote system:
Mongey/kafka    confluentinc/confluent
kafka_acl       confluent_kafka_acl
kafka_quota     confluent_kafka_client_quota
kafka_topic     confluent_kafka_topic
However, despite representing the same concepts in the remote system these resource types have different names and incompatible schemas, so there is no way to migrate directly between them. Terraform has no way to understand which resource types in one provider match with resource types in another, or to understand how to map attributes from one of the resource types onto corresponding attributes of the other.
Instead, I think the best thing to do here would be to ask Terraform to "forget" the objects and then re-import them into the new resource types:
terraform state rm kafka_acl.example to ask Terraform to forget about the remote object associated with kafka_acl.example. There is no undo for this action.
terraform import confluent_kafka_acl.example OBJECT-ID to bind the OBJECT-ID (as described in the documentation) to confluent_kafka_acl.example.
I suggest practicing this in a non-production environment first so that you can be confident about the behavior of each of these commands, and learn how to translate from whatever ID format the Mongey/kafka provider uses into whatever import ID format the confluentinc/confluent provider uses to describe the same objects.
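For the specific instance named in the error message, the sequence might look roughly like the following. The resource addresses are illustrative (the new confluent_kafka_acl must already be declared in your configuration), and the import ID is a placeholder whose real format is described in the confluentinc/confluent docs:
# Forget the old object (no undo; back up your state file first)
terraform state rm 'module.iddn_news_cms_kafka_topics.kafka_acl.topic_writer[13]'

# Re-import the same remote object under the new resource type
terraform import 'module.iddn_news_cms_kafka_topics.confluent_kafka_acl.topic_writer[13]' '<OBJECT-ID>'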

Accessing existing resource info from new resources

My header might not have summed up my question correctly.
I have a Terraform stack that creates a resource group and a keyvault, amongst other things. This has already been run and the resources exist.
I am now adding another resource to this same Terraform stack, namely a MySQL server. I know that if I just re-run the stack it will check the state file and just add my MySQL server.
However as part of this mysql server creation I am providing a password and I want to write this password to the keyvault that already exists.
If I was doing this from the start, my Terraform would look like:
resource "azurerm_key_vault_secret" "sqlpassword" {
name = "flagr-mysql-password"
value = random_password.sqlpassword.result
key_vault_id = azurerm_key_vault.shared_kv.id
depends_on = [
azurerm_key_vault.shared_kv
]
}
However, I believe that because the keyvault already exists, this would error, since Terraform wouldn't know the value azurerm_key_vault.shared_kv.id unless I destroy the keyvault and allow Terraform to recreate it. Is that correct?
I could replace azurerm_key_vault.shared_kv.id with the actual resource ID from Azure, but then if I were ever to run this stack to create a new environment, it would write the value into my old keyvault, I presume?
I have done this recently for an AWS deployment: you would run terraform import on the azurerm_key_vault.shared_kv resource to bring it under Terraform management, and then you would be able to deploy azurerm_key_vault_secret.
To import, you will need to write the azurerm_key_vault.shared_kv resource block so that it matches the existing vault (this will require a few iterations).
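A rough sketch of what that looks like, with every name and ID below being a placeholder for your real values:
# Describe the vault that already exists; tweak the arguments until
# `terraform plan` reports no changes for this resource.
resource "azurerm_key_vault" "shared_kv" {
  name                = "<existing-keyvault-name>"
  resource_group_name = "<existing-resource-group>"
  location            = "<region>"
  tenant_id           = "<tenant-id>"
  sku_name            = "standard"
}
Then bind the existing vault to that address by its Azure resource ID:
terraform import azurerm_key_vault.shared_kv /subscriptions/<subscription-id>/resourceGroups/<existing-resource-group>/providers/Microsoft.KeyVault/vaults/<existing-keyvault-name>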

Terraform back-end to azure blob storage errors

I have been using the below to successfully create a back-end state file for Terraform in Azure storage, but for some reason it has stopped working. I've recycled the storage access keys and tried both keys, but I get the same error every time.
backend.tf
terraform {
  backend "azurerm" {
    storage_account_name = "terraformstorage"
    resource_group_name  = "automation"
    container_name       = "terraform"
    key                  = "testautomation.terraform.tfstate"
    access_key           = "<storage key>"
  }
}
Error returned
terraform init
Initializing the backend...
Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.
Error refreshing state: storage: service returned error: StatusCode=403, ErrorCode=AuthenticationFailed, ErrorMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:665e0067-b01e-007a-6084-97da67000000
Time:2018-12-19T10:18:18.7148241Z, RequestInitiated=Wed, 19 Dec 2018 10:18:18 GMT, RequestId=665e0067-b01e-007a-6084-97da67000000, API Version=, QueryParameterName=, QueryParameterValue=
Any ideas what I'm doing wrong?
What worked for me is to delete the local .terraform folder and try again.
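For example, from the directory you run Terraform in:
rm -rf .terraform
terraform init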
Another possible cause is clock skew.
I experienced these problems as well and tried all the above-mentioned steps, but nothing helped.
What happened on my system (Windows 10, WSL2) was that WSL lost its time sync and my clock was hours off. This behaviour is described in https://github.com/microsoft/WSL/issues/4245.
For me it helped to
get the appropriate time in WSL (sudo hwclock -s) and
to reboot WSL
Hope this will help others too.
Here are a few suggestions:
Run: terraform init -reconfigure.
Confirm your "terraform/backend" credentials.
In case your Terraform contains azurerm_storage_account network_rules that allow only certain IP addresses, make sure you're connecting from one of them (or from the right VPN network).
If above won't work, run TF_LOG=TRACE terraform init to debug further.
Please ensure you've been authenticated properly to Azure Cloud.
If you're running Terraform externally, re-run: az login.
If you're running Terraform on an Azure instance, you can use managed identities by defining the following environment variables:
ARM_USE_MSI=true
ARM_SUBSCRIPTION_ID=xxx-yyy-zzz
ARM_TENANT_ID=xxx-yyy-zzz
or just run az login --identity, then assign the right role (azurerm_role_assignment, e.g. "Contributor") and appropriate policies (azurerm_policy_definition).
See also:
Azure Active Directory Provider: Authenticating using Managed Service Identity.
Unable to programmatically get the keys for Azure Storage Account.
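If the key itself is in doubt, one way to sanity-check it (assuming the az CLI and the account/resource-group names from the backend block above) is to fetch the current key and pass it via the ARM_ACCESS_KEY environment variable instead of hard-coding access_key:
# print the first storage account key
az storage account keys list \
  --resource-group automation \
  --account-name terraformstorage \
  --query '[0].value' -o tsv

# hand it to the azurerm backend via the environment and re-initialize
export ARM_ACCESS_KEY="<key value from the command above>"
terraform init -reconfigure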
There should be a .terraform directory in the location you are running the terraform init command from.
Remove .terraform or rename it to something else. The next time terraform init runs, it will recreate that directory with a fresh init.

Terraform and Updates

Being able to capture infrastructure in a single Terraform file has obvious benefits. However, I am not clear in my mind how subsequent updates are handled once, for example, a virtual machine has been created.
So, to provide a specific scenario: suppose that using Terraform we set up an Azure VM with SQL Server 2014. Then, after a month, we decide that we would like to update that VM with the latest service pack for SQL Server 2014 that has just been released.
Is the recommended practice that we update the Terraform configuration file and re-apply it?
I have to disagree with the other two responses. Terraform can handle infrastructure updates just fine. The key thing to understand, however, is that Terraform largely follows an immutable infrastructure paradigm, which means that to "update" a resource, you delete the old resource and create a new one to replace it. This is much like functional programming, where variables are immutable, and to "update" something, you actually create a new variable.
The typical pattern with Terraform is to use it to deploy a server image, such as a Virtual Machine (VM) Image (e.g. an Amazon Machine Image (AMI)) or a Container Image (e.g. a Docker Image). When you want to "update" something, you create a new version of your image, deploy that onto a new server, and undeploy the old server.
Here's an example of how that works:
Imagine that you're building a Ruby on Rails app. You get the app working in dev and it's time to deploy to prod. The first step is to package the app as an AMI. You could do this using a tool like Packer. Now you have an AMI with id ami-1234.
Here is a Terraform template you could use to deploy this AMI on a server (an EC2 Instance) in AWS with an Elastic IP Address attached to it:
resource "aws_instance" "example" {
ami = "ami-1234"
instance_type = "t2.micro"
}
resource "aws_eip" "example" {
instance = "${aws_instance.example.id}"
}
When you run terraform apply, Terraform deploys the server, attaches an IP address to it, and now when users visit that IP, they will see v1 of your Rails app.
Some time later, you update your Rails app and want to deploy the new version, v2. To do that, you build a new AMI (i.e. you run Packer again) to get an ami with ID "ami-5678". You update your Terraform templates accordingly:
resource "aws_instance" "example" {
ami = "ami-5678"
instance_type = "t2.micro"
}
When you run terraform apply, Terraform undeploys the old server (which it can find because Terraform records the state of your infrastructure), deploys a new server with the new AMI, and now users will see v2 of your code at that same IP.
Of course, there is one problem here: in between the time when Terraform undeploys v1 and when it deploys v2, your users would see downtime. To work around that, you could use Terraform's create_before_destroy lifecycle setting:
resource "aws_instance" "example" {
ami = "ami-5678"
instance_type = "t2.micro"
lifecycle {
create_before_destroy = true
}
}
With create_before_destroy set to true, Terraform will create the replacement server first, switch the IP to it, and then remove the old server. This allows you to do zero-downtime deployment with immutable infrastructure (note: zero-downtime deployment works better with a load balancer that can do health checks than a simple IP address, especially if your server takes a long time to boot).
For more information on this, check out the book Terraform: Up & Running. The code samples for the book include an example of a zero-downtime deployment with a cluster of servers and a load balancer: https://github.com/brikis98/terraform-up-and-running-code
Terraform is an infrastructure provisioning tool; the configuration/deployment tools would be:
chef
saltstack
ansible
etc.
As I am working with Chef: basically, I provision the server instance with Terraform, then Terraform (via a terraform provisioner) hands control to Chef for system configuration and deployment.
For the moment, Terraform cannot delete the node/client in the Chef server, so after you run terraform destroy, you need to remove them yourself.
Terraform isn't best placed for this sort of task. Terraform is an infrastructure management tool, not configuration management.
You should use tools such as chef, puppet, and ansible to deal with the configuration of the system.
If you must use Terraform for this task, you could create a template_file resource containing the configuration required to install SQL Server and to upgrade it if a different version is presented. Reference: here
Put that code inside a provisioner under a null_resource resource. Reference: here.
The trigger for this could be the variable containing the SQL version, so when you present a different version of SQL Server, it will execute that provisioner on each instance to upgrade the version.
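A rough sketch of that pattern, where the version variable, script path, and connection details are all invented for illustration and assume the VM is reachable over WinRM:
variable "sql_version" {
  default = "2014-SP3"
}

resource "null_resource" "sql_update" {
  # Re-run the provisioner whenever the SQL version variable changes
  triggers = {
    sql_version = var.sql_version
  }

  provisioner "remote-exec" {
    inline = [
      "powershell.exe -File C:/scripts/upgrade-sql.ps1 -Version ${var.sql_version}",
    ]

    connection {
      type     = "winrm"
      host     = azurerm_public_ip.example.ip_address
      user     = var.admin_username
      password = var.admin_password
    }
  }
}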
