Azure Terraform: Inserting "certificate-authority" data into Kube context during cluster creation

We're using Terraform to deploy AKS clusters to an environment behind a proxy over VPN. Deployment of the cluster works correctly when off-network without the proxy, but errors out on Helm deployment creation on-network.
We are able to connect to the cluster after it's up while on the network using the following command after retrieving the cluster context.
kubectl config set-cluster <cluster name> --certificate-authority=<path to organization's root certificate in PEM format>
The Helm deployments are also created with Terraform after the creation of the cluster. It seems that these require the certificate-authority data to deploy and we haven't been able to find a way to automate this at the right step in the process. Consequently, the apply fails with the error:
x509: certificate signed by unknown authority
Any idea how we can get the certificate-authority data in the right place so the Helm deployments stop failing? Or is there a way to get the cluster to implicitly trust that root certificate? We've tried a few different things:
Researched whether that data could be included automatically when retrieving the cluster context (i.e. az aks get-credentials --name <cluster name> --resource-group <cluster RG>). We couldn't find an easy way to accomplish this.
We started to consider adding the root cert info to the kubeconfig that's generated during deployment (rather than the one you create when retrieving the context). The idea is that it can be passed in to the kubernetes/helm providers and also leveraged when running kubectl commands via local-exec blocks. We know that approach works manually, but we couldn't find a way to automate it via Terraform.
We've tried providing the root certificate to the different fields of the provider config, shown below. We've specifically tried a few different things with cluster_ca_certificate, namely providing the PEM-style cert of the root CA.
provider "kubernetes" {
  host                   = module.aks.kube_config.0.host
  client_certificate     = base64decode(module.aks.kube_config.0.client_certificate)
  client_key             = base64decode(module.aks.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(module.aks.kube_config.0.cluster_ca_certificate)
}

provider "helm" {
  version = ">= 1.2.4"

  kubernetes {
    host                   = module.aks.kube_config.0.host
    client_certificate     = base64decode(module.aks.kube_config.0.client_certificate)
    client_key             = base64decode(module.aks.kube_config.0.client_key)
    cluster_ca_certificate = base64decode(module.aks.kube_config.0.cluster_ca_certificate)
  }
}
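For illustration, one variation of the cluster_ca_certificate attempt looks like this. This is only a sketch, and it is untested whether the provider honors a multi-certificate bundle here; var.org_root_ca_pem is a hypothetical variable holding the organization's root certificate in PEM format:

```hcl
# Sketch (untested): append the organization's root CA (PEM) to the
# cluster CA so the provider trusts the proxy's certificate chain.
# var.org_root_ca_pem is an assumed variable, not part of the module.
provider "kubernetes" {
  host               = module.aks.kube_config.0.host
  client_certificate = base64decode(module.aks.kube_config.0.client_certificate)
  client_key         = base64decode(module.aks.kube_config.0.client_key)

  cluster_ca_certificate = join("\n", [
    base64decode(module.aks.kube_config.0.cluster_ca_certificate),
    var.org_root_ca_pem,
  ])
}
```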
Thanks in advance for the help! Let me know if you need any additional info. I'm still new to the project so I may not have explained everything correctly.

In case anyone finds this later, we ultimately ended up just breaking the project up into two parts: cluster creation and bootstrap. This let us add a local-exec block in the middle to run the kubectl config set-cluster... command. So the order of operations is now:
Deploy AKS cluster (which copies Kube config locally as one of the Terraform outputs)
Run the command
Deploy microservices
Because we're using Terragrunt, we can just use its apply-all function to execute both operations, setting the dependencies described here.
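The intermediate step can be sketched as a null_resource with a local-exec provisioner at the start of the bootstrap stage. The variable names here are illustrative, not the actual ones from our project:

```hcl
# Sketch: run the set-cluster command after cluster creation and
# before the microservice deployments. var.cluster_name and
# var.root_ca_path are assumed variables.
resource "null_resource" "trust_root_ca" {
  depends_on = [module.aks]

  provisioner "local-exec" {
    command = "kubectl config set-cluster ${var.cluster_name} --certificate-authority=${var.root_ca_path}"
  }
}
```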

Related

Cluster Access Issue in Azure Using Terraform

Error: authentication is not configured for provider. Please configure it through one of the following options:
1. DATABRICKS_HOST + DATABRICKS_TOKEN environment variables.
2. host + token provider arguments.
3. azure_databricks_workspace_id + AZ CLI authentication.
4. azure_databricks_workspace_id + azure_client_id + azure_client_secret + azure_tenant_id for Azure Service Principal authentication.
5. Run databricks configure --token that will create ~/.databrickscfg file.
Please check and advise on this error.
You can try the workarounds below, given in this GitHub discussion, to troubleshoot your issue.
The overall recommendation is to separate workspace creation (azurerm provider) from workspace provisioning (databricks provider).
Another workaround is to keep an empty ~/.databrickscfg file, so the locals block might be avoided. Not ideal, but it will work.
You can also work around it by using locals to pre-configure the related resource names and then referencing those when building the workspace resource ID in the provider config, and by giving all databricks provider resources a depends_on reference to the Databricks workspace.
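A sketch of that locals-based workaround, with illustrative names (the variables, the azurerm_databricks_workspace.this resource, and the databricks_cluster resource are assumptions, not from the original discussion):

```hcl
# Sketch: pre-compute the workspace resource ID from known names so the
# databricks provider doesn't depend on a known-after-apply attribute.
locals {
  workspace_id = "/subscriptions/${var.subscription_id}/resourceGroups/${var.rg_name}/providers/Microsoft.Databricks/workspaces/${var.workspace_name}"
}

provider "databricks" {
  azure_workspace_resource_id = local.workspace_id
}

resource "databricks_cluster" "example" {
  # ... cluster settings elided ...
  depends_on = [azurerm_databricks_workspace.this]
}
```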

Apply a `configMap` to EKS cluster with Terraform

I am trying to apply a configMap to an EKS cluster through Terraform, but I don't see how. There is a lot of documentation about this, but I don't see anyone succeeding with it, so I am not sure whether it is possible.
Currently we control our infrastructure through Terraform. When I create the .kube/config file through the AWS CLI and try to connect to the cluster, I get the Unauthorized error, whose solution is documented here, in AWS. According to the docs, we need to edit the aws-auth configMap and add some lines to it, which configures the API server to accept requests from a VM with a certain role. The problem is that only the cluster creator has access to connect to the cluster and make these changes. The cluster creator in this case is Terraform, so what we do is run aws configure to add Terraform's credentials to the VM from which we are trying to connect to the cluster, successfully authenticate against it, add the necessary lines to the configMap, and then revoke the credentials from the VM.
From there on, any user can connect to the cluster from that VM, which is our goal. Now we would like to be able to edit the configMap through a Terraform object, instead of going through this whole process. There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS, so it is not able to find the cluster, and it fails trying to connect to an API server running on localhost.
There is a resource kubernetes_config_map in Terraform, but that's a different provider (kubernetes), not AWS
It is a different provider, because Terraform should now interact with a different API (Kubernetes API instead of AWS API).
There are data sources for aws_eks_cluster and aws_eks_cluster_auth that can be used to authenticate the kubernetes provider.
The aws_eks_cluster_auth has examples for authenticating the kubernetes provider:
data "aws_eks_cluster" "example" {
  name = "example"
}

data "aws_eks_cluster_auth" "example" {
  name = "example"
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.example.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.example.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.example.token
  load_config_file       = false
}
Another example is how the Cloud Posse AWS EKS module authenticates the kubernetes provider and also uses a ConfigMap.
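Putting the pieces together, the aws-auth ConfigMap itself could be sketched with kubernetes_config_map roughly like this. The role ARN and the mapping are placeholders; the exact entries depend on your node role:

```hcl
# Sketch: manage the aws-auth ConfigMap with the kubernetes provider
# configured above. The role ARN below is a placeholder.
resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }

  data = {
    mapRoles = yamlencode([
      {
        rolearn  = "arn:aws:iam::123456789012:role/example-node-role"
        username = "system:node:{{EC2PrivateDNSName}}"
        groups   = ["system:bootstrappers", "system:nodes"]
      }
    ])
  }
}
```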

Terraform Kubernetes persistent storage setup no connection made dial tcp error

I am getting this error whenever I try to create a persistent claim and volume following the kubernetes_persistent_volume_claim documentation:
Error: Post "http://localhost/api/v1/namespaces/default/persistentvolumeclaims": dial tcp [::1]:80: connectex: No connection could
be made because the target machine actively refused it.
I have also tried spinning up an Azure disk and creating a volume through that, as outlined here: Persistent Volume using Azure Managed Disk.
My terraform kubernetes provider looks like this:
provider "kubernetes" {
  alias                  = "provider_kubernetes"
  host                   = module.kubernetes-service.kube_config.0.host
  username               = module.kubernetes-service.kube_config.0.username
  password               = module.kubernetes-service.kube_config.0.password
  client_certificate     = base64decode(module.kubernetes-service.kube_config.0.client_certificate)
  client_key             = base64decode(module.kubernetes-service.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(module.kubernetes-service.kube_config.0.cluster_ca_certificate)
}
I don't believe it's even hitting the K8s cluster in my resource group. Is there something I am missing, or maybe I am not understanding how to put this together the right way? I have the resource group and the K8s resource in the same Terraform, and they create fine, but when it comes to setting up the persistent storage I can't get past the error.
The provider is aliased, so first make sure that all kubernetes resources use the correct provider. You have to specify the aliased provider for each resource.
resource "kubernetes_cluster_role_binding" "current" {
  provider = kubernetes.provider_kubernetes
  # [...]
}
Another possibility is that the localhost connection error occurs because there is a pending change to the Kubernetes cluster resource, which leads to its returned attributes being in a known-after-apply state.
Try terraform plan --target module.kubernetes-service.kube_config to see if that shows any pending changes to the K8s resource (it presumably depends on). Better, target the Kubernetes cluster resource directly.
If it does, first apply those changes alone: terraform apply --target module.kubernetes-service.kube_config, then run a second apply without --target like this: terraform apply.
If there is no pending change to the cluster resource, check that the module returns correct credentials. Also double-check that the use of base64decode is correct.
Try terraform plan --target module.kubernetes-service.kube_config to see if that shows any pending changes to the K8s resource (it presumably depends on). Better, target the Kubernetes cluster resource directly.
If it does, first apply those changes alone: terraform apply --target module.kubernetes-service.kube_config, then run a second apply without --target like this: terraform apply.
In my case it was a conflict in the IAM role definition and assignment which caused the problem. Executing terraform plan --target module.eks (module.eks being the module name used in the terraform code) followed by terraform apply --target module.eks removed the conflicting role definitions. From the terraform output I could see which role policy and role was causing the issue.

Terraform back-end to azure blob storage errors

I have been using the below to successfully create a backend state file for Terraform in Azure Storage, but for some reason it's stopped working. I've recycled the storage account keys, tried both keys, and get the same error every time.
backend.tf
terraform {
  backend "azurerm" {
    storage_account_name = "terraformstorage"
    resource_group_name  = "automation"
    container_name       = "terraform"
    key                  = "testautomation.terraform.tfstate"
    access_key           = "<storage key>"
  }
}
Error returned
terraform init
Initializing the backend...
Successfully configured the backend "azurerm"! Terraform will automatically
use this backend unless the backend configuration changes.
Error refreshing state: storage: service returned error: StatusCode=403, ErrorCode=AuthenticationFailed, ErrorMessage=Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:665e0067-b01e-007a-6084-97da67000000
Time:2018-12-19T10:18:18.7148241Z, RequestInitiated=Wed, 19 Dec 2018 10:18:18 GMT, RequestId=665e0067-b01e-007a-6084-97da67000000, API Version=, QueryParameterName=, QueryParameterValue=
Any ideas what I'm doing wrong?
What worked for me is to delete the local .terraform folder and try again.
Another problem can be clock skew.
I experienced those problems as well, tried all the above-mentioned steps, but nothing helped.
What happened on my system (Windows 10, WSL2) was that WSL lost its time sync and my clock was hours off. This behaviour is described in https://github.com/microsoft/WSL/issues/4245.
For me it helped to
set the correct time in WSL (sudo hwclock -s) and
reboot WSL
Hope, this will help others too.
Here are a few suggestions:
Run: terraform init -reconfigure.
Confirm your "terraform/backend" credentials.
If your Terraform contains azurerm_storage_account network_rules allowing only certain IP addresses, make sure you're connecting from one of them, or that you're connected to the right VPN network.
If the above doesn't work, run TF_LOG=TRACE terraform init to debug further.
Please ensure you've been authenticated properly to Azure Cloud.
If you're running Terraform externally, re-run: az login.
If you're running Terraform on the instance, you can use managed identities, e.g. by defining the following environment variables:
ARM_USE_MSI=true
ARM_SUBSCRIPTION_ID=xxx-yyy-zzz
ARM_TENANT_ID=xxx-yyy-zzz
or just run az login --identity, then assign the right role (azurerm_role_assignment, e.g. "Contributor") and appropriate policies (azurerm_policy_definition).
See also:
Azure Active Directory Provider: Authenticating using Managed Service Identity.
Unable to programmatically get the keys for Azure Storage Account.
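For the managed-identity path, the provider configuration can be sketched as follows. The subscription and tenant IDs are placeholders; this assumes azurerm provider 2.x or later, where the features block is required:

```hcl
# Sketch: authenticate the azurerm provider via Managed Service Identity
# instead of az login. The IDs below are placeholders.
provider "azurerm" {
  features {}

  use_msi         = true
  subscription_id = "00000000-0000-0000-0000-000000000000"
  tenant_id       = "00000000-0000-0000-0000-000000000000"
}
```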
There should be a .terraform directory in the location where you are running the terraform init command from.
Remove .terraform or rename it to something else. The next time terraform init runs, it will recreate that directory with a fresh init.

How to Integrate GitLab-Ci w/ Azure Kubernetes + Kubectl + ACR for Deployments?

Our previous GitLab based CI/CD utilized an Authenticated curl request to a specific REST API endpoint to trigger the redeployment of an updated container to our service, if you use something similar for your Kubernetes based deployment this Question is for you.
More Background
We run a production site / app (Ghost blog based) on an Azure AKS Cluster. Right now we manually push our updated containers to a private ACR (Azure Container Registry) and then update from the command line with Kubectl.
That being said we previously used Docker Cloud for our orchestration and fully integrated re-deploying our production / staging services using GitLab-Ci.
That GitLab-Ci integration is the goal, and the 'Why' behind this question.
My Question
Since we previously used Docker Cloud (doh, should have gone K8s from the start), how should we handle the fact that GitLab-Ci was able to make use of Secrets created by the Docker Cloud CLI and then authenticate with the Docker Cloud API to trigger actions on our Nodes (i.e. re-deploy with new containers, etc.)?
While I believe we can build a container (to be used by our GitLab-Ci runner) that contains Kubectl, and the Azure CLI, I know that Kubernetes also has a similar (to docker cloud) Rest API that can be found here (https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster) — specifically the section that talks about connecting WITHOUT Kubectl appears to be relevant (as does the piece about the HTTP REST API).
My Question to anyone who is connecting to an Azure (or potentially other managed Kubernetes service):
How does your Ci/CD server authenticate with your Kubernetes service provider's Management Server, and then how do you currently trigger an update / redeployment of an updated container / service?
If you have used the Kubernetes HTTP REST API to re-deploy a service, your thoughts are particularly valuable!
Kubernetes Resources I am Reviewing
How should I manage deployments with kubernetes
Kubernetes Deployments
Will update as I work through the process.
Creating the integration
I had the same problem of how to integrate the GitLab CI/CD with my Azure AKS Kubernetes cluster. I created this question because I was getting an error when I tried to add my Kubernetes cluster info into GitLab.
How to integrate them:
Inside GitLab, go to "Operations" > "Kubernetes" menu.
Click on the "Add Kubernetes cluster" button on the top of the page
You will have to fill in some form fields. To get the content for these fields, connect to your Azure account from the CLI (you need the Azure CLI installed on your PC) using the az login command, and then execute this other command to get the Kubernetes cluster credentials: az aks get-credentials --resource-group <resource-group-name> --name <kubernetes-cluster-name>
The previous command will create a ~/.kube/config file. Open this file; the content for the fields you have to fill in the GitLab "Add Kubernetes cluster" form is all inside this .kube/config file.
These are the fields:
Kubernetes cluster name: It's the name of your cluster on Azure, it's in the .kube/config file too.
API URL: It's the URL in the field server of the .kube/config file.
CA Certificate: It's the field certificate-authority-data of the .kube/config file, but you will have to base64 decode it.
After you decode it, it must be something like this:
-----BEGIN CERTIFICATE-----
...
some base64 strings here
...
-----END CERTIFICATE-----
Token: It's the string of hexadecimal chars in the field token of the .kube/config file (it might also need to be base 64 decoded?). You need to use a token belonging to an account with cluster-admin privileges, so GitLab can use it for authenticating and installing stuff on the cluster. The easiest way to achieve this is by creating a new account for GitLab: create a YAML file with the service account definition (an example can be seen here under Create a gitlab service account in the default namespace) and apply it to your cluster by means of kubectl apply -f serviceaccount.yml.
Project namespace (optional, unique): I leave it empty, don't know yet for what or where this namespace can be used.
Click "Save" and it's done. Your GitLab project should be connected to your Kubernetes cluster now.
Deploy
In your deploy job (in the pipeline), you'll need some environment variables to access your cluster using the kubectl command, here is a list of all the variables available:
https://docs.gitlab.com/ee/user/project/clusters/index.html#deployment-variables
To have these variables injected in your deploy job, there are some conditions:
You must have added correctly the Kubernetes cluster into your GitLab project, menu "Operations" > "Kubernetes" and these steps that I described above
Your job must be a "deployment job", in GitLab CI, to be considered a deployment job, your job definition (in your .gitlab-ci.yml) must have an environment key (take a look at the line 31 in this example), and the environment name must match the name you used in menu "Operations" > "Environments".
Here is an example of a .gitlab-ci.yml with three stages:
Build: builds a Docker image and pushes it to the GitLab private registry
Test: doesn't do anything yet, just runs exit 0; change it later
Deploy: downloads a stable version of kubectl, copies the .kube/config file to be able to run kubectl commands against the cluster, and executes kubectl cluster-info to make sure it is working. In my project I haven't finished writing the deploy script to actually execute a deploy, but the kubectl cluster-info command executes fine.
Tip: to take a look at all the environment variables and their values (Jenkins has a page with this view, GitLab CI doesn't) you can execute the command env in the script of your deploy stage. It helps a lot to debug a job.
I logged into our GitLab-Ci backend today and saw a 'Kubernetes' button, along with an offer to save $500 at GCP.
GitLab Kubernetes
URL to hit your repo's Kubernetes GitLab page is:
https://gitlab.com/^your-repo^/clusters
As I work through the integration process I will update this answer (but other answers are also welcome!).
Official GitLab Kubernetes Integration Docs
https://docs.gitlab.com/ee/user/project/clusters/index.html
