I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
subnet {
name = "appgwsubnet"
address_prefix = var.app_gateway_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD integration with Kubernetes RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
resource "azurerm_role_assignment" "ra1" {
scope = data.azurerm_subnet.kubesubnet.id
role_definition_name = "Network Contributor"
principal_id = local.client_objectid
depends_on = [data.azurerm_subnet.kubesubnet]
}
and followed the steps below to install Istio as per the Istio documentation
#Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
#create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
#create namespace and enable istio-injection for envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
## Install the App and Gateway
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.16/samples/bookinfo/platform/kube/bookinfo.yaml
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.16/samples/bookinfo/networking/bookinfo-gateway.yaml
# Check the Services, Pods and Gateway
kubectl get services
kubectl get pods
kubectl get gateway
# Ensure the app is running
kubectl exec "$(kubectl get pod -l app=ratings -o jsonpath='{.items[0].metadata.name}')" -c ratings -- curl -sS productpage:9080/productpage | grep -o "<title>.*</title>"
and it is responding as shown below
# Check the ingress service and its external IP
$INGRESS_NAME="istio-ingress"
$INGRESS_NS="istio-ingress"
kubectl get svc "$INGRESS_NAME" -n "$INGRESS_NS"
it returns the external IP as shown below
However, I am not able to access the application
Also I am getting an error while trying to run the following commands to find the ports
kubectl -n "$INGRESS_NS" get service "$INGRESS_NAME" -o jsonpath='{.spec.ports[?(#.name=="http2")].port}'
kubectl -n "$INGRESS_NS" get service "$INGRESS_NAME" -o jsonpath='{.spec.ports[?(#.name=="https")].port}'
kubectl -n "$INGRESS_NS" get service "$INGRESS_NAME" -o jsonpath='{.spec.ports[?(#.name=="tcp")].port}'
This is because the ingress gateway selector, when installed with Helm, is istio: ingress, instead of istio: ingressgateway as it is when installed with istioctl.
If you modify the Gateway to reflect this, then it should work:
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo-gateway
  namespace: default
spec:
  selector:
    istio: ingress
...
One way to show this (without knowing this issue previously) is with istioctl analyze:
$ istioctl analyze
Error [IST0101] (Gateway default/bookinfo-gateway) Referenced selector not found: "istio=ingressgateway"
Error: Analyzers found issues when analyzing namespace: default.
See https://istio.io/v1.16/docs/reference/config/analysis for more information about causes and resolutions.
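To double-check which value the selector needs, you can inspect the labels the Helm release actually applied to the gateway (a quick check, assuming the chart was installed as istio-ingress into the istio-ingress namespace as in the steps above):
# show the labels on the gateway service and pods; the istio=ingress label is what the Gateway selector must match
kubectl get svc,pods -n istio-ingress --show-labels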
This happens because the istio- prefix is stripped from the Helm release name when the chart sets the gateway's istio label, so the step-by-step installation with the release name istio-ingress produces the label istio: ingress. You can either install the gateway with the release name istio-ingressgateway so it matches the sample app's Gateway selector, or change the Gateway selector to match the label your installation actually produced.
I tried to reproduce the same in my environment with the sample Bookinfo application and got the same error.
To resolve the application issue, kindly follow the steps below.
Install Istio using the following commands.
#Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
#create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
#create namespace and enable istio-injection for envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
#Install the Application
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.16/samples/bookinfo/platform/kube/bookinfo.yaml
Check the Services and Pods using the commands below.
kubectl get services
kubectl get pods
Expose the application to the internet using the following command.
kubectl expose svc productpage --type=LoadBalancer --name=productpage-external --port=9080 --target-port=9080
Check the external IP of the service using the command below.
kubectl get svc productpage-external
Access the application in the browser using the external IP and port.
Example URL: http://20.121.165.179:9080/productpage
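Alternatively, once the Gateway's selector matches the Helm-installed gateway (istio: ingress), the Bookinfo page should also be reachable through the ingress gateway's external IP without exposing productpage directly. A minimal sketch, assuming the istio-ingress service from the steps above:
# fetch the gateway's external IP and its http2 port, then request the product page through the gateway
export INGRESS_HOST=$(kubectl -n istio-ingress get svc istio-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export INGRESS_PORT=$(kubectl -n istio-ingress get svc istio-ingress -o jsonpath='{.spec.ports[?(@.name=="http2")].port}')
curl -s "http://$INGRESS_HOST:$INGRESS_PORT/productpage" | grep -o "<title>.*</title>"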
Related
I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD integration with Kubernetes RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
and followed the steps below and installed Istio as per the Istio documentation
#Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
#create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
#create namespace and enable istio-injection for envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
## Install the App and Gateway
kubectl apply -f bookinfo.yaml
kubectl apply -f bookinfo-gateway.yaml
and I could access the application as shown below.
This application prints its logs to the console.
I have created the Log Analytics workspace as mentioned below
# Create Log Analytics Workspace
module "log_analytics_workspace" {
source = "./modules/log_analytics_workspace"
count = var.enable_log_analytics_workspace == true ? 1 : 0
app_or_service_name = "log"
subscription_type = var.subscription_type
environment = var.environment
resource_group_name = azurerm_resource_group.rg.name
location = var.location
instance_number = var.instance_number
sku = var.log_analytics_workspace_sku
retention_in_days = var.log_analytics_workspace_retention_in_days
tags = var.tags
}
and set the log_analytics_workspace_id in the oms_agent section of azurerm_kubernetes_cluster to collect logs and ship them to Log Analytics.
Is that all that's needed to ship the application logs to the Log Analytics workspace? Somehow I don't see the logs flowing; am I missing something?
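(For anyone checking the same thing, a couple of quick ways to verify the pipeline, assuming the oms_agent block above applied cleanly; the agent pod names vary between AKS versions, hence the broad grep:)
# the Container Insights agent should be running in kube-system
kubectl get pods -n kube-system | grep -E 'omsagent|ama-logs'
# application stdout/stderr then lands in the ContainerLog table of the workspace,
# which you can query with Kusto, e.g.: ContainerLog | take 20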
I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
subnet {
name = "appgwsubnet"
address_prefix = var.app_gateway_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD integration with Kubernetes RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
resource "azurerm_role_assignment" "ra1" {
scope = data.azurerm_subnet.kubesubnet.id
role_definition_name = "Network Contributor"
principal_id = local.client_objectid
depends_on = [data.azurerm_subnet.kubesubnet]
}
and followed the steps below to install Istio as per the Istio documentation
#Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
#create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
#create namespace and enable istio-injection for envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
and deployed the following app and service
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
and added the secret as mentioned below
kubectl create -n istio-system secret tls httpbin-credential --key=example_certs1/httpbin.example.com.key --cert=example_certs1/httpbin.example.com.crt
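(For completeness, the example_certs1 files referenced here come from the certificate-generation step in the same Istio secure-gateways doc; roughly, using openssl:)
# create a root CA and a certificate/key for httpbin.example.com
mkdir -p example_certs1
openssl req -x509 -sha256 -nodes -days 365 -newkey rsa:2048 -subj '/O=example Inc./CN=example.com' -keyout example_certs1/example.com.key -out example_certs1/example.com.crt
openssl req -out example_certs1/httpbin.example.com.csr -newkey rsa:2048 -nodes -keyout example_certs1/httpbin.example.com.key -subj "/CN=httpbin.example.com/O=httpbin organization"
openssl x509 -req -sha256 -days 365 -CA example_certs1/example.com.crt -CAkey example_certs1/example.com.key -set_serial 0 -in example_certs1/httpbin.example.com.csr -out example_certs1/httpbin.example.com.crt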
and installed the Gateway as mentioned below
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: mygateway
spec:
  selector:
    istio: ingress # use istio default ingress gateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: httpbin-credential # must be the same as secret
    hosts:
    - httpbin.example.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
spec:
  hosts:
  - "httpbin.example.com"
  gateways:
  - mygateway
  http:
  - match:
    - uri:
        prefix: /status
    - uri:
        prefix: /delay
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
and updated the hosts file as mentioned below.
However, I can't access the application.
Note: I can access another application that doesn't use SSL.
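(For reference, access was tested roughly as in the Istio secure-gateways doc; the istio-ingress namespace and the cert path below are assumptions based on the steps above:)
# resolve the gateway's external IP and https port, then call httpbin through the gateway with SNI
export INGRESS_HOST=$(kubectl -n istio-ingress get svc istio-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
export SECURE_INGRESS_PORT=$(kubectl -n istio-ingress get svc istio-ingress -o jsonpath='{.spec.ports[?(@.name=="https")].port}')
curl -v -HHost:httpbin.example.com --resolve "httpbin.example.com:$SECURE_INGRESS_PORT:$INGRESS_HOST" --cacert example_certs1/example.com.crt "https://httpbin.example.com:$SECURE_INGRESS_PORT/status/418"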
I've been trying to configure the cluster on AKS for quite a few days now, but I keep jumping between parts of the docs, various questions here on SO, and articles on Medium, and I keep failing at it.
The goal is to get a static IP with a DNS name that I can use to connect my apps to the server deployed on AKS.
I have created the infrastructure via Terraform: a resource group in which I created a Public IP and the AKS cluster. So far so good.
After trying the ingress controller that gets installed when you set http_application_routing_enabled = true on cluster creation, which the docs discourage for production (https://learn.microsoft.com/en-us/azure/aks/http-application-routing), I'm now trying the recommended way and installing the ingress-nginx controller via Helm (https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli).
In Terraform I'm installing it all like this:
resource group and cluster
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "Test"
Team = "DevOps"
}
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
# vm_size = "standard_b2s_v5"
# vm_size = "standard_e2bs_v5"
vm_size = "standard_b4ms"
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
# enable_node_public_ip = true
}
service_principal {
client_id = var.sp_client_id
client_secret = var.sp_client_secret
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "standard"
# load_balancer_sku = "basic"
}
# http_application_routing_enabled = true
http_application_routing_enabled = false
}
public ip
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
resource_group_name = var.resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
sku = "Standard"
}
load balancer
resource "kubernetes_service" "cluster-ingress" {
metadata {
name = "cluster-ingress-svc"
annotations = {
"service.beta.kubernetes.io/azure-load-balancer-resource-group" = "fixit-resource-group"
# Warning SyncLoadBalancerFailed 2m38s (x8 over 12m) service-controller Error syncing load balancer:
# failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236
# in resource group MC_fixit-resource-group_server_cluster_westeurope
# "service.beta.kubernetes.io/azure-load-balancer-resource-group" = "MC_fixit-resource-group_server_cluster_westeurope"
# kubernetes.io/ingress.class: addon-http-application-routing
}
}
spec {
# type = "Ingress"
type = "LoadBalancer"
load_balancer_ip = var.public_ip_address
selector = {
name = "cluster-ingress-svc"
}
port {
name = "cluster-port"
protocol = "TCP"
port = 3000
target_port = "80"
}
}
}
ingress controller
resource "helm_release" "nginx" {
name = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
namespace = "default"
set {
name = "rbac.create"
value = "false"
}
set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
set {
name = "controller.service.loadBalancerIP"
value = var.public_ip_address
}
set {
name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
value = "true"
}
# --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
value = "/healthz"
}
}
but the installation fails with this message from terraform
Warning: Helm release "ingress-nginx" was created but has a failed status. Use the `helm` command to investigate the error, correct it, then run Terraform again.
│
│ with module.ingress_controller.helm_release.nginx,
│ on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│ 2: resource "helm_release" "nginx" {
│
╵
╷
│ Error: timed out waiting for the condition
│
│ with module.ingress_controller.helm_release.nginx,
│ on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│ 2: resource "helm_release" "nginx" {
the controller prints out
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: default
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.2
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: default
service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.173.243
IPs: 10.0.173.243
IP: 52.157.90.236
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31709/TCP
Endpoints:
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 30045/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 32500
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 32s (x5 over 108s) service-controller Ensuring load balancer
Warning SyncLoadBalancerFailed 31s (x5 over 107s) service-controller Error syncing load balancer: failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236 in resource group mc_fixit-resource-group_server_cluster_westeurope
vincenzocalia@vincenzos-MacBook-Air helm_charts % az aks show --resource-group fixit-resource-group --name server_cluster --query nodeResourceGroup -o tsv
MC_fixit-resource-group_server_cluster_westeurope
Why is it looking in the MC_fixit-resource-group_server_cluster_westeurope resource group and not in the fixit-resource-group I created for the Cluster, Public IP and Load Balancer?
If I change the controller's load balancer IP to the public IP in MC_fixit-resource-group_server_cluster_westeurope, Terraform still outputs the same error, but the controller reports that it is correctly assigned the IP and load balancer:
set {
name = "controller.service.loadBalancerIP"
value = "20.73.192.77" #var.public_ip_address
}
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cluster-ingress-svc LoadBalancer 10.0.110.114 52.157.90.236 3000:31863/TCP 104m
ingress-nginx-controller LoadBalancer 10.0.106.201 20.73.192.77 80:30714/TCP,443:32737/TCP 41m
ingress-nginx-controller-admission ClusterIP 10.0.23.188 <none> 443/TCP 41m
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 122m
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: default
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.2
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: default
service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.106.201
IPs: 10.0.106.201
IP: 20.73.192.77
LoadBalancer Ingress: 20.73.192.77
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 30714/TCP
Endpoints:
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 32737/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 32538
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 39m (x2 over 41m) service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 39m (x2 over 41m) service-controller Ensured load balancer
vincenzocalia@vincenzos-MacBook-Air helm_charts %
Reading here https://learn.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks
To enable this architecture, each AKS deployment spans two resource groups:
You create the first resource group. This group contains only the Kubernetes service resource. The AKS resource provider automatically creates the second resource group during deployment. An example of the second resource group is MC_myResourceGroup_myAKSCluster_eastus. For information on how to specify the name of this second resource group, see the next section.
The second resource group, known as the node resource group, contains all of the infrastructure resources associated with the cluster. These resources include the Kubernetes node VMs, virtual networking, and storage. By default, the node resource group has a name like MC_myResourceGroup_myAKSCluster_eastus. AKS automatically deletes the node resource group whenever the cluster is deleted, so it should only be used for resources that share the cluster's lifecycle.
Should I pass the first or the second group depending on what kind of resource I'm creating?
E.g. does kubernetes_service need the 1st RG, while azurerm_public_ip needs the 2nd RG?
What is it that I'm missing here?
Please explain it like I'm 5 years old, because that's how I'm feeling right now..
Many thanks
Finally found what the problem was.
Indeed, the Public IP needs to be created in the node resource group, because the ingress controller, with loadBalancerIP set to the Public IP address, looks for it in the node resource group; if you create it in the cluster's own resource group, it fails with the error I was getting.
The node resource group name is assigned at cluster creation, e.g. MC_myResourceGroup_myAKSCluster_eastus, but you can name it as you wish using the parameter node_resource_group = var.node_resource_group_name.
Also, the Public IP sku ("Standard", which must be specified, or "Basic", the default) and the cluster's load_balancer_sku ("standard" or "basic"; there is no default here, it needs to be specified) have to match.
I also put the Public IP in the cluster module so it can depend on the cluster; this avoids it being created before the node resource group exists and failing. I couldn't set that dependency correctly in the main.tf file.
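(As a side note, the other way around this, which the original attempt hinted at with the azure-load-balancer-resource-group annotation, is to keep the Public IP in your own resource group and tell the controller's service where to find it. A sketch of the equivalent Helm CLI invocation, with the IP as a placeholder:)
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  --set controller.service.loadBalancerIP=<your-public-ip> \
  --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-resource-group"=fixit-resource-group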
So the working configuration is now:
main
terraform {
required_version = ">=1.1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0.2"
}
}
}
provider "azurerm" {
features {
resource_group {
prevent_deletion_if_contains_resources = false
}
}
subscription_id = var.azure_subscription_id
tenant_id = var.azure_subscription_tenant_id
client_id = var.service_principal_appid
client_secret = var.service_principal_password
}
provider "kubernetes" {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
provider "helm" {
kubernetes {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
}
module "cluster" {
source = "./modules/cluster"
location = var.location
vm_size = var.vm_size
resource_group_name = var.resource_group_name
node_resource_group_name = var.node_resource_group_name
kubernetes_version = var.kubernetes_version
ssh_key = var.ssh_key
sp_client_id = var.service_principal_appid
sp_client_secret = var.service_principal_password
}
module "ingress-controller" {
source = "./modules/ingress-controller"
public_ip_address = module.cluster.public_ip_address
depends_on = [
module.cluster.public_ip_address
]
}
cluster
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "test"
Team = "DevOps"
}
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
### choose the resource goup to use for the cluster
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
### decide the name of the cluster "node" resource group; if unset it will be named automatically
node_resource_group = var.node_resource_group_name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
vm_size = var.vm_size
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
}
service_principal {
client_id = var.sp_client_id
client_secret = var.sp_client_secret
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "basic"
}
http_application_routing_enabled = false
depends_on = [
azurerm_resource_group.resource_group
]
}
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
# resource_group_name = var.resource_group_name
resource_group_name = var.node_resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
# sku = "Standard"
depends_on = [
azurerm_kubernetes_cluster.server_cluster
]
}
ingress controller
resource "helm_release" "nginx" {
name = "ingress-nginx"
repository = "ingress-nginx"
chart = "ingress-nginx/ingress-nginx"
namespace = "default"
set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
set {
name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
value = "true"
}
set {
name = "controller.service.loadBalancerIP"
value = var.public_ip_address
}
set {
name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path"
value = "/healthz"
}
}
ingress service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-service
  # namespace: default
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
spec:
  ingressClassName: nginx
  rules:
  # - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP
  ### Node.js server
  - http:
      paths:
      - path: /(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: server-clusterip-service
            port:
              number: 80
  - http:
      paths:
      - path: /server(/|$)(.*)
        pathType: Prefix
        backend:
          service:
            name: server-clusterip-service
            port:
              number: 80
...
other services omitted
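With this in place, the controller service should come up with the static IP, and the DNS label set on the Public IP resolves to it (a quick check, using the fixit label and westeurope region from above):
kubectl get svc ingress-nginx-controller -n default
curl -I http://fixit.westeurope.cloudapp.azure.com/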
Hope this can help others having difficulties in getting the setup right.
Cheers.
I have an AKS Kubernetes cluster provisioned with Terraform, and I need to enable the azure-keyvault-secrets-provider add-on.
Using the azure CLI, I could enable it as follows:
az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup
But how can I do it with Terraform? I checked the documentation, but it doesn't mention anything about a secrets driver except this one block:
resource "azurerm_kubernetes_cluster" "k8s_cluster" {
lifecycle {
ignore_changes = [
default_node_pool
]
prevent_destroy = false
}
key_vault_secrets_provider {
secret_rotation_enabled = true
}
...
}
Is the above key_vault_secrets_provider block doing the same thing as the Azure CLI command az aks enable-addons --addons azure-keyvault-secrets-provider --name myAKSCluster --resource-group myResourceGroup?
Because according to the Terraform documentation, the key_vault_secrets_provider block is only for rotating the Key Vault secrets; there is no mention of enabling the driver.
My requirement is to:
Enable the secret provider driver
Create a kubernetes Secret -> so it will provision the secret in azure
Inject the secret to a kubernetes Deployment
I have tried to check the same in my environment:
Code: Without key_vault_secrets_provider
main.tf:
resource "azurerm_kubernetes_cluster" "example" {
name = "kavyaexample-aks1"
location = data.azurerm_resource_group.example.location
resource_group_name = data.azurerm_resource_group.example.name
dns_prefix = "kavyaexampleaks1"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
tags = {
Environment = "Production"
}
}
output "client_certificate" {
value = azurerm_kubernetes_cluster.example.kube_config.0.client_certificate
sensitive = true
}
When I checked the list of available add-ons for my managed AKS cluster through the CLI, "azure-keyvault-secrets-provider" was shown as disabled. This means the latest versions of the Terraform provider do support the add-on; it just needs to be enabled.
Command:
az aks addon list --name kavyaexample-aks1 --resource-group <myrg>
Then I checked again after adding the key_vault_secrets_provider block with secret rotation enabled.
main.tf:
resource "azurerm_kubernetes_cluster" "example" {
name = "kavyaexample-aks1"
location = data.azurerm_resource_group.example.location
resource_group_name = data.azurerm_resource_group.example.name
dns_prefix = "cffggf"
....
key_vault_secrets_provider {
secret_rotation_enabled = true
}
default_node_pool {
name = "dfgdf"
...
}
When I checked the add-on list again using the same command:
az aks addon list --name kavyaexample-aks1 --resource-group <myrg>
The azure-keyvault-secrets-provider add-on is now shown as enabled, which means that adding the key_vault_secrets_provider block with secret rotation enabled is what enables (and makes use of) the Azure Key Vault secrets provider add-on.
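You can also confirm from inside the cluster that the add-on deployed the Secrets Store CSI driver and Azure provider pods (pod names vary slightly between AKS versions, hence the broad grep):
kubectl get pods -n kube-system | grep -E 'secrets-store'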
Also see this terraform-azurerm-aks issue on GitHub about addon_profile being deprecated in the latest Terraform provider versions.
I'm attempting to upgrade the default node pool k8s version using Terraform's azurerm_kubernetes_cluster, but it's telling me there is nothing to change. Here are the relevant details:
tfstate file (redacted, pulled from Terraform Cloud):
{
"version": 4,
"terraform_version": "1.0.2",
"resources": [
{
"mode": "managed",
"type": "azurerm_kubernetes_cluster",
"name": "aks",
"provider": "provider[\"registry.terraform.io/hashicorp/azurerm\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
"default_node_pool": [
{
"orchestrator_version": "1.19.7", // tfstate shows 1.19.7
"type": "VirtualMachineScaleSets"
}
],
"kubernetes_version": "1.20.7"
}
}
]
}
]
}
Resource definition:
resource "azurerm_kubernetes_cluster" "aks" {
.
.
.
kubernetes_version = local.aks.kubernetes_version
default_node_pool {
name = local.aks.default_node_pool.name
vm_size = local.aks.default_node_pool.vm_size
node_count = local.aks.default_node_pool.node_count
orchestrator_version = local.aks.default_node_pool.orchestrator_version
}
.
.
.
}
Variables:
variable "aks" {
description = "Kubernetes Service values"
type = object({
name = string
dns_prefix = optional(string)
kubernetes_version = optional(string)
default_node_pool = object({
name = optional(string)
vm_size = optional(string)
node_count = optional(number)
orchestrator_version = optional(string)
})
identity_type = optional(string)
})
}
locals {
aks = defaults(var.aks, {
dns_prefix = var.aks.name
kubernetes_version = "1.20.7"
default_node_pool = {
name = "agentpool"
vm_size = "Standard_D4s_v3"
node_count = 3
orchestrator_version = "1.20.7"
}
})
}
Specifying kubernetes_version worked to upgrade the Kubernetes control plane within Azure, but the default node pool did not update. After a bit more research I found orchestrator_version, and to my understanding this should upgrade the Kubernetes version of the virtual machine scale set.
When I run terraform plan, it tells me "Your infrastructure matches the configuration", even though tfstate clearly shows orchestrator_version as "1.19.7", and my variable (I've tried hard coding as well) is set to "1.20.7".
Why isn't Terraform / azurerm_kubernetes_cluster recognizing the orchestrator_version change?
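(One way to see the drift directly is to compare what Azure reports for the node pool with what is in state; a quick check, where the cluster and resource group names are placeholders:)
az aks nodepool show --cluster-name <cluster> --resource-group <rg> --name agentpool --query orchestratorVersion -o tsv
terraform state show azurerm_kubernetes_cluster.aks | grep orchestrator_version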
Can you please check using Terraform azurerm provider version 2.14.0? This release has a fix for this issue. Reference
A known workaround is to use a null_resource with a trigger on the Kubernetes version that runs a Bash script via the local-exec provisioner, so you do not have to upgrade the node pools manually.
...
resource "null_resource" "aks" {
triggers = {
aks_kubernetes_version = azurerm_kubernetes_cluster.aks.kubernetes_version
}
provisioner "local-exec" {
command = "./cluster-upgrade-fix.sh ${var.name} ${azurerm_resource_group.aks.name}"
working_dir = path.module
}
}
...
The script compares the Kubernetes version of the control plane with that of the node pools; if the versions differ, it upgrades each node pool in the cluster accordingly.
cluster-upgrade-fix.sh :
#!/bin/bash
CLUSTER_NAME=$1
RESOURCE_GROUP=$2
CLUSTER_VERSION=$(az aks show --name $CLUSTER_NAME --resource-group $RESOURCE_GROUP | jq -r .kubernetesVersion)
NODE_POOLS=$(az aks nodepool list --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --query '[].name' -o tsv)
for NODE_POOL in $NODE_POOLS; do
NODE_VERSION=$(az aks nodepool show --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --name $NODE_POOL | jq -r .orchestratorVersion)
if [[ $CLUSTER_VERSION != $NODE_VERSION ]]; then
az aks nodepool upgrade --kubernetes-version $CLUSTER_VERSION --cluster-name $CLUSTER_NAME --resource-group $RESOURCE_GROUP --name $NODE_POOL --verbose
fi
done
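The script can also be run by hand for a one-off upgrade, passing the same arguments the null_resource does (the names here are placeholders):
./cluster-upgrade-fix.sh myAKSCluster myResourceGroup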
References:
https://www.danielstechblog.io/terraform-upgrading-aks-kubernetes-version-does-not-upgrade-node-pools/
https://dennis-riemenschneider.medium.com/how-to-upgrade-the-kubernetes-version-of-your-aks-default-node-pool-with-terraform-b7311d76041b