I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
subnet {
name = "appgwsubnet"
address_prefix = var.app_gateway_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD integration with Azure RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
resource "azurerm_role_assignment" "ra1" {
scope = data.azurerm_subnet.kubesubnet.id
role_definition_name = "Network Contributor"
principal_id = local.client_objectid
depends_on = [data.azurerm_subnet.kubesubnet]
}
and followed the steps below to install Istio, as per the Istio documentation:
# Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
# Create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
# Create namespace and enable istio-injection for Envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
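For reference, a couple of optional sanity checks at this point, using the namespaces from the commands above; they simply confirm the control plane is up and the injection labels are in place:
kubectl get pods -n istio-system
kubectl get namespace -L istio-injection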
and deployed the following app and service
apiVersion: v1
kind: ServiceAccount
metadata:
name: httpbin
---
apiVersion: v1
kind: Service
metadata:
name: httpbin
labels:
app: httpbin
service: httpbin
spec:
ports:
- name: http
port: 8000
targetPort: 80
selector:
app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: httpbin
spec:
replicas: 1
selector:
matchLabels:
app: httpbin
version: v1
template:
metadata:
labels:
app: httpbin
version: v1
spec:
serviceAccountName: httpbin
containers:
- image: docker.io/kennethreitz/httpbin
imagePullPolicy: IfNotPresent
name: httpbin
ports:
- containerPort: 80
and added the secret as shown below:
kubectl create -n istio-system secret tls httpbin-credential --key=example_certs1/httpbin.example.com.key --cert=example_certs1/httpbin.example.com.crt
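For reference, a quick way to confirm the secret was created in that namespace:
kubectl get secret httpbin-credential -n istio-system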
and installed the Gateway as shown below:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: mygateway
spec:
selector:
istio: ingress # use istio default ingress gateway
servers:
- port:
number: 443
name: https
protocol: HTTPS
tls:
mode: SIMPLE
credentialName: httpbin-credential # must be the same as secret
hosts:
- httpbin.example.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
spec:
hosts:
- "httpbin.example.com"
gateways:
- mygateway
http:
- match:
- uri:
prefix: /status
- uri:
prefix: /delay
route:
- destination:
port:
number: 8000
host: httpbin
and updated the local hosts file accordingly.
However, I can't access the application.
Note: I can access another application that doesn't use SSL.
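For reference, one hedged way to rule out the hosts-file step is to resolve the host manually with curl against the gateway's external IP (the service name istio-ingress in the istio-ingress namespace comes from the Helm install above; -k skips certificate verification for testing):
INGRESS_IP=$(kubectl get svc istio-ingress -n istio-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
curl -vk --resolve "httpbin.example.com:443:$INGRESS_IP" "https://httpbin.example.com/status/418"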
Related
I got the following setup:
Azure Kubernetes -> Ingress-Nginx-Controller (uses Azure Load-Balancer) -> External DNS
I am exposing the ingress-nginx controller via an Ingress, backed by the Azure load balancer, using a public IP.
I have used the following terraform code to setup the Azure Kubernetes
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_user_assigned_identity" "aks_external_dns_managed_id" {
resource_group_name = azurerm_resource_group.rg.name
location = azurerm_resource_group.rg.location
name = "aks-external-dns-managed-id"
depends_on = [
azurerm_resource_group.rg
]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD integration with Azure RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
resource "azurerm_role_assignment" "ra5" {
scope = data.azurerm_subnet.kubesubnet.id
role_definition_name = "Network Contributor"
principal_id = azurerm_user_assigned_identity.aks_external_dns_managed_id.principal_id
depends_on = [azurerm_kubernetes_cluster.k8s,data.azurerm_subnet.kubesubnet]
}
resource "azurerm_role_assignment" "ra6" {
scope = module.container_registry.0.id
role_definition_name = "AcrPush"
principal_id = azurerm_user_assigned_identity.aks_external_dns_managed_id.principal_id
depends_on = [azurerm_kubernetes_cluster.k8s,module.container_registry]
}
resource "azurerm_role_assignment" "ra7" {
scope = azurerm_resource_group.rg.id
role_definition_name = "Reader"
principal_id = azurerm_user_assigned_identity.aks_external_dns_managed_id.principal_id
depends_on = [azurerm_kubernetes_cluster.k8s]
}
resource "azurerm_role_assignment" "ra8" {
scope = azurerm_dns_zone.dns_zone.id
role_definition_name = "Contributor"
principal_id = azurerm_user_assigned_identity.aks_external_dns_managed_id.principal_id
depends_on = [azurerm_kubernetes_cluster.k8s,]
}
and the following outputs are displayed:
output "subscriptionId" {
value = "${data.azurerm_client_config.current.subscription_id}"
}
output "resource_group_name" {
value = azurerm_resource_group.rg.name
}
output "aks_external_dns_managed_id" {
value = azurerm_user_assigned_identity.aks_external_dns_managed_id.client_id
}
and added the user-assigned managed identity to the AKS VMSS,
and executed the following commands.
Install Nginx Ingress Controller
kubectl create namespace ingress-basic
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ingress-nginx ingress-nginx/ingress-nginx --namespace ingress-basic
Update the details in azure.json:
{
"tenantId": "tenant ID GUID",
"subscriptionId": "subscription ID GUID",
"resourceGroup": "rg-blessed-baboon",
"useManagedIdentityExtension": true,
"userAssignedIdentityID": "1a8e6a4b-2d84-447d-8be5-efd1e95e2dda"
}
and created the secret
kubectl create secret generic azure-config-file --from-file=azure.json
and applied the manifest:
kubectl apply -f external-dns.yml
external-dns.yml
apiVersion: v1
kind: ServiceAccount
metadata:
name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: external-dns
rules:
- apiGroups: [""]
resources: ["services","endpoints","pods", "nodes"]
verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get","watch","list"]
- apiGroups: [""]
resources: ["nodes"]
verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: external-dns-viewer
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: external-dns
subjects:
- kind: ServiceAccount
name: external-dns
namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: external-dns
spec:
strategy:
type: Recreate
selector:
matchLabels:
app: external-dns
template:
metadata:
labels:
app: external-dns
spec:
serviceAccountName: external-dns
containers:
- name: external-dns
image: k8s.gcr.io/external-dns/external-dns:v0.13.2
args:
- --source=service
- --source=ingress
#- --domain-filter=example.com # (optional) limit to only example.com domains; change to match the zone created above.
- --provider=azure
#- --azure-resource-group=externaldns # (optional) use the DNS zones from the specific resource group
volumeMounts:
- name: azure-config-file
mountPath: /etc/kubernetes
readOnly: true
volumes:
- name: azure-config-file
secret:
secretName: azure-config-file
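For reference, a couple of hedged checks right after applying the manifest; the pod should come up and the logs should show it authenticating to Azure rather than the token error shown further below:
kubectl get pods -l app=external-dns
kubectl logs deployment/external-dns --tail=20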
and deployed the application:
kubectl apply -f .\deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: app1-nginx-deployment
labels:
app: app1-nginx
spec:
replicas: 1
selector:
matchLabels:
app: app1-nginx
template:
metadata:
labels:
app: app1-nginx
spec:
containers:
- name: app1-nginx
image: stacksimplify/kube-nginxapp1:1.0.0
ports:
- containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
name: app1-nginx-clusterip-service
labels:
app: app1-nginx
spec:
type: ClusterIP
selector:
app: app1-nginx
ports:
- port: 80
targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: nginxapp1-ingress-service
annotations:
kubernetes.io/ingress.class: "nginx"
spec:
rules:
- host: eapp1.eat-eggs.ca
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app1-nginx-clusterip-service
port:
number: 80
---
It got deployed, and I can access the application using the cluster IP.
The DNS did not get updated even 20 minutes after the deployment.
ExternalDNS throws the following error:
kubectl logs -f external-dns-765645d59d-d2bsp
Logs:
time="2023-02-11T02:28:30Z" level=error
msg="azure.BearerAuthorizer#WithAuthorization: Failed to refresh the
Token for request to
https://management.azure.com/subscriptions//resourceGroups/rg-blessed-baboon/providers/Microsoft.Network/dnsZones?api-version=2018-05-01:
StatusCode=404 -- Original Error: adal: Refresh request failed. Status
Code = '404'. Response body: clientID in request: 1a8e##### REDACTED
#####2dda, getting assigned identities for pod default/external-dns-765645d59d-d2bsp in CREATED state failed after 16
attempts, retry duration [5]s, error: . Check MIC pod logs for
identity assignment errors\n Endpoint
http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=1a8e6a4b-2d84-447d-8be5-efd1e95e2dda&resource=https%3A%2F%2Fmanagement.core.windows.net%2F"
What am I missing?
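One thing worth checking when the IMDS/NMI lookup fails like this is whether the user-assigned identity is actually attached to the node pool's VMSS; a hedged sketch, with the node resource group and VMSS name as placeholders:
az vmss identity show --resource-group <node-resource-group> --name <nodepool-vmss-name> -o json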
Update #1
I have replaced the ExternalDNS manifest with the one below
apiVersion: v1
kind: ServiceAccount
metadata:
name: external-dns
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: external-dns
rules:
- apiGroups: [""]
resources: ["services","endpoints","pods", "nodes"]
verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get","watch","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: external-dns-viewer
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: external-dns
subjects:
- kind: ServiceAccount
name: external-dns
namespace: ingress-basic
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: external-dns
spec:
strategy:
type: Recreate
selector:
matchLabels:
app: external-dns
template:
metadata:
labels:
app: external-dns
spec:
serviceAccountName: external-dns
containers:
- name: external-dns
image: registry.k8s.io/external-dns/external-dns:v0.13.2
args:
- --source=service
- --source=ingress
#- --domain-filter=example.com # (optional) limit to only example.com domains; change to match the zone created above.
- --provider=azure
#- --azure-resource-group=MyDnsResourceGroup # (optional) use the DNS zones from the tutorial's resource group
#- --txt-prefix=externaldns-
volumeMounts:
- name: azure-config-file
mountPath: /etc/kubernetes
readOnly: true
volumes:
- name: azure-config-file
secret:
secretName: azure-config-file
and tried with both the Client ID and the Object (principal) ID; the issue still persists.
You didn't attach the managed identity created for ExternalDNS to the AKS nodes. Normally the AAD Pod Identity controller handles that (and the security around it), but you can simply go to the AKS managed (node) resource group, click on the VMSS, and assign the ExternalDNS managed identity to the node pool.
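A hedged CLI sketch of the same fix; the AKS name is a placeholder, and the identity is the aks-external-dns-managed-id created in the Terraform above:
NODE_RG=$(az aks show --resource-group rg-blessed-baboon --name <aks-name> --query nodeResourceGroup -o tsv)
VMSS_NAME=$(az vmss list --resource-group "$NODE_RG" --query "[0].name" -o tsv)
az vmss identity assign --resource-group "$NODE_RG" --name "$VMSS_NAME" --identities <resource-id-of-aks-external-dns-managed-id>
# then restart ExternalDNS so it retries the token request
kubectl rollout restart deployment/external-dns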
I'm very new to cloud providers and I'm trying to set up a cluster on Azure.
I set up the infra with Terraform: I create a resource group, a static Azure public IP (SKU Basic), an Azure load balancer (SKU Basic, connected to the public IP, frontend port 80, backend port 3000), and an AKS cluster.
With Skaffold I then deploy on the AKS cluster a ClusterIP Service (port 3000, targetPort 3000, selecting the server pods), the server Deployment (which listens on port 3000), plus the secrets.
The deployment goes well, and the server pods' logs show the app is running correctly and listening on port 3000, but when I try to reach the server either by address (20.218.249.246:80/api, where 80 is the load balancer frontend port and /api is the base path of the router) or by the DNS name from the Azure console (fixit.germanywestcentral.cloudapp.azure.com:80/api), the connection fails with a timeout.
I deployed a Kubernetes LoadBalancer Service to test the cluster, and from its external IP I can indeed reach the server.
Looking at the troubleshooting guides, I see that the check on the cluster's load balancer passes because it uses a Standard load balancer, but I created a Basic one in Terraform.
(screenshots omitted: load balancer check, public IP, load balancer, cluster, resource group)
Judging by that check, my cluster is not using the load balancer I created: the one it uses is Standard, while the one I created is Basic.
Am I missing something in setting up the cluster or the IP on Azure?
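A couple of hedged checks that can show which load balancer the cluster is actually using, assuming the resource group is fixit-resource-group and the cluster is server_cluster, as elsewhere in the files:
az aks show --resource-group fixit-resource-group --name server_cluster --query networkProfile.loadBalancerSku -o tsv
az network lb list --resource-group "$(az aks show --resource-group fixit-resource-group --name server_cluster --query nodeResourceGroup -o tsv)" -o table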
Many thanks for the help, here are the files
resource group
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "Production"
Team = "DevOps"
}
}
public ip
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
resource_group_name = var.resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
# sku = "Standard"
# fixit.germanywestcentral.cloudapp.azure.com
}
load balancer
resource "azurerm_lb" "load-balancer" {
name = "fixit-load-balancer"
location = var.location
resource_group_name = var.resource_group_name
# sku = "Standard"
frontend_ip_configuration {
name = "PublicIPAddress"
public_ip_address_id = azurerm_public_ip.public-ip.id
}
}
resource "azurerm_lb_backend_address_pool" "address-pool" {
name = "fixit-backend-pool"
loadbalancer_id = azurerm_lb.load-balancer.id
}
resource "azurerm_lb_rule" "load-balancer-rule" {
name = "fixit-load-balancer-rule"
loadbalancer_id = azurerm_lb.load-balancer.id
frontend_ip_configuration_name = "PublicIPAddress"
protocol = "Tcp"
frontend_port = 80
# backend_port = 27017
backend_port = 3000
# disable_outbound_snat = true
}
cluster
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
# vm_size = "standard_b2s_v5"
vm_size = "standard_e2bs_v5"
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
enable_node_public_ip = true
}
identity {
type = "SystemAssigned"
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
# load_balancer_sku = "standard"
}
}
cluster ip
apiVersion: v1
kind: Service
metadata:
name: server-clusterip-service
spec:
type: ClusterIP # cluster-internal only, not the cloud provider's load balancer
selector:
app: fixit-server-pod
ports:
- name: server-clusterip-service
protocol: TCP
port: 3000 # service port
targetPort: 3000 # port on which the app is listening
server
apiVersion: apps/v1
kind: Deployment
metadata:
name: fixit-server
spec:
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: fixit-server-pod
template:
metadata:
labels:
app: fixit-server-pod
spec:
imagePullSecrets:
- name: docker-secret
containers:
- name: fixit-server-container
image: vinnytwice/fixit-server:dev
imagePullPolicy: 'Always'
env:
# - name: SERVER_DEV_IMAGE_TAG
# value: 'dev'
# server
- name: APP_LISTENING_PORT
value: '3000'
- name: API_KEY
valueFrom:
secretKeyRef:
name: server-secret
key: api-key
# stripe
- name: STRIPE_KEY
valueFrom:
secretKeyRef:
name: stripe-secret
key: stripe-key
# mongo db connection string
- name: MONGO_USERNAME_K8S
valueFrom:
secretKeyRef:
key: mongo-username-k8s
name: server-secret
- name: MONGO_HOSTNAME_K8S
valueFrom:
secretKeyRef:
key: mongo-hostname-k8s
name: server-secret
- name: MONGO_PORT_K8S
valueFrom:
secretKeyRef:
name: server-secret
key: mongo-port-k8s
# neo4j connection string
- name: MONGO_DB_K8S
valueFrom:
secretKeyRef:
key: mongo-db-k8s
name: server-secret
- name: NEO4J_AURA_URI
valueFrom:
secretKeyRef:
key: neo4j-aura-uri
name: neo4j-secret
- name: NEO4J_AURA_USERNAME
valueFrom:
secretKeyRef:
key: neo4j-aura-username
name: neo4j-secret
- name: NEO4J_AURA_PASSWORD
valueFrom:
secretKeyRef:
key: neo4j-aura-password
name: neo4j-secret
resources:
limits:
memory: '2Gi'
cpu: '500m'
# cpu: '1.0'
I finally found the problem; it was my setup, of course. I wasn't installing any ingress controller, so the Ingress resource wasn't doing anything. The proper approach is to install an ingress controller (https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli), which hooks into the cluster's load balancer, so there is no need to create one yourself. The ingress controller will accept an IP address for its load balancer, but you have to create the public IP in the node resource group, because that is where it looks for it, not in the cluster's resource group. I understood this after reading about the difference between the two resource groups here: https://learn.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks.
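A hedged CLI sketch of the same idea, looking up the node resource group and creating the public IP there (the names mirror the Terraform below):
NODE_RG=$(az aks show --resource-group fixit-resource-group --name server_cluster --query nodeResourceGroup -o tsv)
az network public-ip create --resource-group "$NODE_RG" --name fixit-public-ip --sku Basic --allocation-method Static --dns-name fixit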
So the working configuration is now:
main
terraform {
required_version = ">=1.1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0.2"
}
}
}
provider "azurerm" {
features {
resource_group {
prevent_deletion_if_contains_resources = false
}
}
subscription_id = var.azure_subscription_id
tenant_id = var.azure_subscription_tenant_id
client_id = var.service_principal_appid
client_secret = var.service_principal_password
}
provider "kubernetes" {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
provider "helm" {
kubernetes {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
}
module "cluster" {
source = "./modules/cluster"
location = var.location
vm_size = var.vm_size
resource_group_name = var.resource_group_name
node_resource_group_name = var.node_resource_group_name
kubernetes_version = var.kubernetes_version
ssh_key = var.ssh_key
sp_client_id = var.service_principal_appid
sp_client_secret = var.service_principal_password
}
module "ingress-controller" {
source = "./modules/ingress-controller"
public_ip_address = module.cluster.public_ip_address
depends_on = [
module.cluster.public_ip_address
]
}
cluster
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "test"
Team = "DevOps"
}
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
### choose the resource goup to use for the cluster
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
### decide the name of the cluster "node" resource group, if unset will be named automatically
node_resource_group = var.node_resource_group_name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
vm_size = var.vm_size
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
}
service_principal {
client_id = var.sp_client_id
client_secret = var.sp_client_secret
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "basic"
}
http_application_routing_enabled = false
depends_on = [
azurerm_resource_group.resource_group
]
}
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
# resource_group_name = var.resource_group_name
resource_group_name = var.node_resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
# sku = "Standard"
depends_on = [
azurerm_kubernetes_cluster.server_cluster
]
}
ingress controller
resource "helm_release" "nginx" {
name = "ingress-nginx"
repository = "ingress-nginx"
chart = "ingress-nginx/ingress-nginx"
namespace = "default"
set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-internal"
value = "true"
}
set {
name = "controller.service.loadBalancerIP"
value = var.public_ip_address
}
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
value = "/healthz"
}
}
ingress service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress-service
# namespace: default
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "false"
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
spec:
ingressClassName: nginx
rules:
# - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP
### Node.js server
- http:
paths:
- path: /(/|$)(.*)
pathType: Prefix
backend:
service:
name: server-clusterip-service
port:
number: 80
- http:
paths:
- path: /server(/|$)(.*)
pathType: Prefix
backend:
service:
name: server-clusterip-service
port:
number: 80
...
other services omitted
Hope this can help getting the setup right.
Cheers.
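After terraform apply finishes, one hedged way to confirm the controller actually picked up the static IP is to check the controller Service's external IP (the release is installed in the default namespace, as above):
kubectl get svc ingress-nginx-controller -n default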
I've been trying to configure the cluster on AKS for quite a few days now, but I keep jumping between parts of the docs, various questions here on SO, and articles on Medium, and I keep failing at it.
The goal is to get a static IP with a DNS name that I can use to connect my apps to the server deployed on AKS.
I have created the infrastructure via Terraform; it consists of a resource group in which I created a public IP and the AKS cluster, so far so good.
After trying the ingress controller that gets installed when you use http_application_routing_enabled = true at cluster creation, which the docs discourage for production (https://learn.microsoft.com/en-us/azure/aks/http-application-routing), I'm now trying the recommended way and installing the ingress-nginx controller via Helm (https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli).
In Terraform I'm installing it all like this:
resource group and cluster
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "Test"
Team = "DevOps"
}
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
# vm_size = "standard_b2s_v5"
# vm_size = "standard_e2bs_v5"
vm_size = "standard_b4ms"
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
# enable_node_public_ip = true
}
service_principal {
client_id = var.sp_client_id
client_secret = var.sp_client_secret
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "standard"
# load_balancer_sku = "basic"
}
# http_application_routing_enabled = true
http_application_routing_enabled = false
}
public ip
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
resource_group_name = var.resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
sku = "Standard"
}
load balancer
resource "kubernetes_service" "cluster-ingress" {
metadata {
name = "cluster-ingress-svc"
annotations = {
"service.beta.kubernetes.io/azure-load-balancer-resource-group" = "fixit-resource-group"
# Warning SyncLoadBalancerFailed 2m38s (x8 over 12m) service-controller Error syncing load balancer:
# failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236
# in resource group MC_fixit-resource-group_server_cluster_westeurope
# "service.beta.kubernetes.io/azure-load-balancer-resource-group" = "MC_fixit-resource-group_server_cluster_westeurope"
# kubernetes.io/ingress.class: addon-http-application-routing
}
}
spec {
# type = "Ingress"
type = "LoadBalancer"
load_balancer_ip = var.public_ip_address
selector = {
name = "cluster-ingress-svc"
}
port {
name = "cluster-port"
protocol = "TCP"
port = 3000
target_port = "80"
}
}
}
ingress controller
resource "helm_release" "nginx" {
name = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
namespace = "default"
set {
name = "rbac.create"
value = "false"
}
set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
set {
name = "controller.service.loadBalancerIP"
value = var.public_ip_address
}
set {
name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
value = "true"
}
# --set controller.service.annotations."service\.beta\.kubernetes\.io/azure-load-balancer-health-probe-request-path"=/healthz
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
value = "/healthz"
}
}
but the installation fails with this message from Terraform:
Warning: Helm release "ingress-nginx" was created but has a failed status. Use the `helm` command to investigate the error, correct it, then run Terraform again.
│
│ with module.ingress_controller.helm_release.nginx,
│ on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│ 2: resource "helm_release" "nginx" {
│
╵
╷
│ Error: timed out waiting for the condition
│
│ with module.ingress_controller.helm_release.nginx,
│ on modules/ingress_controller/controller.tf line 2, in resource "helm_release" "nginx":
│ 2: resource "helm_release" "nginx" {
The controller prints out:
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: default
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.2
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: default
service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.173.243
IPs: 10.0.173.243
IP: 52.157.90.236
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 31709/TCP
Endpoints:
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 30045/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 32500
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 32s (x5 over 108s) service-controller Ensuring load balancer
Warning SyncLoadBalancerFailed 31s (x5 over 107s) service-controller Error syncing load balancer: failed to ensure load balancer: findMatchedPIPByLoadBalancerIP: cannot find public IP with IP address 52.157.90.236 in resource group mc_fixit-resource-group_server_cluster_westeurope
vincenzocalia@vincenzos-MacBook-Air helm_charts % az aks show --resource-group fixit-resource-group --name server_cluster --query nodeResourceGroup -o tsv
MC_fixit-resource-group_server_cluster_westeurope
Why is it looking in the MC_fixit-resource-group_server_cluster_westeurope resource group and not in the fixit-resource-group I created for the Cluster, Public IP and Load Balancer?
If I change the controller's loadBalancerIP to the public IP in MC_fixit-resource-group_server_cluster_westeurope, Terraform still outputs the same error, but the controller reports that it is correctly assigned the IP and load balancer:
set {
name = "controller.service.loadBalancerIP"
value = "20.73.192.77" #var.public_ip_address
}
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cluster-ingress-svc LoadBalancer 10.0.110.114 52.157.90.236 3000:31863/TCP 104m
ingress-nginx-controller LoadBalancer 10.0.106.201 20.73.192.77 80:30714/TCP,443:32737/TCP 41m
ingress-nginx-controller-admission ClusterIP 10.0.23.188 <none> 443/TCP 41m
kubernetes ClusterIP 10.0.0.1 <none> 443/TCP 122m
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: default
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.5.1
helm.sh/chart=ingress-nginx-4.4.2
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: default
service: map[beta:map[kubernetes:map[io/azure-load-balancer-internal:true]]]
service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.106.201
IPs: 10.0.106.201
IP: 20.73.192.77
LoadBalancer Ingress: 20.73.192.77
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 30714/TCP
Endpoints:
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 32737/TCP
Endpoints:
Session Affinity: None
External Traffic Policy: Local
HealthCheck NodePort: 32538
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 39m (x2 over 41m) service-controller Ensuring load balancer
Normal EnsuredLoadBalancer 39m (x2 over 41m) service-controller Ensured load balancer
vincenzocalia@vincenzos-MacBook-Air helm_charts %
Reading here https://learn.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks
To enable this architecture, each AKS deployment spans two resource groups:
You create the first resource group. This group contains only the Kubernetes service resource. The AKS resource provider automatically creates the second resource group during deployment. An example of the second resource group is MC_myResourceGroup_myAKSCluster_eastus. For information on how to specify the name of this second resource group, see the next section.
The second resource group, known as the node resource group, contains all of the infrastructure resources associated with the cluster. These resources include the Kubernetes node VMs, virtual networking, and storage. By default, the node resource group has a name like MC_myResourceGroup_myAKSCluster_eastus. AKS automatically deletes the node resource group whenever the cluster is deleted, so it should only be used for resources that share the cluster's lifecycle.
Should I pass the first or the second group, depending on what kind of resource I'm creating?
E.g. kubernetes_service needs the 1st RG, while azurerm_public_ip needs the 2nd RG?
What is it that I'm missing here?
Please explain it like I'm 5 years old, because that's how I'm feeling right now.
Many thanks
Finally found what the problem was.
Indeed, the public IP needs to be created in the node resource group, because the ingress controller, with loadBalancerIP set to the public IP address, is going to look for it in the node resource group; if you create it in the cluster's resource group, it fails with the error I was getting.
The node resource group name is assigned at cluster creation, e.g. MC_myResourceGroup_myAKSCluster_eastus, but you can name it as you wish using the parameter node_resource_group = var.node_resource_group_name.
Also, the public IP sku ("Standard", which must be specified, or "Basic", the default) and the cluster load_balancer_sku ("standard" or "basic"; there is no default here, it needs to be specified) have to match.
I also put the public IP in the cluster module so it can depend on the cluster, to avoid it being created before the node resource group exists and failing; I couldn't set that dependency correctly in the main.tf file.
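Before applying, a quick hedged check that the two SKUs line up (the node resource group is whatever node_resource_group_name is set to):
az network public-ip show --resource-group <node-resource-group> --name fixit-public-ip --query sku.name -o tsv
az aks show --resource-group fixit-resource-group --name server_cluster --query networkProfile.loadBalancerSku -o tsv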
So the working configuration is now:
main
terraform {
required_version = ">=1.1.0"
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0.2"
}
}
}
provider "azurerm" {
features {
resource_group {
prevent_deletion_if_contains_resources = false
}
}
subscription_id = var.azure_subscription_id
tenant_id = var.azure_subscription_tenant_id
client_id = var.service_principal_appid
client_secret = var.service_principal_password
}
provider "kubernetes" {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
provider "helm" {
kubernetes {
host = "${module.cluster.host}"
client_certificate = "${base64decode(module.cluster.client_certificate)}"
client_key = "${base64decode(module.cluster.client_key)}"
cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
}
}
module "cluster" {
source = "./modules/cluster"
location = var.location
vm_size = var.vm_size
resource_group_name = var.resource_group_name
node_resource_group_name = var.node_resource_group_name
kubernetes_version = var.kubernetes_version
ssh_key = var.ssh_key
sp_client_id = var.service_principal_appid
sp_client_secret = var.service_principal_password
}
module "ingress-controller" {
source = "./modules/ingress-controller"
public_ip_address = module.cluster.public_ip_address
depends_on = [
module.cluster.public_ip_address
]
}
cluster
resource "azurerm_resource_group" "resource_group" {
name = var.resource_group_name
location = var.location
tags = {
Environment = "test"
Team = "DevOps"
}
}
resource "azurerm_kubernetes_cluster" "server_cluster" {
name = "server_cluster"
### choose the resource goup to use for the cluster
location = azurerm_resource_group.resource_group.location
resource_group_name = azurerm_resource_group.resource_group.name
### decide the name of the cluster "node" resource group, if unset will be named automatically
node_resource_group = var.node_resource_group_name
dns_prefix = "fixit"
kubernetes_version = var.kubernetes_version
# sku_tier = "Paid"
default_node_pool {
name = "default"
node_count = 1
min_count = 1
max_count = 3
vm_size = var.vm_size
type = "VirtualMachineScaleSets"
enable_auto_scaling = true
enable_host_encryption = false
# os_disk_size_gb = 30
}
service_principal {
client_id = var.sp_client_id
client_secret = var.sp_client_secret
}
tags = {
Environment = "Production"
}
linux_profile {
admin_username = "azureuser"
ssh_key {
key_data = var.ssh_key
}
}
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "basic"
}
http_application_routing_enabled = false
depends_on = [
azurerm_resource_group.resource_group
]
}
resource "azurerm_public_ip" "public-ip" {
name = "fixit-public-ip"
location = var.location
# resource_group_name = var.resource_group_name
resource_group_name = var.node_resource_group_name
allocation_method = "Static"
domain_name_label = "fixit"
# sku = "Standard"
depends_on = [
azurerm_kubernetes_cluster.server_cluster
]
}
ingress controller
resource "helm_release" "nginx" {
name = "ingress-nginx"
repository = "ingress-nginx"
chart = "ingress-nginx/ingress-nginx"
namespace = "default"
set {
name = "controller.service.externalTrafficPolicy"
value = "Local"
}
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-internal"
value = "true"
}
set {
name = "controller.service.loadBalancerIP"
value = var.public_ip_address
}
set {
name = "controller.service.annotations.service\\.beta\\.kubernetes\\.io/azure-load-balancer-health-probe-request-path"
value = "/healthz"
}
}
ingress service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: ingress-service
# namespace: default
annotations:
nginx.ingress.kubernetes.io/ssl-redirect: "false"
nginx.ingress.kubernetes.io/use-regex: "true"
nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
spec:
ingressClassName: nginx
rules:
# - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP
### Node.js server
- http:
paths:
- path: /(/|$)(.*)
pathType: Prefix
backend:
service:
name: server-clusterip-service
port:
number: 80
- http:
paths:
- path: /server(/|$)(.*)
pathType: Prefix
backend:
service:
name: server-clusterip-service
port:
number: 80
...
other services omitted
Hope this can help others having difficulties in getting the setup right.
Cheers.
I've created a Kubernetes cluster in Azure using the following Terraform. As you can see, I have passed the Application Gateway ID to ingress_application_gateway.
# Create the Azure Kubernetes Service (AKS) Cluster
resource "azurerm_kubernetes_cluster" "kubernetes_cluster" {
count = var.enable_kubernetes == true ? 1 : 0
name = "aks-prjx-${var.subscription_type}-${var.environment}-${var.location}-${var.instance_number}"
location = var.location
resource_group_name = module.resource_group_kubernetes_cluster[0].name # "rg-aks-spoke-dev-westus3-001"
dns_prefix = "dns-aks-prjx-${var.subscription_type}-${var.environment}-${var.location}-${var.instance_number}" #"dns-prjxcluster"
private_cluster_enabled = false
local_account_disabled = true
default_node_pool {
name = "npprjx${var.subscription_type}" #"prjxsyspool" # NOTE: "name must start with a lowercase letter, have max length of 12, and only have characters a-z0-9."
vm_size = "Standard_B8ms"
vnet_subnet_id = data.azurerm_subnet.aks-subnet.id
# zones = ["1", "2", "3"]
enable_auto_scaling = true
max_count = 3
min_count = 1
# node_count = 3
os_disk_size_gb = 50
type = "VirtualMachineScaleSets"
enable_node_public_ip = false
enable_host_encryption = false
node_labels = {
"node_pool_type" = "npprjx${var.subscription_type}"
"node_pool_os" = "linux"
"environment" = "${var.environment}"
"app" = "prjx_${var.subscription_type}_app"
}
tags = var.tags
}
ingress_application_gateway {
gateway_id = azurerm_application_gateway.network.id
}
# Enable Azure AD integration with Azure RBAC for the cluster
azure_active_directory_role_based_access_control {
managed = true
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = true #false
}
network_profile {
network_plugin = "azure"
network_policy = "azure"
outbound_type = "userDefinedRouting"
}
identity {
type = "SystemAssigned"
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [
azurerm_application_gateway.network
]
}
I was expecting the Application Gateway to be used as the ingress gateway. However, AKS creates an Azure load balancer when I deploy the following Service:
apiVersion: v1
kind: Service
metadata:
name: aks-helloworld
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: aks-helloworld-two
Is there a reason for this load balancer being created and the Application Gateway not being used? I would assume the load balancer is used for Services of type LoadBalancer and the Application Gateway is used for Ingress.
I tried to reproduce the same setup in my environment to create a Service behind the Application Gateway Ingress Controller.
Since you specified type: LoadBalancer in your Service YAML, AKS creates a load-balancing service for it.
To expose a service through the Application Gateway Ingress Controller without an associated load balancer, follow the steps below.
1. First, create a Deployment for your application with the desired replicas and image.
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-app
spec:
replicas: 2
selector:
matchLabels:
app: nginx
template:
metadata:
name: test-app
labels:
app: nginx
spec:
containers:
- name: nginx
image: "nginx:latest"
ports:
- containerPort: 80
Command to check the deployed app: kubectl get deploy
2. Next, create a Service for your application with type: ClusterIP.
Note: if you create a Service of type: LoadBalancer for your application, it will be created with a load balancer.
apiVersion: v1
kind: Service
metadata:
name: nginx-service
labels:
app: nginx
spec:
selector:
app: nginx
ports:
- port: 80
targetPort: 80
protocol: TCP
The Service is created with type: ClusterIP.
Command to check the service: kubectl get svc
3. Create an Ingress resource for your application to route traffic to the Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: nginxapp
annotations:
kubernetes.io/ingress.class: azure/application-gateway
spec:
rules:
- http:
paths:
- pathType: Exact
path: /
backend:
service:
name: nginx-service
port:
number: 80
The Ingress is created successfully.
Once all resources are created, the Application Gateway Ingress Controller routes traffic to the service without creating an associated load balancer.
The application is running successfully on the Application Gateway's public IP.
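A quick hedged way to confirm the traffic path, using the resource names from the manifests above; the Ingress ADDRESS should be the Application Gateway's public IP, while the Service stays ClusterIP with no EXTERNAL-IP, so no extra load balancer is created:
kubectl get ingress nginxapp
kubectl get svc nginx-service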
I have configured an AKS cluster through Terraform. By default it deploys an external load balancer with a backend pool pointing at the default node pool's VM scale set.
I now would like to configure a second (internal) Load Balancer with a backend pool pointing at that same VM Scale Set. Is this possible? If so, how do I get a reference to that scale set? And how do I attach the load balancer to the scale set?
Config of the load balancer:
resource "azurerm_lb" "aks-internal-lb" {
name = "${local.resource_prefix}-internal-lb"
location = azurerm_resource_group.aks_rg.location
resource_group_name = azurerm_resource_group.aks_rg.name
sku = "Standard"
frontend_ip_configuration {
name = "InternalIPAddress"
private_ip_address = var.aks_internal_lb_ip
private_ip_address_allocation = "Static"
subnet_id = data.terraform_remote_state.net.outputs.aks_subnet_id
}
}
resource "azurerm_lb_backend_address_pool" "aks-internal-lb-be-pool" {
loadbalancer_id = azurerm_lb.aks-internal-lb.id
name = "InternalBackEndAddressPool"
}
The corresponding AKS config:
resource "azurerm_kubernetes_cluster" "k8s" {
name = "${local.resource_prefix}-k8s"
location = azurerm_resource_group.aks_rg.location
resource_group_name = azurerm_resource_group.aks_rg.name
dns_prefix = local.resource_prefix
private_dns_zone_id = "System"
private_cluster_enabled = true
default_node_pool {
name = "defaultpool"
node_count = 3
vm_size = "Standard_D2s_v3"
vnet_subnet_id = data.terraform_remote_state.net.outputs.aks_subnet_id
availability_zones = [ 1, 2, 3 ]
max_pods = 110
}
identity {
type = "SystemAssigned"
}
network_profile {
network_plugin = "azure"
}
}
What is the purpose of this load balancer? Do you want to use it for the ingress controller? If so, you can't use an existing LB created with Terraform.
If you create a Service inside AKS with type: LoadBalancer, it will automatically create a load balancer for you in the node resource group:
External Load Balancer:
spec:
type: LoadBalancer
loadBalancerIP: 53.1.1.1
Internal Load Balancer:
metadata:
name: internal-app
annotations:
service.beta.kubernetes.io/azure-load-balancer-internal: "true"
# If you use a different subnet for the Ingress, add this:
service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "apps-subnet"
spec:
type: LoadBalancer
loadBalancerIP: 10.240.0.25
Here is the documentation: External-LB and Internal-LB.
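A minimal hedged check of the result, assuming the internal Service above is applied as internal-app: the Service should receive a private frontend IP from the subnet, and an internal load balancer (typically named kubernetes-internal) appears in the node resource group.
kubectl get service internal-app --watch
az network lb list --resource-group <node-resource-group> -o table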