Terraform: error when adding Diagnostic setting to Azure App Service

See the configuration below, which I am using to add a diagnostic setting that sends App Service logs to a Log Analytics workspace.
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_diagnostic_setting
resource "azurerm_app_service" "webapp" {
for_each = local.apsvc_map_with_locations
name = "${var.regional_web_rg[each.value.location].name}-${each.value.apsvc_name}-apsvc"
location = each.value.location
resource_group_name = var.regional_web_rg[each.value.location].name
app_service_plan_id = azurerm_app_service_plan.asp[each.value.location].id
https_only = true
identity {
type = "UserAssigned"
identity_ids = each.value.identity_ids
}
}
resource "azurerm_monitor_diagnostic_setting" "example" {
for_each = local.apsvc_map_with_locations
name = "example"
target_resource_id = "azurerm_app_service.webapp[${each.value.location}-${each.value.apsvc_name}].id"
log_analytics_workspace_id = data.terraform_remote_state.pod_bootstrap.outputs.pod_log_analytics_workspace.id
log {
category = "AuditEvent"
enabled = false
retention_policy {
enabled = false
}
}
metric {
category = "AllMetrics"
retention_policy {
enabled = false
}
}
}
Error:
Can not parse "target_resource_id" as a resource id: Cannot parse Azure ID: parse "azurerm_app_service.webapp[].id": invalid URI for request
2020-11-06T20:19:59.3344089Z
2020-11-06T20:19:59.3346016Z on .terraform\modules\web.web\pipeline\app\apsvc\app_hosting.tf line 127, in resource "azurerm_monitor_diagnostic_setting" "example":
2020-11-06T20:19:59.3346956Z 127: resource "azurerm_monitor_diagnostic_setting" "example" {
2020-11-06T20:19:59.3347091Z

You can try the following:
target_resource_id = azurerm_app_service.webapp["${each.value.location}-${each.value.apsvc_name}"].id
instead of:
target_resource_id = "azurerm_app_service.webapp[${each.value.location}-${each.value.apsvc_name}].id"
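The quoted version is a literal string, so the provider hands it to Azure to parse as a resource ID and fails; unquoted, it is a real reference into the for_each instances. A minimal sketch of the corrected resource (assuming the same local.apsvc_map_with_locations map as in the question, and that its keys have the form "<location>-<apsvc_name>"):

```hcl
resource "azurerm_monitor_diagnostic_setting" "example" {
  for_each = local.apsvc_map_with_locations
  name     = "example"

  # An expression Terraform evaluates, not a string Azure has to parse
  target_resource_id         = azurerm_app_service.webapp["${each.value.location}-${each.value.apsvc_name}"].id
  log_analytics_workspace_id = data.terraform_remote_state.pod_bootstrap.outputs.pod_log_analytics_workspace.id

  metric {
    category = "AllMetrics"
    retention_policy {
      enabled = false
    }
  }
}
```

Since both resources iterate over the same map, azurerm_app_service.webapp[each.key].id would also work and avoids assuming anything about the key format.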

Related

How to send the AKS application Cluster, Node, Pod, Container metrics to Log Analytics workspace so that it will be available in Azure Monitoring?

I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
# Create Log Analytics Workspace
module "log_analytics_workspace" {
source = "./modules/log_analytics_workspace"
count = var.enable_log_analytics_workspace == true ? 1 : 0
app_or_service_name = "log"
subscription_type = var.subscription_type
environment = var.environment
resource_group_name = azurerm_resource_group.rg.name
location = var.location
instance_number = var.instance_number
sku = var.log_analytics_workspace_sku
retention_in_days = var.log_analytics_workspace_retention_in_days
tags = var.tags
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enabled the cluster configuration to the Azure kubernets with RBAC
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
and I want to send the AKS application, cluster, node, pod, and container metrics to a Log Analytics workspace so that they will be available in Azure Monitoring.
I have configured the diagnostic setting as shown below:
resource "azurerm_monitor_diagnostic_setting" "aks_cluster" {
name = "${azurerm_kubernetes_cluster.k8s.name}-audit"
target_resource_id = azurerm_kubernetes_cluster.k8s.id
log_analytics_workspace_id = module.log_analytics_workspace[0].id
log {
category = "kube-apiserver"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-controller-manager"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "cluster-autoscaler"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-scheduler"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-audit"
enabled = true
retention_policy {
enabled = false
}
}
metric {
category = "AllMetrics"
enabled = false
retention_policy {
enabled = false
}
}
}
Is that all that's needed? I came across an article that used azurerm_application_insights, and I don't understand why azurerm_application_insights would be needed to capture cluster-level metrics.
You do not need Application Insights; it really depends on whether you want application-level monitoring.
This is probably what you read:
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/application_insights
"Manages an Application Insights component."
Application Insights provides complete monitoring of applications running on AKS and other environments.
https://learn.microsoft.com/en-us/azure/aks/monitor-aks#level-4--applications
As good practice, you should enable a few others:
guard should be enabled, assuming you use AAD.
enable AllMetrics.
consider kube-audit-admin for reduced logging events.
consider csi-azuredisk-controller.
consider cloud-controller-manager for the cloud-node-manager component.
See more here:
https://learn.microsoft.com/en-us/azure/aks/monitor-aks#configure-monitoring
https://learn.microsoft.com/en-us/azure/aks/monitor-aks-reference
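As a sketch, the extra categories above would be added as further blocks inside the same azurerm_monitor_diagnostic_setting (category names follow the AKS monitoring reference; which ones are available depends on your cluster configuration):

```hcl
log {
  category = "guard" # AAD-integrated clusters
  enabled  = true
  retention_policy { enabled = false }
}
log {
  category = "kube-audit-admin" # lower-volume alternative to kube-audit
  enabled  = true
  retention_policy { enabled = false }
}
log {
  category = "csi-azuredisk-controller"
  enabled  = true
  retention_policy { enabled = false }
}
log {
  category = "cloud-controller-manager"
  enabled  = true
  retention_policy { enabled = false }
}
metric {
  category = "AllMetrics"
  enabled  = true
  retention_policy { enabled = false }
}
```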

Pass one resource's variable to another

I am creating an Azure App Service resource and an App Registration resource (and app service and others that are not relevant to this question as they work fine) via Terraform.
resource "azurerm_app_service" "app" {
name = var.app_service_name
location = var.resource_group_location
resource_group_name = azurerm_resource_group.rg.name
app_service_plan_id = azurerm_app_service_plan.plan-app.id
app_settings = {
"AzureAd:ClientId" = azuread_application.appregistration.application_id
}
site_config {
ftps_state = var.app_service_ftps_state
}
}
resource "azuread_application" "appregistration" {
display_name = azurerm_app_service.app.name
owners = [data.azuread_client_config.current.object_id]
sign_in_audience = "AzureADMyOrg"
fallback_public_client_enabled = true
web {
homepage_url = var.appreg_web_homepage_url
logout_url = var.appreg_web_logout_url
redirect_uris = [var.appreg_web_homepage_url, var.appreg_web_redirect_uri]
implicit_grant {
access_token_issuance_enabled = true
id_token_issuance_enabled = true
}
}
}
output "appreg_application_id" {
value = azuread_application.appregistration.application_id
}
I need to add the App Registration client / application id to the app_settings block in the app service resource.
The error I get with the above configuration is:
{"#level":"error","#message":"Error: Cycle: azuread_application.appregistration, azurerm_app_service.app","#module":"terraform.ui","#timestamp":"2021-09-15T10:54:31.753401Z","diagnostic":{"severity":"error","summary":"Cycle: azuread_application.appregistration, azurerm_app_service.app","detail":""},"type":"diagnostic"}
Note that the output variable displays the application id correctly.
You have a cycle error because the two resources reference each other. Terraform builds a directed acyclic graph to work out the order in which to create (or destroy) resources; information flowing from one resource or data source into another normally determines this order.
In your case your azuread_application.appregistration resource is referencing the azurerm_app_service.app.name parameter while the azurerm_app_service.app resource needs the azuread_application.appregistration.application_id attribute.
I don't know a ton about Azure, but to me it seems like the azuread_application resource needs to be created ahead of the azurerm_app_service resource, so I'd expect the dependency to point in that direction.
Because you are already setting the azurerm_app_service.app.name parameter to var.app_service_name, you can pass var.app_service_name directly to azuread_application.appregistration.display_name to achieve the same result while breaking the cycle.
resource "azurerm_app_service" "app" {
name = var.app_service_name
location = var.resource_group_location
resource_group_name = azurerm_resource_group.rg.name
app_service_plan_id = azurerm_app_service_plan.plan-app.id
app_settings = {
"AzureAd:ClientId" = azuread_application.appregistration.application_id
}
site_config {
ftps_state = var.app_service_ftps_state
}
}
resource "azuread_application" "appregistration" {
display_name = var.app_service_name
owners = [data.azuread_client_config.current.object_id]
sign_in_audience = "AzureADMyOrg"
fallback_public_client_enabled = true
web {
homepage_url = var.appreg_web_homepage_url
logout_url = var.appreg_web_logout_url
redirect_uris = [var.appreg_web_homepage_url, var.appreg_web_redirect_uri]
implicit_grant {
access_token_issuance_enabled = true
id_token_issuance_enabled = true
}
}
}
output "appreg_application_id" {
value = azuread_application.appregistration.application_id
}

AKS via Terraform Error: Code="CustomRouteTableWithUnsupportedMSIType"

I am trying to create a private AKS cluster via Terraform using an existing VNet and subnet. I was able to create the cluster before, but suddenly the error below appeared.
│ Error: creating Managed Kubernetes Cluster "demo-azwe-aks-cluster" (Resource Group "demo-azwe-aks-rg"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: Code="CustomRouteTableWithUnsupportedMSIType" Message="Clusters using managed identity type SystemAssigned do not support bringing your own route table. Please see https://aka.ms/aks/customrt for more information"
│
│ with azurerm_kubernetes_cluster.aks_cluster,
│ on aks_cluster.tf line 30, in resource "azurerm_kubernetes_cluster" "aks_cluster":
│ 30: resource "azurerm_kubernetes_cluster" "aks_cluster" {
# Provision AKS Cluster
resource "azurerm_kubernetes_cluster" "aks_cluster" {
name = "${var.global-prefix}-${var.cluster-id}-${var.environment}-azwe-aks-cluster"
location = "${var.location}"
resource_group_name = azurerm_resource_group.aks_rg.name
dns_prefix = "${var.global-prefix}-${var.cluster-id}-${var.environment}-azwe-aks-cluster"
kubernetes_version = data.azurerm_kubernetes_service_versions.current.latest_version
node_resource_group = "${var.global-prefix}-${var.cluster-id}-${var.environment}-azwe-aks-nrg"
private_cluster_enabled = true
default_node_pool {
name = "dpool"
vm_size = "Standard_DS2_v2"
orchestrator_version = data.azurerm_kubernetes_service_versions.current.latest_version
availability_zones = [1, 2, 3]
enable_auto_scaling = true
max_count = 2
min_count = 1
os_disk_size_gb = 30
type = "VirtualMachineScaleSets"
vnet_subnet_id = data.azurerm_subnet.aks.id
node_labels = {
"nodepool-type" = "system"
"environment" = "${var.environment}"
"nodepoolos" = "${var.nodepool-os}"
"app" = "system-apps"
}
tags = {
"nodepool-type" = "system"
"environment" = "dev"
"nodepoolos" = "linux"
"app" = "system-apps"
}
}
# Identity (System Assigned or Service Principal)
identity {
type = "SystemAssigned"
}
# Add On Profiles
addon_profile {
azure_policy {enabled = true}
oms_agent {
enabled = true
log_analytics_workspace_id = azurerm_log_analytics_workspace.insights.id
}
}
# RBAC and Azure AD Integration Block
role_based_access_control {
enabled = true
azure_active_directory {
managed = true
admin_group_object_ids = [azuread_group.aks_administrators.id]
}
}
# Linux Profile
linux_profile {
admin_username = "ubuntu"
ssh_key {
key_data = file(var.ssh_public_key)
}
}
# Network Profile
network_profile {
network_plugin = "kubenet"
load_balancer_sku = "Standard"
}
tags = {
Environment = "prod"
}
}
# Create Azure AD Group in Active Directory for AKS Admins
resource "azuread_group" "aks_administrators" {
name = "${azurerm_resource_group.aks_rg.name}-cluster-administrators"
description = "Azure AKS Kubernetes administrators for the ${azurerm_resource_group.aks_rg.name}-cluster."
}
You are trying to create a private AKS cluster with an existing VNet and existing subnets for both AKS and the firewall. As the error "CustomRouteTableWithUnsupportedMSIType" indicates, you need a user-assigned managed identity in order to bring your own route table, with the Network Contributor role assigned to it.
The network plugin will be azure instead of kubenet, as you are using an Azure VNet and its subnet.
Add-ons you can use as per your requirement, but please ensure you use a data block for the workspace; otherwise you can give the resource ID directly. So, instead of
log_analytics_workspace_id = azurerm_log_analytics_workspace.insights.id
you can use
log_analytics_workspace_id = "/subscriptions/SubscriptionID/resourcegroups/resourcegroupname/providers/microsoft.operationalinsights/workspaces/workspacename"
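For example, a data block for an existing workspace could look like this (the workspace and resource group names are placeholders):

```hcl
data "azurerm_log_analytics_workspace" "existing" {
  name                = "workspacename"
  resource_group_name = "resourcegroupname"
}

# then reference it in the add-on profile:
# log_analytics_workspace_id = data.azurerm_log_analytics_workspace.existing.id
```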
Example of creating a private cluster with an existing VNet and subnets (I haven't added the add-ons):
provider "azurerm" {
features {}
}
#resource group as this will be referred to in managed identity creation
data "azurerm_resource_group" "base" {
name = "resourcegroupname"
}
#exisiting vnet
data "azurerm_virtual_network" "base" {
name = "ansuman-vnet"
resource_group_name = data.azurerm_resource_group.base.name
}
#exisiting subnets
data "azurerm_subnet" "aks" {
name = "akssubnet"
resource_group_name = data.azurerm_resource_group.base.name
virtual_network_name = data.azurerm_virtual_network.base.name
}
data "azurerm_subnet" "firewall" {
name = "AzureFirewallSubnet"
resource_group_name = data.azurerm_resource_group.base.name
virtual_network_name = data.azurerm_virtual_network.base.name
}
#user assigned identity required to create route table
resource "azurerm_user_assigned_identity" "base" {
resource_group_name = data.azurerm_resource_group.base.name
location = data.azurerm_resource_group.base.location
name = "mi-name"
}
#role assignment required to create route table
resource "azurerm_role_assignment" "base" {
scope = data.azurerm_resource_group.base.id
role_definition_name = "Network Contributor"
principal_id = azurerm_user_assigned_identity.base.principal_id
}
#route table
resource "azurerm_route_table" "base" {
name = "rt-aksroutetable"
location = data.azurerm_resource_group.base.location
resource_group_name = data.azurerm_resource_group.base.name
}
#route
resource "azurerm_route" "base" {
name = "dg-aksroute"
resource_group_name = data.azurerm_resource_group.base.name
route_table_name = azurerm_route_table.base.name
address_prefix = "0.0.0.0/0"
next_hop_type = "VirtualAppliance"
next_hop_in_ip_address = azurerm_firewall.base.ip_configuration.0.private_ip_address
}
#route table association
resource "azurerm_subnet_route_table_association" "base" {
subnet_id = data.azurerm_subnet.aks.id
route_table_id = azurerm_route_table.base.id
}
#firewall
resource "azurerm_public_ip" "base" {
name = "pip-firewall"
location = data.azurerm_resource_group.base.location
resource_group_name = data.azurerm_resource_group.base.name
allocation_method = "Static"
sku = "Standard"
}
resource "azurerm_firewall" "base" {
name = "fw-akscluster"
location = data.azurerm_resource_group.base.location
resource_group_name = data.azurerm_resource_group.base.name
ip_configuration {
name = "ip-firewallakscluster"
subnet_id = data.azurerm_subnet.firewall.id
public_ip_address_id = azurerm_public_ip.base.id
}
}
#kubernetes_cluster
resource "azurerm_kubernetes_cluster" "base" {
name = "testakscluster"
location = data.azurerm_resource_group.base.location
resource_group_name = data.azurerm_resource_group.base.name
dns_prefix = "dns-testakscluster"
private_cluster_enabled = true
network_profile {
network_plugin = "azure"
outbound_type = "userDefinedRouting"
}
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
vnet_subnet_id = data.azurerm_subnet.aks.id
}
identity {
type = "UserAssigned"
user_assigned_identity_id = azurerm_user_assigned_identity.base.id
}
depends_on = [
azurerm_route.base,
azurerm_role_assignment.base
]
}
Output: (screenshots of the Terraform plan, Terraform apply, and the Azure portal omitted)
Note: By default, Azure requires the firewall subnet to be named AzureFirewallSubnet. If you use a subnet with any other name for the firewall, creation will error out, so please ensure the existing subnet used by the firewall is named AzureFirewallSubnet.

How to send AKS master logs to eventhub using terraform?

How can I send AKS master logs to an Event Hub using the azurerm Terraform provider? Terraform only seems to offer the Log Analytics option.
In order to send logs to an Event Hub using Terraform, you need to create a few resources:
Event Hub Namespace (azurerm_eventhub_namespace)
Event Hub (azurerm_eventhub)
Authorization Rule for an Event Hub Namespace (azurerm_eventhub_namespace_authorization_rule)
Diagnostic Setting for an existing Resource (azurerm_monitor_diagnostic_setting)
The following example is based on this repo.
# Create the AKS cluster
resource "azurerm_resource_group" "example" {
name = "example-resources"
location = "West Europe"
}
resource "azurerm_kubernetes_cluster" "example" {
name = "example-aks1"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
dns_prefix = "exampleaks1"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
tags = {
Environment = "Production"
}
}
# Create Event hub namespace
resource "azurerm_eventhub_namespace" "logging" {
name = "logging-eventhub"
location = "${azurerm_resource_group.example.location}"
resource_group_name = "${azurerm_resource_group.example.name}"
sku = "Standard"
capacity = 1
kafka_enabled = false
}
# Create Event hub
resource "azurerm_eventhub" "logging_aks" {
name = "logging-aks-eventhub"
namespace_name = "${azurerm_eventhub_namespace.logging.name}"
resource_group_name = "${azurerm_resource_group.example.name}"
partition_count = 2
message_retention = 1
}
# Create an authorization rule
resource "azurerm_eventhub_namespace_authorization_rule" "logging" {
name = "authorization_rule"
namespace_name = "${azurerm_eventhub_namespace.logging.name}"
resource_group_name = "${azurerm_resource_group.example.name}"
listen = true
send = true
manage = true
}
# Manages a Diagnostic Setting for an existing Resource
resource "azurerm_monitor_diagnostic_setting" "aks-logging" {
name = "diagnostic_aksl"
target_resource_id = "${azurerm_kubernetes_cluster.example.id}"
eventhub_name = "${azurerm_eventhub.logging_aks.name}"
eventhub_authorization_rule_id = "${azurerm_eventhub_namespace_authorization_rule.logging.id}"
log {
category = "kube-scheduler"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-controller-manager"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "cluster-autoscaler"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-audit"
enabled = true
retention_policy {
enabled = false
}
}
log {
category = "kube-apiserver"
enabled = true
retention_policy {
enabled = false
}
}
}

How to give permissions to AKS to access ACR via terraform?

Question and details
How can I allow a Kubernetes cluster in Azure to talk to an Azure Container Registry via terraform?
I want to load custom images from my Azure Container Registry. Unfortunately, I encounter a permissions error at the point where Kubernetes is supposed to download the image from the ACR.
What I have tried so far
My experiments without terraform (az cli)
It all works perfectly after I attach the ACR to the AKS cluster via the az CLI:
az aks update -n myAKSCluster -g myResourceGroup --attach-acr <acrName>
My experiments with terraform
This is my Terraform configuration; I have stripped some other stuff out. It works on its own.
terraform {
backend "azurerm" {
resource_group_name = "tf-state"
storage_account_name = "devopstfstate"
container_name = "tfstatetest"
key = "prod.terraform.tfstatetest"
}
}
provider "azurerm" {
}
provider "azuread" {
}
provider "random" {
}
# define the password
resource "random_string" "password" {
length = 32
special = true
}
# define the resource group
resource "azurerm_resource_group" "rg" {
name = "myrg"
location = "eastus2"
}
# define the app
resource "azuread_application" "tfapp" {
name = "mytfapp"
}
# define the service principal
resource "azuread_service_principal" "tfapp" {
application_id = azuread_application.tfapp.application_id
}
# define the service principal password
resource "azuread_service_principal_password" "tfapp" {
service_principal_id = azuread_service_principal.tfapp.id
end_date = "2020-12-31T09:00:00Z"
value = random_string.password.result
}
# define the container registry
resource "azurerm_container_registry" "acr" {
name = "mycontainerregistry2387987222"
resource_group_name = azurerm_resource_group.rg.name
location = azurerm_resource_group.rg.location
sku = "Basic"
admin_enabled = false
}
# define the kubernetes cluster
resource "azurerm_kubernetes_cluster" "mycluster" {
name = "myaks"
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
dns_prefix = "mycluster"
network_profile {
network_plugin = "azure"
}
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_B2s"
}
# Use the service principal created above
service_principal {
client_id = azuread_service_principal.tfapp.application_id
client_secret = azuread_service_principal_password.tfapp.value
}
tags = {
Environment = "demo"
}
windows_profile {
admin_username = "dingding"
admin_password = random_string.password.result
}
}
# define the windows node pool for kubernetes
resource "azurerm_kubernetes_cluster_node_pool" "winpool" {
name = "winp"
kubernetes_cluster_id = azurerm_kubernetes_cluster.mycluster.id
vm_size = "Standard_B2s"
node_count = 1
os_type = "Windows"
}
# define the kubernetes name space
resource "kubernetes_namespace" "namesp" {
metadata {
name = "namesp"
}
}
# Try to give permissions, to let the AKS cluster access the ACR
resource "azurerm_role_assignment" "acrpull_role" {
scope = azurerm_container_registry.acr.id
role_definition_name = "AcrPull"
principal_id = azuread_service_principal.tfapp.object_id
skip_service_principal_aad_check = true
}
This code is adapted from https://github.com/terraform-providers/terraform-provider-azuread/issues/104.
Unfortunately, when I launch a container inside the kubernetes cluster, I receive an error message:
Failed to pull image "mycontainerregistry.azurecr.io/myunittests": [rpc error: code = Unknown desc = Error response from daemon: manifest for mycontainerregistry.azurecr.io/myunittests:latest not found: manifest unknown: manifest unknown, rpc error: code = Unknown desc = Error response from daemon: Get https://mycontainerregistry.azurecr.io/v2/myunittests/manifests/latest: unauthorized: authentication required]
Update / note:
When I run terraform apply with the above code, the creation of resources is interrupted:
azurerm_container_registry.acr: Creation complete after 18s [id=/subscriptions/000/resourceGroups/myrg/providers/Microsoft.ContainerRegistry/registries/mycontainerregistry2387987222]
azurerm_role_assignment.acrpull_role: Creating...
azuread_service_principal_password.tfapp: Still creating... [10s elapsed]
azuread_service_principal_password.tfapp: Creation complete after 12s [id=000/000]
azurerm_kubernetes_cluster.mycluster: Creating...
azurerm_role_assignment.acrpull_role: Creation complete after 8s [id=/subscriptions/000/resourceGroups/myrg/providers/Microsoft.ContainerRegistry/registries/mycontainerregistry2387987222/providers/Microsoft.Authorization/roleAssignments/000]
azurerm_kubernetes_cluster.mycluster: Still creating... [10s elapsed]
Error: Error creating Managed Kubernetes Cluster "myaks" (Resource Group "myrg"): containerservice.ManagedClustersClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code="ServicePrincipalNotFound" Message="Service principal clientID: 000 not found in Active Directory tenant 000, Please see https://aka.ms/aks-sp-help for more details."
on test.tf line 56, in resource "azurerm_kubernetes_cluster" "mycluster":
56: resource "azurerm_kubernetes_cluster" "mycluster" {
I think, however, that this is just because it takes a few minutes for the service principal to be created. When I run terraform apply again a few minutes later, it goes beyond that point without issues.
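A common workaround for this propagation delay is an explicit wait between creating the service principal and the cluster, for example with the hashicorp/time provider (a sketch; the 60s duration is a guess and may need tuning):

```hcl
# Give AAD time to propagate the newly created service principal
resource "time_sleep" "wait_for_sp" {
  depends_on      = [azuread_service_principal_password.tfapp]
  create_duration = "60s"
}

resource "azurerm_kubernetes_cluster" "mycluster" {
  # ... same configuration as above ...
  depends_on = [time_sleep.wait_for_sp]
}
```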
(I upvoted the answer above.)
Just adding, for anyone else that might need it, a simpler way where you don't need to create a service principal.
resource "azurerm_kubernetes_cluster" "kubweb" {
name = local.cluster_web
location = local.rgloc
resource_group_name = local.rgname
dns_prefix = "${local.cluster_web}-dns"
kubernetes_version = local.kubversion
# used to group all the internal objects of this cluster
node_resource_group = "${local.cluster_web}-rg-node"
# azure will assign the id automatically
identity {
type = "SystemAssigned"
}
default_node_pool {
name = "nodepool1"
node_count = 4
vm_size = local.vm_size
orchestrator_version = local.kubversion
}
role_based_access_control {
enabled = true
}
addon_profile {
kube_dashboard {
enabled = true
}
}
tags = {
environment = local.env
}
}
resource "azurerm_container_registry" "acr" {
name = "acr1"
resource_group_name = local.rgname
location = local.rgloc
sku = "Standard"
admin_enabled = true
tags = {
environment = local.env
}
}
# add the role to the identity the kubernetes cluster was assigned
resource "azurerm_role_assignment" "kubweb_to_acr" {
scope = azurerm_container_registry.acr.id
role_definition_name = "AcrPull"
principal_id = azurerm_kubernetes_cluster.kubweb.kubelet_identity[0].object_id
}
This code worked for me.
resource "azuread_application" "aks_sp" {
name = "sp-aks-${local.cluster_name}"
}
resource "azuread_service_principal" "aks_sp" {
application_id = azuread_application.aks_sp.application_id
app_role_assignment_required = false
}
resource "azuread_service_principal_password" "aks_sp" {
service_principal_id = azuread_service_principal.aks_sp.id
value = random_string.aks_sp_password.result
end_date_relative = "8760h" # 1 year
lifecycle {
ignore_changes = [
value,
end_date_relative
]
}
}
resource "azuread_application_password" "aks_sp" {
application_object_id = azuread_application.aks_sp.id
value = random_string.aks_sp_secret.result
end_date_relative = "8760h" # 1 year
lifecycle {
ignore_changes = [
value,
end_date_relative
]
}
}
data "azurerm_container_registry" "pyp" {
name = var.container_registry_name
resource_group_name = var.container_registry_resource_group_name
}
resource "azurerm_role_assignment" "aks_sp_container_registry" {
scope = data.azurerm_container_registry.pyp.id
role_definition_name = "AcrPull"
principal_id = azuread_service_principal.aks_sp.object_id
}
# requires Azure Provider 1.37+
resource "azurerm_kubernetes_cluster" "pyp" {
name = local.cluster_name
location = azurerm_resource_group.pyp.location
resource_group_name = azurerm_resource_group.pyp.name
dns_prefix = local.env_name_nosymbols
kubernetes_version = local.kubernetes_version
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2s_v3"
os_disk_size_gb = 80
}
windows_profile {
admin_username = "winadm"
admin_password = random_string.windows_profile_password.result
}
network_profile {
network_plugin = "azure"
dns_service_ip = cidrhost(local.service_cidr, 10)
docker_bridge_cidr = "172.17.0.1/16"
service_cidr = local.service_cidr
load_balancer_sku = "standard"
}
service_principal {
client_id = azuread_service_principal.aks_sp.application_id
client_secret = random_string.aks_sp_password.result
}
addon_profile {
oms_agent {
enabled = true
log_analytics_workspace_id = azurerm_log_analytics_workspace.pyp.id
}
}
tags = local.tags
}
source https://github.com/giuliov/pipeline-your-pipelines/tree/master/src/kubernetes/terraform
Just want to go into more depth, as this was something I struggled with as well.
The recommended approach is to use managed identities instead of a service principal, due to the lower overhead.
Create a Container Registry:
resource "azurerm_container_registry" "acr" {
name = "acr"
resource_group_name = azurerm_resource_group.rg.name
location = azurerm_resource_group.rg.location
sku = "Standard"
admin_enabled = false
}
Create an AKS cluster. The code below creates the AKS cluster with two identities:
A system-assigned identity, which is assigned to the control plane.
A user-assigned managed identity, which is also automatically created and assigned to the kubelet; notice there is no specific code for that, as it happens automatically.
The kubelet is the process that pulls images from the container registry, so we need to make sure this user-assigned managed identity has the AcrPull role on the container registry.
resource "azurerm_kubernetes_cluster" "aks" {
name = "aks"
resource_group_name = azurerm_resource_group.rg.name
location = azurerm_resource_group.rg.location
dns_prefix = "aks"
node_resource_group = "aks-node"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_Ds2_v2"
enable_auto_scaling = false
type = "VirtualMachineScaleSets"
vnet_subnet_id = azurerm_subnet.aks_subnet.id
max_pods = 50
}
network_profile {
network_plugin = "azure"
load_balancer_sku = "Standard"
}
identity {
type = "SystemAssigned"
}
}
Create the role assignment mentioned above, to allow the user-assigned managed identity to pull from the container registry.
resource "azurerm_role_assignment" "ra" {
principal_id = azurerm_kubernetes_cluster.aks.kubelet_identity[0].object_id
role_definition_name = "AcrPull"
scope = azurerm_container_registry.acr.id
skip_service_principal_aad_check = true
}
Hope that clears things up for you, as I have seen some confusion on the internet about the two identities created.
source: https://jimferrari.com/2022/02/09/attach-azure-container-registry-to-azure-kubernetes-service-terraform/
The Terraform documentation for the Azure Container Registry resource now keeps track of this, which should always be up to date.
https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/container_registry#example-usage-attaching-a-container-registry-to-a-kubernetes-cluster
resource "azurerm_resource_group" "example" {
name = "example-resources"
location = "West Europe"
}
resource "azurerm_container_registry" "example" {
name = "containerRegistry1"
resource_group_name = azurerm_resource_group.example.name
location = azurerm_resource_group.example.location
}
resource "azurerm_kubernetes_cluster" "example" {
name = "example-aks1"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
dns_prefix = "exampleaks1"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_D2_v2"
}
identity {
type = "SystemAssigned"
}
tags = {
Environment = "Production"
}
}
resource "azurerm_role_assignment" "example" {
principal_id = azurerm_kubernetes_cluster.example.kubelet_identity[0].object_id
role_definition_name = "AcrPull"
scope = azurerm_container_registry.example.id
skip_service_principal_aad_check = true
}
