Terraform AzureRM domain-join fails due to account lockout

I'm creating Terraform configuration files to rapidly create and destroy demo environments for our prospective customers. These environments are pretty simple, containing a few VMs in a single vnet with a single subnet, some for management, some for apps, and one as an AVD session host.
I have seen this work perfectly well a handful of times, but most of the time it fails during the VM domain-join. When I troubleshoot the issue it's always because the account being used for the domain-join is locked out. I have confirmed this by connecting to the VM via the bastion and manually attempting the domain-join.
Here's my config to create the admin account used for domain join:
resource "azuread_group" "dc_admins" {
display_name = "AAD DC Administrators"
security_enabled = true
}
resource "azuread_user" "admin" {
user_principal_name = join("#", [var.admin_username, var.onmicrosoft_domain])
display_name = var.admin_username
password = var.admin_password
depends_on = [
azuread_group.dc_admins
]
}
resource "azuread_group_member" "admin" {
group_object_id = azuread_group.dc_admins.object_id
member_object_id = azuread_user.admin.object_id
depends_on = [
azuread_group.dc_admins,
azuread_user.admin
]
}
Here's my domain-join config:
resource "azurerm_virtual_machine_extension" "domain_join_mgmt_devices" {
name = "join-domain"
virtual_machine_id = azurerm_windows_virtual_machine.mgmtvm[count.index].id
publisher = "Microsoft.Compute"
type = "JsonADDomainExtension"
type_handler_version = "1.0"
depends_on = [
azurerm_windows_virtual_machine.mgmtvm,
azurerm_virtual_machine_extension.install_rsat_tools
]
count = "${var.vm_count}"
settings = <<SETTINGS
{
"Name": "${var.onmicrosoft_domain}",
"OUPath": "OU=AADDC Computers,DC=hidden,DC=onmicrosoft,DC=com",
"User": "${var.onmicrosoft_domain}\\${var.admin_username}",
"Restart": "true",
"Options": "3"
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"Password": "${var.admin_password}"
}
PROTECTED_SETTINGS
}
Here's the console output for the plan:
Terraform will perform the following actions:

  # azurerm_virtual_machine_extension.domain_join_mgmt_devices[0] will be created
  + resource "azurerm_virtual_machine_extension" "domain_join_mgmt_devices" {
      + id                   = (known after apply)
      + name                 = "join-domain"
      + protected_settings   = (sensitive value)
      + publisher            = "Microsoft.Compute"
      + settings             = jsonencode(
            {
              + Name    = "hidden.onmicrosoft.com"
              + OUPath  = "OU=AADDC Computers,DC=hidden,DC=onmicrosoft,DC=com"
              + Options = "3"
              + Restart = "true"
              + User    = "hidden.onmicrosoft.com\\admin_username"
            }
        )
      + type                 = "JsonADDomainExtension"
      + type_handler_version = "1.0"
      + virtual_machine_id   = "/subscriptions/hiddensubid/resourceGroups/demo-rg/providers/Microsoft.Compute/virtualMachines/mgmt-vm1"
    }

Plan: 1 to add, 0 to change, 0 to destroy.
I cannot for the life of me figure out what is locking the account. It's a fresh account, in a fresh subscription, created by the Terraform configuration prior to being used for the domain-join action.
Has anyone else seen anything like this?
Am I missing some knowledge about AAD, AD DS, and cloud-only managed domains?
I create the AAD DC admin user before creating the AAD DS managed domain. Could this be an issue?
Should I wait X minutes before creating an admin account and using it for administrative actions?
Is it possible, using Terraform and AzureRM, to prevent an AAD account from being locked out?
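One thing I plan to try is inserting an explicit delay between creating the account and running the join, to rule out replication lag into the managed domain. A minimal sketch using the hashicorp/time provider (the 10m duration is a guess on my part, not something I've validated):

resource "time_sleep" "wait_for_replication" {
  # Give the new admin account time to replicate into the AAD DS managed domain
  create_duration = "10m"

  depends_on = [
    azuread_group_member.admin
  ]
}

The domain-join extension would then add depends_on = [time_sleep.wait_for_replication] so it only runs after the pause.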

Related

Is there a way in terraform to create a replacement group of related resources before destroying the original group?

I'm deploying an Azure Virtual Desktop environment to Azure from a VM template with Terraform (via Octopus Deploy). On top of the virtual machines, I'm installing a number of extensions, culminating in a VM extension to register the VM with the host pool.
I'd like to rebuild the VM each time the custom script extension is applied (extension #2, after the domain join). But in rebuilding the VM, I'd like to build out a new VM, complete with the host pool registration, before any part of the existing VM is destroyed.
Please accept the cut down version below to understand what I am trying to do.
I expect the largest number of machine recreations to come from enhancements to the configuration scripts that configure the server on creation. Not all of the commands are expected to be idempotent, and we want the AVD VMs to be ephemeral. If an issue is encountered, the support team is expected to be able to drain a server and destroy it once empty, getting a replacement via terraform apply. When the script gets updated, though, we want to be able to replace all VMs quickly in an emergency, or at the very least minimize the nightly maintenance window.
Script process: parameterized script > gets filled out as a template file > gets stored as an Azure blob > called by the custom script extension > executed on the machine.
VM build process: VM is provisioned > currently 8 extensions get applied one at a time, starting with the domain join, then the custom script extension, followed by several Azure monitoring extensions, and finally the host pool registration extension.
I've been trying to use the create_before_destroy lifecycle feature, but I can't get it to spin up the VM and apply all extensions before it begins removing the host pool registration from the existing VMs. I assume there's a way to do it using triggers, but I'm not sure how to do it in such a way that it always has at least the current number of VMs (see the sketch after the config below).
It would also need to be able to stop if it encounters an error on the new VM before destroying the existing VM (or better yet, be authorized to rebuild VMs if an extension fails part way through).
resource "random_pet" "avd_vm" {
prefix = var.client_name
length = 1
keepers = {
# Generate a new pet name each time we update the setup_host script
source_content = "${data.template_file.setup_host.rendered}"
}
}
data "template_file" "setup_host" {
template = file("${path.module}\\scripts\\setup-host.tpl")
vars = {
storageAccountName = azurerm_storage_account.storage.name
storageAccountKey = azurerm_storage_account.storage.primary_access_key
domain = var.domain
aad_group_name = var.aad_group_name
}
}
resource "azurerm_storage_blob" "setup_host" {
name = "setup-host.ps1"
storage_account_name = azurerm_storage_account.scripts.name
storage_container_name = time_sleep.container_rbac.triggers["name"]
type = "Block"
source_content = data.template_file.setup_host.rendered #"${path.module}\\scripts\\setup-host.ps1"
depends_on = [
azurerm_role_assignment.account1_write,
data.template_file.setup_host,
time_sleep.container_rbac
]
}
data "template_file" "client_r_drive_mapping" {
template = file("${path.module}\\scripts\\client_r_drive_mapping.tpl")
vars = {
storageAccountName = azurerm_storage_account.storage.name
storageAccountKey = azurerm_storage_account.storage.primary_access_key
}
}
resource "azurerm_windows_virtual_machine" "example" {
count = length(random_pet.avd_vm)
name = "${random_pet.avd_vm[count.index].id}"
...
lifecycle {
ignore_changes = [
boot_diagnostics,
identity
]
}
}
resource "azurerm_virtual_machine_extension" "first-domain_join_extension" {
count = var.rdsh_count
name = "${var.client_name}-avd-${random_pet.avd_vm[count.index].id}-domainJoin"
virtual_machine_id = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
publisher = "Microsoft.Compute"
type = "JsonADDomainExtension"
type_handler_version = "1.3"
auto_upgrade_minor_version = true
settings = <<SETTINGS
{
"Name": "${var.domain_name}",
"OUPath": "${var.ou_path}",
"User": "${var.domain_user_upn}#${var.domain_name}",
"Restart": "true",
"Options": "3"
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"Password": "${var.admin_password}"
}
PROTECTED_SETTINGS
lifecycle {
ignore_changes = [settings, protected_settings]
}
depends_on = [
azurerm_virtual_network_peering.out-primary,
azurerm_virtual_network_peering.in-primary,
azurerm_virtual_network_peering.in-secondary
]
}
# Multiple scripts called by ./<scriptname> when referencing them in follow-up scripts
# https://web.archive.org/web/20220127015539/https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-windows
# https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-windows#using-multiple-scripts
resource "azurerm_virtual_machine_extension" "second-custom_scripts" {
  count                      = var.rdsh_count
  name                       = "${random_pet.avd_vm[count.index].id}-setup-host"
  virtual_machine_id         = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
  publisher                  = "Microsoft.Compute"
  type                       = "CustomScriptExtension"
  type_handler_version       = "1.10"
  auto_upgrade_minor_version = true

  protected_settings = <<PROTECTED_SETTINGS
{
  "storageAccountName": "${azurerm_storage_account.scripts.name}",
  "storageAccountKey": "${azurerm_storage_account.scripts.primary_access_key}"
}
PROTECTED_SETTINGS

  settings = <<SETTINGS
{
  "fileUris": ["https://${azurerm_storage_account.scripts.name}.blob.core.windows.net/scripts/setup-host.ps1","https://${azurerm_storage_account.scripts.name}.blob.core.windows.net/scripts/client_r_drive_mapping.ps1"],
  "commandToExecute": "powershell -ExecutionPolicy Unrestricted -file setup-host.ps1"
}
SETTINGS

  depends_on = [
    azurerm_virtual_machine_extension.first-domain_join_extension,
    azurerm_storage_blob.setup_host
  ]
}
resource "azurerm_virtual_machine_extension" "last_host_extension_hp_registration" {
count = var.rdsh_count
name = "${var.client_name}-${random_pet.avd_vm[count.index].id}-avd_dsc"
virtual_machine_id = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
publisher = "Microsoft.Powershell"
type = "DSC"
type_handler_version = "2.73"
auto_upgrade_minor_version = true
settings = <<-SETTINGS
{
"modulesUrl": "https://wvdportalstorageblob.blob.core.windows.net/galleryartifacts/Configuration_3-10-2021.zip",
"configurationFunction": "Configuration.ps1\\AddSessionHost",
"properties": {
"HostPoolName":"${azurerm_virtual_desktop_host_pool.pooleddepthfirst.name}"
}
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"properties": {
"registrationInfoToken": "${azurerm_virtual_desktop_host_pool_registration_info.pooleddepthfirst.token}"
}
}
PROTECTED_SETTINGS
lifecycle {
ignore_changes = [settings, protected_settings]
}
depends_on = [
azurerm_virtual_machine_extension.second-custom_scripts
]
}
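For reference, the direction I've been experimenting with is tying the VM name to the script content via the random_pet keepers and enabling create_before_destroy on the VM, so a script change forces a replacement that is built before the old instance is torn down. A cut-down, untested sketch (Terraform also forces the same lifecycle on the extensions, since they depend on the VM):

resource "azurerm_windows_virtual_machine" "avd_vm" {
  count = var.rdsh_count
  # A new pet name (driven by the script hash in keepers) forces a new VM
  name  = random_pet.avd_vm[count.index].id
  ...

  lifecycle {
    create_before_destroy = true
    ignore_changes        = [boot_diagnostics, identity]
  }
}

What this sketch does not solve is halting before destroy when an extension fails on the new VM, which is the part I'm still unsure about.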

Terraform: Error when creating azure kubernetes service with local_account_disabled=true

An error occurs when I try to create an AKS cluster with Terraform. The AKS cluster is created, but the error still appears at the end, which is ugly.
│ Error: retrieving Access Profile for Cluster: (Managed Cluster Name "aks-1" / Resource Group "pengine-aks-rg"):
│ containerservice.ManagedClustersClient#GetAccessProfile: Failure responding to request:
│ StatusCode=400 -- Original Error: autorest/azure: Service returned an error. Status=400
│ Code="BadRequest" Message="Getting static credential is not allowed because this cluster
│ is set to disable local accounts."
This is my terraform code:
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "=2.96.0"
    }
  }
}

resource "azurerm_resource_group" "aks-rg" {
  name     = "aks-rg"
  location = "West Europe"
}

resource "azurerm_kubernetes_cluster" "aks-1" {
  name                   = "aks-1"
  location               = azurerm_resource_group.aks-rg.location
  resource_group_name    = azurerm_resource_group.aks-rg.name
  dns_prefix             = "aks1"
  local_account_disabled = true

  default_node_pool {
    name       = "nodepool1"
    node_count = 3
    vm_size    = "Standard_D2_v2"
  }

  identity {
    type = "SystemAssigned"
  }

  tags = {
    Environment = "Test"
  }
}
Is this a Terraform bug? Can I avoid the error?
If you disable local accounts, you need to activate the AKS-managed Azure Active Directory integration, as you no longer have local accounts to authenticate against AKS.
This example enables RBAC, Azure AD, and Azure RBAC:
resource "azurerm_kubernetes_cluster" "aks-1" {
...
role_based_access_control {
enabled = true
azure_active_directory {
managed = true
tenant_id = data.azurerm_client_config.current.tenant_id
admin_group_object_ids = ["OBJECT_IDS_OF_ADMIN_GROUPS"]
azure_rbac_enabled = true
}
}
}
If you don't want AAD integration, you need to set local_account_disabled = false.
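Note for later provider versions: on azurerm 3.x the role_based_access_control block was removed, and (as far as I can tell from the 3.x docs) the equivalent configuration is roughly:

resource "azurerm_kubernetes_cluster" "aks-1" {
  ...
  local_account_disabled = true

  # azurerm 3.x replacement for the role_based_access_control block
  azure_active_directory_role_based_access_control {
    managed                = true
    tenant_id              = data.azurerm_client_config.current.tenant_id
    admin_group_object_ids = ["OBJECT_IDS_OF_ADMIN_GROUPS"]
    azure_rbac_enabled     = true
  }
}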

Configure Desktop + devices in Azure AD using Terraform

I was not able to find how to create the (Desktop + devices) platform with Terraform's azuread_application resource. I have also attached a screenshot from the Azure portal.
**Mobile & desktop platform screenshot**
Terraform code:
# AAD AKS kubectl app
resource "azuread_application" "aks-aad-client" {
  display_name     = local.app_name
  sign_in_audience = "AzureADMultipleOrgs"

  web {
    redirect_uris = ["https://login.microsoftonline.com/common/oauth2/nativeclient"]

    implicit_grant {
      access_token_issuance_enabled = true
    }
  }

  required_resource_access {
    resource_app_id = "xxxxxxxxxxxxxxxxxxxxxxxxxxxx"

    resource_access {
      id   = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      type = "Scope"
    }
  }

  app_role {
    allowed_member_types = ["User"]
    description          = "Admins can manage roles and perform all task actions"
    display_name         = "user"
    enabled              = true
    value                = "Task"
  }
}
There are multiple options available, but how can I configure the (Desktop + devices) platform?
If it's possible, please provide the az command or Terraform code for (Desktop + devices) creation.
Regards,
Nataraj.R
You can create a Web App using the sample script below from GitHub.
GitHub: https://github.com/kumarvna/terraform-azurerm-app-service
In the default app settings block you can add another setting:
MobileAppsManagement_EXTENSION_VERSION = "Latest"
as per the Microsoft documentation for creating a mobile app.
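For example, dropped into an azurerm_app_service resource (the resource name here is hypothetical), the setting would sit in app_settings:

resource "azurerm_app_service" "example" {
  # ...other app service arguments...

  app_settings = {
    # Enables the mobile apps extension, per the Microsoft mobile app docs
    "MobileAppsManagement_EXTENSION_VERSION" = "Latest"
  }
}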
Update:
I could add the platform using Terraform by adding a reply_urls parameter in the azuread_application resource block, and it got added.
provider "azuread" {
}
# Create an application
resource "azuread_application" "example" {
name ="Your azuread app display name"
reply_urls = ["https://login.microsoftonline.com/common/oauth2/nativeclient","https://login.live.com/oauth20_desktop.srf"]
}
After it's added, the portal shows it as Web, but it has the same configuration as the mobile and desktop application URIs.
If I use the type = "native" parameter it works as expected, but the type parameter is going to be removed soon.
https://registry.terraform.io/providers/hashicorp/azuread/latest/docs/resources/application
Note: type defaults to webapp/api.
# AAD client app
resource "azuread_application" "aks-aad-client" {
  display_name            = local.app_name
  prevent_duplicate_names = true
  sign_in_audience        = "AzureADMultipleOrgs"
  type                    = "native"

  web {
    redirect_uris = ["https://login.microsoftonline.com/common/oauth2/nativeclient"]

    implicit_grant {
      access_token_issuance_enabled = true
    }
  }

  required_resource_access {
    resource_app_id = "xxxxxxxxxxxxxxxxxxxx"

    resource_access {
      id   = "xxxxxxxxxxxxxxx"
      type = "Scope"
    }
  }

  app_role {
    allowed_member_types = ["User"]
    description          = "Admins can manage roles and perform all task actions"
    display_name         = "user"
    enabled              = true
    value                = "Task"
  }
}
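On newer azuread provider releases (2.x), where the type argument is gone, the desktop-style redirect URIs appear to belong in a public_client block instead of web; an untested sketch:

# Assumption: on azuread 2.x the public_client block replaces type = "native"
resource "azuread_application" "aks-aad-client" {
  display_name     = local.app_name
  sign_in_audience = "AzureADMultipleOrgs"

  # Shows up as the "Mobile and desktop applications" platform in the portal
  public_client {
    redirect_uris = [
      "https://login.microsoftonline.com/common/oauth2/nativeclient",
      "https://login.live.com/oauth20_desktop.srf",
    ]
  }
}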

Terraform-Azure: Grant Access to azure resource for group

Experts,
I have a situation where I have to grant access on multiple Azure resources to a particular group, and I have to do this using Terraform only.
Example:
Azure Group Name: India-group (5-6 users are in this group)
Azure Subscription Name: India
Azure Resource SQL Database: SQL-db-1
Azure Resource Key Vault: India-key-vlt-1
Azure Resource Storage Account: India-acnt-1
and many more, like PostgreSQL, storage account, blob, ...
I think you do not need to care about how the group can access the resources; what you need to care about is how to access the resources when necessary.
Generally, we use a service principal that is assigned roles containing the appropriate permissions to access the resources. You can take a look at What is role-based access control (RBAC) for Azure resources and Create a service principal via CLI.
In Terraform, I assume you want to get the secrets from the Key Vault. Here is an example:
provider "azurerm" {
features {}
}
resource "azuread_application" "example" {
name = "example"
homepage = "http://homepage"
identifier_uris = ["http://uri"]
reply_urls = ["http://replyurl"]
available_to_other_tenants = false
oauth2_allow_implicit_flow = true
}
resource "azuread_service_principal" "example" {
application_id = azuread_application.example.application_id
app_role_assignment_required = false
tags = ["example", "tags", "here"]
}
resource "azurerm_resource_group" "example" {
name = "resourceGroup1"
location = "West US"
}
resource "azurerm_key_vault" "example" {
name = "testvault"
location = azurerm_resource_group.example.location
resource_group_name = azurerm_resource_group.example.name
enabled_for_disk_encryption = true
tenant_id = var.tenant_id
soft_delete_enabled = true
purge_protection_enabled = false
sku_name = "standard"
access_policy {
tenant_id = var.tenant_id
object_id = azuread_service_principal.example.object_id
key_permissions = [
"get",
]
secret_permissions = [
"get",
]
storage_permissions = [
"get",
]
}
network_acls {
default_action = "Deny"
bypass = "AzureServices"
}
tags = {
environment = "Testing"
}
}
Then you can access the key vault to get the secrets or keys through the service principal. You can also take a look at the example that controls Key Vault via Python.
For other resources, you need to learn about the resource itself first; then you will know how to access it in a suitable way. Finally, you can use Terraform to achieve it.
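That said, if the goal is literally to grant India-group a role on each resource, the direct route is an azurerm_role_assignment per resource, with the group's object ID as the principal. A rough sketch (the SQL database resource address and the role name are examples, not prescriptions):

# Look up the existing AAD group by its display name
data "azuread_group" "india" {
  display_name = "India-group"
}

# Grant the group a role scoped to one resource, e.g. the SQL database
resource "azurerm_role_assignment" "sql_db" {
  scope                = azurerm_mssql_database.sql_db_1.id # hypothetical resource address
  role_definition_name = "Contributor"                      # example role
  principal_id         = data.azuread_group.india.object_id
}

The same pattern repeats for the Key Vault, storage account, and so on, or once at the subscription scope if the group should see everything in the India subscription.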

How to configure an Azure app service to pull images from an ACR with terraform?

I have the following Terraform module to set up app services under the same plan:
provider "azurerm" {
}
variable "env" {
type = string
description = "The SDLC environment (qa, dev, prod, etc...)"
}
variable "appsvc_names" {
type = list(string)
description = "The names of the app services to create under the same app service plan"
}
locals {
location = "eastus2"
resource_group_name = "app505-dfpg-${var.env}-web-${local.location}"
acr_name = "app505dfpgnedeploycr88836"
}
resource "azurerm_app_service_plan" "asp" {
name = "${local.resource_group_name}-asp"
location = local.location
resource_group_name = local.resource_group_name
kind = "Linux"
reserved = true
sku {
tier = "Basic"
size = "B1"
}
}
resource "azurerm_app_service" "appsvc" {
for_each = toset(var.appsvc_names)
name = "${local.resource_group_name}-${each.value}-appsvc"
location = local.location
resource_group_name = local.resource_group_name
app_service_plan_id = azurerm_app_service_plan.asp.id
site_config {
linux_fx_version = "DOCKER|${local.acr_name}/${each.value}:latest"
}
app_settings = {
DOCKER_REGISTRY_SERVER_URL = "https://${local.acr_name}.azurecr.io"
}
}
output "hostnames" {
value = {
for appsvc in azurerm_app_service.appsvc: appsvc.name => appsvc.default_site_hostname
}
}
I am invoking it through the following configuration:
terraform {
  backend "azurerm" {
  }
}

locals {
  appsvc_names = ["gateway"]
}

module "web" {
  source       = "../../modules/web"
  env          = "qa"
  appsvc_names = local.appsvc_names
}

output "hostnames" {
  description = "The hostnames of the created app services"
  value       = module.web.hostnames
}
The container registry has the images I need:
C:\> az acr login --name app505dfpgnedeploycr88836
Login Succeeded
C:\> az acr repository list --name app505dfpgnedeploycr88836
[
"gateway"
]
C:\> az acr repository show-tags --name app505dfpgnedeploycr88836 --repository gateway
[
"latest"
]
C:\>
When I apply the Terraform configuration everything is created fine, but inspecting the created app service resource in the Azure portal reveals that its Container Settings show no Docker image.
Now, I can manually switch to another ACR and then back to the one I want, only to get this:
Cannot perform credential operations for /subscriptions/0f1c414a-a389-47df-aab8-a351876ecd47/resourceGroups/app505-dfpg-ne-deploy-eastus2/providers/Microsoft.ContainerRegistry/registries/app505dfpgnedeploycr88836 as admin user is disabled. Kindly enable admin user as per docs: https://learn.microsoft.com/en-us/azure/container-registry/container-registry-authentication#admin-account
This is confusing me. According to https://learn.microsoft.com/en-us/azure/container-registry/container-registry-authentication#admin-account the admin user should not be used, and so my ACR does not have one. On the other hand, I understand that I need to somehow configure the app service to authenticate with the ACR.
What is the right way to do it then?
So this is now possible since v2.71 of the AzureRM provider. A couple of things have to happen...
Assign a managed identity to the application (you can also use a user-assigned identity, but that's a bit more work)
Set the site_config.acr_use_managed_identity_credentials property to true
Grant the application's identity AcrPull rights on the container registry
Below is a modified version of the code above; not tested, but it should be OK.
provider "azurerm" {
}
variable "env" {
type = string
description = "The SDLC environment (qa, dev, prod, etc...)"
}
variable "appsvc_names" {
type = list(string)
description = "The names of the app services to create under the same app service plan"
}
locals {
location = "eastus2"
resource_group_name = "app505-dfpg-${var.env}-web-${local.location}"
acr_name = "app505dfpgnedeploycr88836"
}
resource "azurerm_app_service_plan" "asp" {
name = "${local.resource_group_name}-asp"
location = local.location
resource_group_name = local.resource_group_name
kind = "Linux"
reserved = true
sku {
tier = "Basic"
size = "B1"
}
}
resource "azurerm_app_service" "appsvc" {
for_each = toset(var.appsvc_names)
name = "${local.resource_group_name}-${each.value}-appsvc"
location = local.location
resource_group_name = local.resource_group_name
app_service_plan_id = azurerm_app_service_plan.asp.id
site_config {
linux_fx_version = "DOCKER|${local.acr_name}/${each.value}:latest"
acr_use_managed_identity_credentials = true
}
app_settings = {
DOCKER_REGISTRY_SERVER_URL = "https://${local.acr_name}.azurecr.io"
}
identity {
type = "SystemAssigned"
}
}
data "azurerm_container_registry" "this" {
name = local.acr_name
resource_group_name = local.resource_group_name
}
resource "azurerm_role_assignment" "acr" {
for_each = azurerm_app_service.appsvc
role_definition_name = "AcrPull"
scope = azurerm_container_registry.this.id
principal_id = each.value.identity[0].principal_id
}
output "hostnames" {
value = {
for appsvc in azurerm_app_service.appsvc: appsvc.name => appsvc.default_site_hostname
}
}
EDITED 21 Dec 2021
The MS documentation issue regarding the value being reset by Azure has now been resolved and you can also control Managed Identity via the portal.
You can also use service principal auth with App Service, but you'd have to create a service principal, grant it AcrPull permissions over the registry, and use the service principal's login/password in the App Service app settings:
DOCKER_REGISTRY_SERVER_USERNAME
DOCKER_REGISTRY_SERVER_PASSWORD
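That would look something along these lines (the service principal resources are omitted, and the acr_pull names are placeholders of mine, not a prescribed convention):

resource "azurerm_app_service" "appsvc" {
  # ...

  app_settings = {
    DOCKER_REGISTRY_SERVER_URL      = "https://${local.acr_name}.azurecr.io"
    # The ACR "username" is the service principal's application (client) ID
    DOCKER_REGISTRY_SERVER_USERNAME = azuread_application.acr_pull.application_id
    # ...and the "password" is a client secret created for it
    DOCKER_REGISTRY_SERVER_PASSWORD = azuread_service_principal_password.acr_pull.value
  }
}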