Is there a way in terraform to create a replacement group of related resources before destroying the original group? - azure

I have a Terraform template (deployed via Octopus Deploy) that builds an Azure Virtual Desktop environment in Azure. On top of the virtual machines, I'm installing a number of extensions, culminating in a VM extension that registers each VM with the host pool.
I'd like to rebuild a VM whenever the script behind the custom script extension changes (extension #2, applied after the domain join). But in rebuilding the VM, I'd like to build out a new VM, complete with host pool registration, before any part of the existing VM is destroyed.
Please accept the cut down version below to understand what I am trying to do.
I expect the largest number of machine recreations to come from enhancements to the configuration scripts that set up the server on creation. Not all of the commands are expected to be idempotent, and we want the AVD VMs to be ephemeral. If an issue is encountered, the support team is expected to drain a server and destroy it once empty, getting a replacement from the next terraform apply. When the script itself is updated, though, we want to be able to replace all VMs quickly in an emergency, or at the very least minimize the nightly maintenance window.
Script Process: parameterized script > gets filled out as a template file > gets stored as an az blob > called by custom script extension > executed on the machine.
VM build process: VM is provisioned > currently 8 extensions get applied one at a time, starting with the domain join, then the custom script extension, followed by several Azure monitoring extensions, and finally the host pool registration extension.
I've been trying to use the create_before_destroy lifecycle argument, but I can't get it to spin up the new VM and apply all of its extensions before it begins removing the host pool registration from the existing VMs. I assume there's a way to do it using triggers, but I'm not sure how to do it in such a way that the pool always retains at least the current number of VMs.
It would also need to stop if it encounters an error on the new VM before destroying the existing VM (or better yet, be authorized to rebuild VMs if an extension fails partway through).
resource "random_pet" "avd_vm" {
prefix = var.client_name
length = 1
keepers = {
# Generate a new pet name each time we update the setup_host script
source_content = "${data.template_file.setup_host.rendered}"
}
}
data "template_file" "setup_host" {
template = file("${path.module}\\scripts\\setup-host.tpl")
vars = {
storageAccountName = azurerm_storage_account.storage.name
storageAccountKey = azurerm_storage_account.storage.primary_access_key
domain = var.domain
aad_group_name = var.aad_group_name
}
}
resource "azurerm_storage_blob" "setup_host" {
name = "setup-host.ps1"
storage_account_name = azurerm_storage_account.scripts.name
storage_container_name = time_sleep.container_rbac.triggers["name"]
type = "Block"
source_content = data.template_file.setup_host.rendered #"${path.module}\\scripts\\setup-host.ps1"
depends_on = [
azurerm_role_assignment.account1_write,
data.template_file.setup_host,
time_sleep.container_rbac
]
}
data "template_file" "client_r_drive_mapping" {
template = file("${path.module}\\scripts\\client_r_drive_mapping.tpl")
vars = {
storageAccountName = azurerm_storage_account.storage.name
storageAccountKey = azurerm_storage_account.storage.primary_access_key
}
}
resource "azurerm_windows_virtual_machine" "example" {
count = length(random_pet.avd_vm)
name = "${random_pet.avd_vm[count.index].id}"
...
lifecycle {
ignore_changes = [
boot_diagnostics,
identity
]
}
}
resource "azurerm_virtual_machine_extension" "first-domain_join_extension" {
count = var.rdsh_count
name = "${var.client_name}-avd-${random_pet.avd_vm[count.index].id}-domainJoin"
virtual_machine_id = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
publisher = "Microsoft.Compute"
type = "JsonADDomainExtension"
type_handler_version = "1.3"
auto_upgrade_minor_version = true
settings = <<SETTINGS
{
"Name": "${var.domain_name}",
"OUPath": "${var.ou_path}",
"User": "${var.domain_user_upn}#${var.domain_name}",
"Restart": "true",
"Options": "3"
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"Password": "${var.admin_password}"
}
PROTECTED_SETTINGS
lifecycle {
ignore_changes = [settings, protected_settings]
}
depends_on = [
azurerm_virtual_network_peering.out-primary,
azurerm_virtual_network_peering.in-primary,
azurerm_virtual_network_peering.in-secondary
]
}
# Multiple scripts are downloaded; follow-up scripts are called by referencing them as ./<scriptname> in the first script
# https://web.archive.org/web/20220127015539/https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-windows
# https://learn.microsoft.com/en-us/azure/virtual-machines/extensions/custom-script-windows#using-multiple-scripts
resource "azurerm_virtual_machine_extension" "second-custom_scripts" {
count = var.rdsh_count
name = "${random_pet.avd_vm[count.index].id}-setup-host"
virtual_machine_id = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
publisher = "Microsoft.Compute"
type = "CustomScriptExtension"
type_handler_version = "1.10"
auto_upgrade_minor_version = "true"
protected_settings = <<PROTECTED_SETTINGS
{
"storageAccountName": "${azurerm_storage_account.scripts.name}",
"storageAccountKey": "${azurerm_storage_account.scripts.primary_access_key}"
}
PROTECTED_SETTINGS
settings = <<SETTINGS
{
"fileUris": ["https://${azurerm_storage_account.scripts.name}.blob.core.windows.net/scripts/setup-host.ps1","https://${azurerm_storage_account.scripts.name}.blob.core.windows.net/scripts/client_r_drive_mapping.ps1"],
"commandToExecute": "powershell -ExecutionPolicy Unrestricted -file setup-host.ps1"
}
SETTINGS
depends_on = [
azurerm_virtual_machine_extension.first-domain_join_extension,
azurerm_storage_blob.setup_host
]
}
resource "azurerm_virtual_machine_extension" "last_host_extension_hp_registration" {
count = var.rdsh_count
name = "${var.client_name}-${random_pet.avd_vm[count.index].id}-avd_dsc"
virtual_machine_id = azurerm_windows_virtual_machine.avd_vm.*.id[count.index]
publisher = "Microsoft.Powershell"
type = "DSC"
type_handler_version = "2.73"
auto_upgrade_minor_version = true
settings = <<-SETTINGS
{
"modulesUrl": "https://wvdportalstorageblob.blob.core.windows.net/galleryartifacts/Configuration_3-10-2021.zip",
"configurationFunction": "Configuration.ps1\\AddSessionHost",
"properties": {
"HostPoolName":"${azurerm_virtual_desktop_host_pool.pooleddepthfirst.name}"
}
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"properties": {
"registrationInfoToken": "${azurerm_virtual_desktop_host_pool_registration_info.pooleddepthfirst.token}"
}
}
PROTECTED_SETTINGS
lifecycle {
ignore_changes = [settings, protected_settings]
}
depends_on = [
azurerm_virtual_machine_extension.second-custom_scripts
]
}
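For reference, the shape of the lifecycle wiring I've been attempting looks like the following - a minimal sketch rather than working code, relying on random_pet giving each generation of VMs unique names so the old and new machines can coexist during the swap:
resource "azurerm_windows_virtual_machine" "avd_vm" {
  count = var.rdsh_count
  # A fresh pet name per replacement means the new VM never collides
  # with the name of the VM it is replacing
  name  = random_pet.avd_vm[count.index].id
  ...
  lifecycle {
    create_before_destroy = true
  }
}
resource "azurerm_virtual_machine_extension" "last_host_extension_hp_registration" {
  count              = var.rdsh_count
  virtual_machine_id = azurerm_windows_virtual_machine.avd_vm[count.index].id
  ...
  lifecycle {
    # Terraform forces create_before_destroy onto everything this
    # resource depends on (the VM and the earlier extensions), but the
    # old extension is still scheduled for destruction as soon as its
    # own replacement exists - which is exactly the window I'm fighting
    create_before_destroy = true
  }
}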

Related

Use terraform to add a VM to the new Azure Monitoring (without OMS Agent)

When I configure Azure Monitoring using the OMS solution for VMs with this answer Enable Azure Monitor for existing Virtual machines using terraform, I notice that this feature is being deprecated and Azure prefers you move to the new monitoring solution (Not using the log analytics agent).
Azure allows me to configure VM monitoring using this GUI, but I would like to do it using terraform.
Is there a particular setup I have to use in terraform to achieve this? (I am using a Linux VM btw)
Yes, that is correct. The omsagent has been marked as legacy, and Azure now has a new monitoring agent called the "Azure Monitor agent". The solution below is for Linux; please check the official Terraform docs for Windows machines.
We need three things to build the equivalent of the UI setup in Terraform:
azurerm_log_analytics_workspace
azurerm_monitor_data_collection_rule
azurerm_monitor_data_collection_rule_association
Below is the example code:
data "azurerm_virtual_machine" "vm" {
name = var.vm_name
resource_group_name = var.az_resource_group_name
}
resource "azurerm_log_analytics_workspace" "workspace" {
name = "${var.project}-${var.env}-log-analytics"
location = var.az_location
resource_group_name = var.az_resource_group_name
sku = "PerGB2018"
retention_in_days = 30
}
resource "azurerm_virtual_machine_extension" "AzureMonitorLinuxAgent" {
name = "AzureMonitorLinuxAgent"
publisher = "Microsoft.Azure.Monitor"
type = "AzureMonitorLinuxAgent"
type_handler_version = 1.0
auto_upgrade_minor_version = "true"
virtual_machine_id = data.azurerm_virtual_machine.vm.id
}
resource "azurerm_monitor_data_collection_rule" "example" {
name = "example-rules"
resource_group_name = var.az_resource_group_name
location = var.az_location
destinations {
log_analytics {
workspace_resource_id = azurerm_log_analytics_workspace.workspace.id
name = "test-destination-log"
}
azure_monitor_metrics {
name = "test-destination-metrics"
}
}
data_flow {
streams = ["Microsoft-InsightsMetrics"]
destinations = ["test-destination-log"]
}
data_sources {
performance_counter {
streams = ["Microsoft-InsightsMetrics"]
sampling_frequency_in_seconds = 60
counter_specifiers = ["\\VmInsights\\DetailedMetrics"]
name = "VMInsightsPerfCounters"
}
}
}
# Associate the VM to the Data Collection Rule
resource "azurerm_monitor_data_collection_rule_association" "example1" {
  name                    = "example1-dcra"
  target_resource_id      = data.azurerm_virtual_machine.vm.id
  data_collection_rule_id = azurerm_monitor_data_collection_rule.example.id
  description             = "example"
}
Reference:
monitor_data_collection_rule
monitor_data_collection_rule_association

SentinelOne LinuxExtension - Azure

I am currently looking to deploy the SentinelOne agent via Terraform. There does not appear to be much documentation online for VM extension usage in terms of Terraform. Has anyone successfully deployed the S1 agent via Terraform extension? I am unclear on what to add to the settings/protected_settings blocks. Any help is appreciated.
"azurerm_virtual_machine_extension" "example" {
name = "hostname"
virtual_machine_id = azurerm_virtual_machine.example.id
publisher = "SentinelOne.LinuxExtension"
type = "LinuxExtension"
type_handler_version = "1.0"
To add to the settings/protected_settings blocks in Terraform:
resource "azurerm_virtual_machine_extension" "example" {
name = "hostname"
virtual_machine_id = azurerm_virtual_machine.example.id
publisher = "SentinelOne.LinuxExtension"
type = "LinuxExtension"
type_handler_version = "1.0"
settings = <<SETTINGS
{
"commandToExecute": "powershell.exe -Command \"${local.powershell_command}\""
}
SETTINGS
tags = {
environment = "Production"
}
depends_on = [
azurerm_virtual_machine.example
]
}
settings - The extension's settings are provided as a string-encoded JSON object.
protected_settings - In the same way, the protected settings passed to the extension are also a string-encoded JSON object; the difference is that they are not returned in plain text by the API.
Some VM extensions treat the keys in settings and protected_settings as case sensitive. Make sure they are consistent with what Azure expects (for example, the JsonADDomainExtension extension expects its keys in TitleCase).
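One way to keep the key casing and quoting straight is to build the payload with jsonencode() instead of a heredoc - a small sketch with illustrative variable names:
resource "azurerm_virtual_machine_extension" "domain_join" {
  ...
  # jsonencode() always emits valid JSON, and the TitleCase keys that
  # JsonADDomainExtension expects stay easy to see at a glance
  settings = jsonencode({
    Name    = var.domain_name
    OUPath  = var.ou_path
    User    = "${var.admin_username}@${var.domain_name}"
    Restart = "true"
    Options = "3"
  })
}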
Reference: azurerm_virtual_machine_extension
Installing the plugin manually and checking the JSON output gives the following settings block:
{
  "LinuxAgentVersion": "22.4.1.2",
  "SiteToken": "<your_site_token_here>"
}
Unfortunately, this leaves out the one critical field required for installation, since it's a protected setting: the field name for the "SentinelOne Console API token".
UPDATE:
Working extension example after finding the correct JSON key value:
resource "azurerm_virtual_machine_extension" "testserver-sentinelone-extension" {
name = "SentinelOneLinuxExtension"
virtual_machine_id = azurerm_linux_virtual_machine.testserver.id
publisher = "SentinelOne.LinuxExtension"
type = "LinuxExtension"
type_handler_version = "1.2"
automatic_upgrade_enabled = false
settings = <<SETTINGS
{
"LinuxAgentVersion": "22.4.1.2",
"SiteToken": "<your_site_token_here>"
}
SETTINGS
protected_settings = <<PROTECTEDSETTINGS
{
"SentinelOneConsoleAPIKey": "${var.sentinel_one_api_token}"
}
PROTECTEDSETTINGS
}
EDIT: Figured it out by once again manually installing the extension on another test system, and then digging into the waagent logs on that VM to see what value was being queried by the enable.sh script.
# cat /var/lib/waagent/SentinelOne.LinuxExtension.LinuxExtension-1.2.0/scripts/enable.sh | grep Console
api_token=$(echo "$protected_settings_decrypted" | jq -r ".SentinelOneConsoleAPIKey")

Terraform AzureRM domain-join fails due to account lockout

I'm creating Terraform configuration files to rapidly create and destroy demo environments for our prospective customers. These environments are pretty simple, containing a few VMs in a single vnet with a single subnet: some for management, some for apps, and one as an AVD session host.
I have seen this work perfectly well a handful of times, but most of the time it fails during the VM domain join. When I troubleshoot the issue, it's always because the account being used for the domain join is locked out. I have confirmed this by connecting to the VM via the bastion and manually attempting the domain join.
Here's my config to create the admin account used for domain join:
resource "azuread_group" "dc_admins" {
display_name = "AAD DC Administrators"
security_enabled = true
}
resource "azuread_user" "admin" {
user_principal_name = join("#", [var.admin_username, var.onmicrosoft_domain])
display_name = var.admin_username
password = var.admin_password
depends_on = [
azuread_group.dc_admins
]
}
resource "azuread_group_member" "admin" {
group_object_id = azuread_group.dc_admins.object_id
member_object_id = azuread_user.admin.object_id
depends_on = [
azuread_group.dc_admins,
azuread_user.admin
]
}
Here's my domain-join config:
resource "azurerm_virtual_machine_extension" "domain_join_mgmt_devices" {
name = "join-domain"
virtual_machine_id = azurerm_windows_virtual_machine.mgmtvm[count.index].id
publisher = "Microsoft.Compute"
type = "JsonADDomainExtension"
type_handler_version = "1.0"
depends_on = [
azurerm_windows_virtual_machine.mgmtvm,
azurerm_virtual_machine_extension.install_rsat_tools
]
count = "${var.vm_count}"
settings = <<SETTINGS
{
"Name": "${var.onmicrosoft_domain}",
"OUPath": "OU=AADDC Computers,DC=hidden,DC=onmicrosoft,DC=com",
"User": "${var.onmicrosoft_domain}\\${var.admin_username}",
"Restart": "true",
"Options": "3"
}
SETTINGS
protected_settings = <<PROTECTED_SETTINGS
{
"Password": "${var.admin_password}"
}
PROTECTED_SETTINGS
}
Here's the console output for the plan:
Terraform will perform the following actions:

  # azurerm_virtual_machine_extension.domain_join_mgmt_devices[0] will be created
  + resource "azurerm_virtual_machine_extension" "domain_join_mgmt_devices" {
      + id                   = (known after apply)
      + name                 = "join-domain"
      + protected_settings   = (sensitive value)
      + publisher            = "Microsoft.Compute"
      + settings             = jsonencode(
            {
              + Name    = "hidden.onmicrosoft.com"
              + OUPath  = "OU=AADDC Computers,DC=hidden,DC=onmicrosoft,DC=com"
              + Options = "3"
              + Restart = "true"
              + User    = "hidden.onmicrosoft.com\\admin_username"
            }
        )
      + type                 = "JsonADDomainExtension"
      + type_handler_version = "1.0"
      + virtual_machine_id   = "/subscriptions/hiddensubid/resourceGroups/demo-rg/providers/Microsoft.Compute/virtualMachines/mgmt-vm1"
    }

Plan: 1 to add, 0 to change, 0 to destroy.
I cannot for the life of me figure out what is locking the account. It's a fresh account, in a fresh subscription, created by the Terraform configuration prior to being used for the domain-join action.
Has anyone else seen anything like this?
Am I missing some knowledge about AAD, AD DS, and cloud-only managed domains?
I create the AAD DC administrative user before creating the AAD DS managed domain. Could this be an issue?
Should I wait X minutes before creating an admin account and using it for administrative actions?
Is it possible, using Terraform and AzureRM, to prevent an AAD account from being locked out?
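If the problem is simply that the new account hasn't finished syncing into the managed domain before I use it, one experiment would be an explicit delay between creating the account and running the domain join - a rough sketch using the hashicorp/time provider, where the 10-minute duration is a guess to tune, not a documented figure:
resource "time_sleep" "wait_for_aadds_sync" {
  # Give AAD DS time to sync the new user from Azure AD
  create_duration = "10m"
  depends_on      = [azuread_group_member.admin]
}
resource "azurerm_virtual_machine_extension" "domain_join_mgmt_devices" {
  ...
  depends_on = [time_sleep.wait_for_aadds_sync]
}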

Azure Virtual Machine Extension fileUris path with Terraform

I need to implement a VM extension using Terraform and Azure DevOps. I am trying to pass the fileUris value from .tfvars, or to create it dynamically from storage account details: ["https://${var.Storageaccountname}.blob.core.windows.net/${var.containername}/test.sh"]. Neither scenario is working.
resource "azurerm_virtual_machine_extension" "main" {
name = "${var.vm_name}"
location ="${azurerm_resource_group.resource_group.location}"
resource_group_name = "${azurerm_resource_group.resource_group.name}"
virtual_machine_name = "${azurerm_virtual_machine.vm.name}"
publisher = "Microsoft.Azure.Extensions"
type = "CustomScript"
type_handler_version = "2.0"
settings = <<SETTINGS
{
"fileUris" :"${var.fileUris}",
"commandToExecute": "sh <name of file> --ExecutionPolicy Unrestricted\""
}
SETTINGS
}
Any tips on fixing this issue? Maybe some other solution to achieve zero hardcoding in main.tf/variable.tf?
You could refer to this working sample that deploys the extension on a Linux VM. The script file is stored in a Storage account.
resource "azurerm_virtual_machine_extension" "test" {
name = "test-LinuxExtension"
virtual_machine_id = "/subscriptions/xxx/virtualMachines/www"
publisher = "Microsoft.Azure.Extensions"
type = "CustomScript"
type_handler_version = "2.1"
auto_upgrade_minor_version = true
protected_settings = <<PROTECTED_SETTINGS
{
"commandToExecute": "sh aptupdate.sh",
"storageAccountName": "xxxxx",
"storageAccountKey": "xxxxx",
"fileUris": [
"${var.fileUris}"
]
}
PROTECTED_SETTINGS
}
If we store the script in Azure Blob Storage, we need to provide the storage key so that the extension has permission to access the script. For more details, please refer to the Custom Script Extension documentation. Add the following to your configuration:
...
protected_settings = <<PROTECTED_SETTINGS
  {
    "storageAccountName": "mystorageaccountname",
    "storageAccountKey": "myStorageAccountKey"
  }
PROTECTED_SETTINGS
...
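To get closer to zero hardcoding, the fileUris array can also be derived from resources managed in the same configuration and serialized with jsonencode() - a sketch in which azurerm_storage_container.scripts and azurerm_storage_blob.script are assumed names, not resources from the question:
resource "azurerm_virtual_machine_extension" "main" {
  ...
  protected_settings = jsonencode({
    storageAccountName = azurerm_storage_account.scripts.name
    storageAccountKey  = azurerm_storage_account.scripts.primary_access_key
    # Build the URI from managed resources so nothing is hardcoded
    fileUris = [
      "https://${azurerm_storage_account.scripts.name}.blob.core.windows.net/${azurerm_storage_container.scripts.name}/${azurerm_storage_blob.script.name}"
    ]
    commandToExecute = "sh ${azurerm_storage_blob.script.name}"
  })
}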

Creating azure automation dsc configuration and dsc configuration node using terraform doesn't seem to be working

As a very first step of my release process I run the following terraform code
resource "azurerm_automation_account" "automation_account" {
for_each = data.terraform_remote_state.pod_bootstrap.outputs.ops_rg
name = "${local.automation_account_prefix}-${each.key}"
location = each.key
resource_group_name = each.value.name
sku_name = "Basic"
tags = {
environment = "development"
}
}
The automation accounts are created as expected and I can see them in the Azure portal.
I also have Terraform code that creates a couple of Windows VMs; each VM creation is accompanied by the following:
resource "azurerm_virtual_machine_extension" "dsc" {
name = "DevOpsDSC"
virtual_machine_id = var.vm_id
publisher = "Microsoft.Powershell"
type = "DSC"
type_handler_version = "2.83"
settings = <<SETTINGS_JSON
{
"configurationArguments": {
"RegistrationUrl": "${var.dsc_server_endpoint}",
"NodeConfigurationName": "${var.dsc_config}",
"ConfigurationMode": "${var.dsc_mode}",
"ConfigurationModeFrequencyMins": 15,
"RefreshFrequencyMins": 30,
"RebootNodeIfNeeded": false,
"ActionAfterReboot": "continueConfiguration",
"AllowModuleOverwrite": true
}
}
SETTINGS_JSON
protected_settings = <<PROTECTED_SETTINGS_JSON
{
"configurationArguments": {
"RegistrationKey": {
"UserName": "PLACEHOLDER_DONOTUSE",
"Password": "${var.dsc_primary_access_key}"
}
}
}
PROTECTED_SETTINGS_JSON
}
The result is the following: the VM extension is created for each VM, and the status says that provisioning succeeded.
For the next step I run the following terraform code
resource "azurerm_automation_dsc_configuration" "iswebserver" {
for_each = data.terraform_remote_state.pod_bootstrap.outputs.ops_rg
name = "iswebserver"
resource_group_name = each.value.name
automation_account_name = data.terraform_remote_state.ops.outputs.automation_account[each.key].name
location = each.key
content_embedded = "configuration iswebserver {}"
}
resource "azurerm_automation_dsc_nodeconfiguration" "iswebserver" {
for_each = data.terraform_remote_state.pod_bootstrap.outputs.ops_rg
name = "iswebserver.localhost"
resource_group_name = each.value.name
automation_account_name = data.terraform_remote_state.ops.outputs.automation_account[each.key].name
depends_on = [azurerm_automation_dsc_configuration.iswebserver]
content_embedded = file("${path.cwd}/iswebserver.mof")
}
The mof file content is the following
/*
@TargetNode='IsWebServer'
@GeneratedBy=P120bd0
@GenerationDate=02/25/2021 17:33:16
@GenerationHost=D-MJ05UA54
*/
instance of MSFT_RoleResource as $MSFT_RoleResource1ref
{
  ResourceID = "[WindowsFeature]IIS";
  IncludeAllSubFeature = True;
  Ensure = "Present";
  SourceInfo = "D:\\DSC\\testconfig.ps1::5::9::WindowsFeature";
  Name = "Web-Server";
  ModuleName = "PsDesiredStateConfiguration";
  ModuleVersion = "1.0";
  ConfigurationName = "TestConfig";
};
instance of OMI_ConfigurationDocument
{
  Version = "2.0.0";
  MinimumCompatibleVersion = "1.0.0";
  CompatibleVersionAdditionalProperties = {"Omi_BaseResource:ConfigurationName"};
  Author = "P120bd0";
  GenerationDate = "02/25/2021 17:33:16";
  GenerationHost = "D-MJ05UA54";
  Name = "TestConfig";
};
After running the code I got the following result:
The configuration is created as expected; clicking on the configuration entry in the UI grid leads to the following:
Meaning that the node configuration is created as well. My expectation was that for each VM I would see the node configured to run the configuration provided in the MOF file, but the Nodes UI shows empty Nodes.
So I tried to configure the node manually to connect all the pieces together,
and that fails with the following:
So I am totally confused. On the one hand, there's azurerm_virtual_machine_extension, which allows you to create the extension and bind it to the automation account. In addition, there are azurerm_automation_dsc_configuration and azurerm_automation_dsc_nodeconfiguration, which allow you to create the configuration and node configuration. But the bottom line is that you cannot connect all those dots to be able to create the node.
Just to confirm that the configuration is valid, I created an additional VM without using azurerm_virtual_machine_extension, and I was able to successfully add this VM to the created node configuration.
The problem was in the azurerm_virtual_machine_extension settings: the NodeConfigurationName value (var.dsc_config in the snippet above) needs to be the same as the name property of the azurerm_automation_dsc_nodeconfiguration resource.
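In other words, the two values have to line up exactly; referencing one from the other makes the link explicit - a trimmed sketch of just the relevant attributes:
resource "azurerm_automation_dsc_nodeconfiguration" "iswebserver" {
  name = "iswebserver.localhost"
  ...
}
resource "azurerm_virtual_machine_extension" "dsc" {
  ...
  settings = jsonencode({
    configurationArguments = {
      # Must match the node configuration's name exactly
      NodeConfigurationName = azurerm_automation_dsc_nodeconfiguration.iswebserver.name
    }
  })
}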
