How can additional disks be added to AKS nodes via an Azure template?

When launching an AKS cluster, my nodes each have a main disk at /dev/sdb and a smaller temporary disk at /dev/sda. How can I attach an additional unformatted disk, showing up as /dev/sdc, to each AKS node in my template? My current template is below:
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"resourceGroupName": {
"type": "string",
"metadata": {
"description": "The resource group name."
}
},
"subscriptionId": {
"type": "string",
"metadata": {
"description": "The subscription id."
}
},
"region": {
"type": "string",
"metadata": {
"description": "The region of AKS resource."
}
},
"gbPerNode": {
"type": "int",
"defaultValue": 20,
"metadata": {
"description": "Disk size (in GB) to provision for each of the agent pool nodes. This value ranges from 0 to 1023. Specifying 0 will apply the default disk size for that agentVMSize."
},
"minValue": 1,
"maxValue": 1023
},
"numNodes": {
"type": "int",
"defaultValue": 3,
"metadata": {
"description": "The number of agent nodes for the cluster."
},
"minValue": 1,
"maxValue": 50
},
"machineType": {
"type": "string",
"defaultValue": "Standard_D2_v2",
"metadata": {
"description": "The size of the Virtual Machine."
}
},
"servicePrincipalClientId": {
"metadata": {
"description": "Client ID (used by cloudprovider)"
},
"type": "securestring"
},
"servicePrincipalClientSecret": {
"metadata": {
"description": "The Service Principal Client Secret."
},
"type": "securestring"
},
"osType": {
"type": "string",
"defaultValue": "Linux",
"allowedValues": [
"Linux"
],
"metadata": {
"description": "The type of operating system."
}
},
"kubernetesVersion": {
"type": "string",
"defaultValue": "1.11.4",
"metadata": {
"description": "The version of Kubernetes."
}
},
"maxPods": {
"type": "int",
"defaultValue": 30,
"metadata": {
"description": "Maximum number of pods that can run on a node."
}
}
},
"variables": {
"deploymentEventTopic": "deploymenteventtopic",
"resourceGroupName": "[parameters('resourceGroupName')]",
"omswsName": "[concat('omsws-', parameters('resourceGroupName'))]",
"clustername": "cluster"
},
"resources": [
{
"apiVersion": "2018-03-31",
"type": "Microsoft.ContainerService/managedClusters",
"location": "[parameters('region')]",
"name": "[variables('clustername')]",
"properties": {
"kubernetesVersion": "[parameters('kubernetesVersion')]",
"enableRBAC": true,
"dnsPrefix": "clust",
"addonProfiles": {
"httpApplicationRouting": {
"enabled": true
},
"omsagent": {
"enabled": false
}
},
"agentPoolProfiles": [
{
"name": "agentpool",
"osDiskSizeGB": "[parameters('gbPerNode')]",
"count": "[parameters('numNodes')]",
"vmSize": "[parameters('machineType')]",
"osType": "[parameters('osType')]",
"storageProfile": "ManagedDisks"
}
],
"servicePrincipalProfile": {
"ClientId": "[parameters('servicePrincipalClientId')]",
"Secret": "[parameters('servicePrincipalClientSecret')]"
},
"networkProfile": {
"networkPlugin": "kubenet"
}
}
}
]
}

Unfortunately, it seems you cannot add disks to AKS nodes through the template. If you look at the properties the AKS template exposes, none of them does that.
If you really want to add disks to the nodes, you can attach them manually to the VMs in the AKS cluster. See attach a data disk to a Linux VM. The cluster nodes are ordinary Azure VMs, so you can treat them the way you would any other Azure VM.
In my opinion, though, if you want more disk space it's better to choose a bigger node size when you create the AKS cluster. See the osDiskSizeGB and vmSize properties in the template. You can also add persistent volumes to your Pods as needed. See Manually create and use a volume with Azure disks in Azure Kubernetes Service (AKS). I think using disks this way is more flexible and efficient.
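As a rough illustration of the resizing suggestion, the agent pool entry from the question's template could be given a larger OS disk and VM size. The values 128 and Standard_D4_v2 are illustrative assumptions, not recommendations; in the original template they come from the gbPerNode and machineType parameters.

```json
{
  "name": "agentpool",
  "osDiskSizeGB": 128,
  "count": 3,
  "vmSize": "Standard_D4_v2",
  "osType": "Linux",
  "storageProfile": "ManagedDisks"
}
```

This only enlarges the existing OS disk; it does not produce a separate unformatted device on the node.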

This is what the template is supposed to look like for nodes with more disks:
{
"name": "nodepool1",
"count": 3,
"vmSize": "Standard_B2ms",
"osType": "Linux",
"osDiskSizeGB": 64,
"diskSizesGB": [
10,
10,
10,
10
]
}
Unfortunately, although this is a valid resource definition for AKS, it doesn't work yet. At least once it starts working, you can just use this snippet ;)

Related

HDInsight azure adls gen2 'InternalServerError' ARM Template deployment

I am creating an Azure HDInsight Spark cluster with ADLS Gen2 and a user-assigned managed identity that has the Storage Blob Data Owner role.
The MSI role was successfully assigned on the storage account, but the HDInsight deployment fails with an internal server error.
I think there is an issue around the HDInsight cluster resource (the storage profile) in the template. I could use some help here.
{
"$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"clusterType": {
"type": "string",
"allowedValues": [ "hadoop", "hbase", "storm", "spark" ],
"metadata": {
"description": "The type of the HDInsight cluster to create."
}
},
"clusterName": {
"type": "string",
"metadata": {
"description": "The name of the HDInsight cluster to create."
}
},
"clusterLoginUserName": {
"type": "string",
"metadata": {
"description": "These credentials can be used to submit jobs to the cluster and to log into cluster dashboards."
}
},
"clusterLoginPassword": {
"type": "securestring",
"minLength": 10,
"metadata": {
"description": "The clusterloginpassword must be at least 10 characters in length and must contain at least one digit, one upper case letter, one lower case letter, and one non-alphanumeric character except (single-quote, double-quote, backslash, right-bracket, full-stop). Also, the password must not contain 3 consecutive characters from the cluster username or SSH username."
}
},
"sshUserName": {
"type": "string",
"metadata": {
"description": "These credentials can be used to remotely access the cluster and should not be same as clusterLoginUserName."
}
},
"sshPassword": {
"type": "securestring",
"minLength": 6,
"maxLength": 72,
"metadata": {
"description": "SSH password must be 6-72 characters long and must contain at least one digit, one upper case letter, and one lower case letter. It must not contain any 3 consecutive characters from the cluster login name"
}
},
"location": {
"type": "string",
"defaultValue": "[resourceGroup().location]",
"metadata": {
"description": "Location for all resources."
}
},
"HeadNodeVirtualMachineSize": {
"type": "string",
"defaultValue": "Standard_D12_v2",
"allowedValues": [
"Standard_A4_v2",
"Standard_A8_v2",
"Standard_D3_v2",
"Standard_D4_v2",
"Standard_D5_v2",
"Standard_D12_v2",
"Standard_D13_v2"
],
"metadata": {
"description": "This is the headnode Azure Virtual Machine size, and will affect the cost. If you don't know, just leave the default value."
}
},
"WorkerNodeVirtualMachineSize": {
"type": "string",
"defaultValue": "Standard_D13_v2",
"allowedValues": [
"Standard_A4_v2",
"Standard_A8_v2",
"Standard_D1_v2",
"Standard_D2_v2",
"Standard_D3_v2",
"Standard_D4_v2",
"Standard_D5_v2",
"Standard_D12_v2",
"Standard_D13_v2"
],
"metadata": {
"description": "This is the workernode Azure Virtual Machine size, and will affect the cost. If you don't know, just leave the default value."
}
},
"clusterHeadNodeCount": {
"type": "int",
"defaultValue": 2,
"metadata": {
"description": "Number of head nodes"
}
},
"clusterWorkerNodeCount": {
"type": "int",
"defaultValue": 4,
"metadata": {
"description": "Number of worker nodes"
}
},
"StorageAccountName": {
"type": "string",
"metadata": {
"description": "Name of the Storage Account"
}
},
"StorageAccountType": {
"type": "string",
"defaultValue": "Standard_LRS",
"allowedValues": [
"Standard_LRS",
"Standard_GRS",
"Standard_ZRS",
"Standard_RA-GRS"
],
"metadata": {
"description": "Type of the Storage Account"
}
},
"filesystemname": {
"type": "string",
"metadata": {
"description": "Name of the container"
}
},
"UserAssignedIdentityName": {
"type": "string",
"metadata": {
"description": "Name of the User Assigned Identity"
}
}
},
"variables": {
"managedIdentityId": "[concat('/subscriptions/', subscription().subscriptionId, '/resourceGroups/',resourceGroup().name, '/providers/Microsoft.ManagedIdentity/userAssignedIdentities/', parameters('UserAssignedIdentityName'))]",
"StorageApiVersion": "2019-06-01",
"msiApiVersion": "2018-11-30",
"HDInsightApiVersion": "2015-03-01-preview",
"StorageBlobDataOwner": "[concat('/subscriptions/', subscription().subscriptionId, '/providers/Microsoft.Authorization/roleDefinitions/', 'b7e6dc6d-f1e8-4753-8033-0f276bb0955b')]",
"StorageBlobDataContributor": "[concat('/subscriptions/', subscription().subscriptionId, '/providers/Microsoft.Authorization/roleDefinitions/', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')]"
},
"resources": [
{
"name": "[parameters('UserAssignedIdentityName')]",
"type": "Microsoft.ManagedIdentity/userAssignedIdentities",
"apiVersion": "[variables('msiApiVersion')]",
"location": "[resourceGroup().location]"
},
{
"type": "Microsoft.Storage/storageAccounts",
"apiVersion": "[variables('StorageApiVersion')]",
"name": "[parameters('StorageAccountName')]",
"location": "[parameters('location')]",
"sku": {
"name": "[parameters('StorageAccountType')]"
},
"kind": "StorageV2",
"properties": {
"encryption": {
"keySource": "Microsoft.Storage",
"services": {
"blob": {
"enabled": true
},
"file": {
"enabled": true
}
}
},
"isHnsEnabled": true,
"supportsHttpsTrafficOnly": true
}
},
{
"type": "Microsoft.Storage/storageAccounts/providers/roleAssignments",
"apiVersion": "2018-01-01-preview",
"name": "[concat(parameters('StorageAccountName'),'/Microsoft.Authorization/',guid(subscription().subscriptionId))]",
"dependsOn": [
"[resourceId('Microsoft.Storage/storageAccounts',parameters('StorageAccountName'))]",
"[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities',parameters('UserAssignedIdentityName'))]"
],
"properties": {
"roleDefinitionId": "[variables('StorageBlobDataOwner')]",
"principalId": "[reference(variables('managedIdentityId'),variables('msiApiVersion')).principalId]"
}
},
{
"apiVersion": "[variables('HDInsightApiVersion')]",
"name": "[parameters('clusterName')]",
"type": "Microsoft.HDInsight/clusters",
"location": "[parameters('location')]",
"dependsOn": [
"[resourceId('Microsoft.Storage/storageAccounts',parameters('StorageAccountName'))]",
"[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities',parameters('UserAssignedIdentityName'))]"
],
"properties": {
"clusterVersion": "4.0",
"osType": "Linux",
"tier": "standard",
"clusterDefinition": {
"kind": "[parameters('clusterType')]",
"componentVersion": {
"Spark": "2.3"
},
"configurations": {
"gateway": {
"restAuthCredential.isEnabled": true,
"restAuthCredential.username": "[parameters('clusterLoginUserName')]",
"restAuthCredential.password": "[parameters('clusterLoginPassword')]"
}
}
},
"identity": {
"type": "UserAssigned",
"userAssignedIdentities": {
"[variables('managedIdentityId')]": {}
}
},
"storageProfile": {
"storageaccounts": [
{
"name": "[concat(parameters('StorageAccountName'),'.blob.core.windows.net')]",
"isDefault": true,
"fileSystem": "[parameters('filesystemname')]",
"resourceId": "[reference(resourceId('Microsoft.Storage/storageAccounts',parameters('StorageAccountName')),variables('StorageApiVersion'))]",
"msiResourceId": "[reference(resourceId('Microsoft.ManagedIdentity/userAssignedIdentities',parameters('UserAssignedIdentityName')),variables('msiApiVersion'))]"
}
]
},
"computeProfile": {
"roles": [
{
"name": "headnode",
"minInstanceCount": 1,
"targetInstanceCount": "[parameters('clusterHeadNodeCount')]",
"hardwareProfile": {
"vmSize": "[parameters('HeadNodeVirtualMachineSize')]"
},
"osProfile": {
"linuxOperatingSystemProfile": {
"username": "[parameters('sshUserName')]",
"password": "[parameters('sshPassword')]"
}
},
"virtualNetworkProfile": null,
"scriptActions": []
},
{
"name": "workernode",
"targetInstanceCount": "[parameters('clusterWorkerNodeCount')]",
"autoscale": {
"capacity": {
"minInstanceCount": 3,
"maxInstanceCount": 10
}
},
"hardwareProfile": {
"vmSize": "[parameters('WorkerNodeVirtualMachineSize')]"
},
"osProfile": {
"linuxOperatingSystemProfile": {
"username": "[parameters('sshUserName')]",
"password": "[parameters('sshPassword')]"
}
},
"virtualNetworkProfile": null,
"scriptActions": []
}
]
}
}
}
],
"outputs": {
"storage": {
"type": "object",
"value": "[reference(resourceId('Microsoft.Storage/storageAccounts', parameters('StorageAccountName')))]"
},
"cluster": {
"type": "object",
"value": "[reference(resourceId('Microsoft.HDInsight/clusters', parameters('clusterName')))]"
}
}
}
The deployment fails with InternalServerError, and the operation details show only "An error has occurred" with no other information.
Update: Ensure that the user-assigned identity has the Storage Blob Data Contributor role on your storage account; otherwise cluster creation will fail.
If you are using Azure Data Lake Storage Gen2, you may receive the error AmbariClusterCreationFailedErrorCode: "Internal server error occurred while processing the request. Please retry the request or contact support."
To resolve this issue, open the Azure portal, go to your storage account, and under Access Control (IAM) check that the user-assigned managed identity has been assigned the Storage Blob Data Contributor or Storage Blob Data Owner role. See Set up permissions for the managed identity on the Data Lake Storage Gen2 account for detailed instructions.
Make sure you have followed the necessary steps to configure a Data Lake Storage Gen2 account.
Reference: Use Azure Data Lake Storage Gen2 with Azure HDInsight clusters
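A possible culprit in the posted template, offered as a sketch rather than a confirmed fix: the resourceId and msiResourceId properties are set with reference(...), which returns a resource's properties object, whereas these fields expect resource ID strings as produced by resourceId(...). Data Lake Storage Gen2 accounts are also normally addressed through the dfs endpoint rather than blob. Under those assumptions, and reusing the parameter names from the question, the storage profile might look like this:

```json
{
  "storageProfile": {
    "storageaccounts": [
      {
        "name": "[concat(parameters('StorageAccountName'), '.dfs.core.windows.net')]",
        "isDefault": true,
        "fileSystem": "[parameters('filesystemname')]",
        "resourceId": "[resourceId('Microsoft.Storage/storageAccounts', parameters('StorageAccountName'))]",
        "msiResourceId": "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', parameters('UserAssignedIdentityName'))]"
      }
    ]
  }
}
```

Two further things may be worth checking, again as assumptions: the identity block appears to belong at the top level of the cluster resource, next to properties, not inside it; and the cluster's dependsOn does not include the role assignment, so cluster creation could start before the identity has access to the storage account.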

How to use existing scale set as cluster node in Azure Service Fabric cluster

I am trying to deploy a Service Fabric cluster through an ARM template and attach an existing scale set to it. The pipeline runs without errors, but when I open the Service Fabric cluster in the portal, its status is "Waiting for nodes". I don't know where I am making a mistake. I am using the same certificate thumbprint as the scale set, and my certificate is stored in Key Vault. Here is my ARM template:
{
"$schema": "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json",
"contentVersion": "1.0.0.0",
"parameters": {
"clusterName": {
"type": "string",
"defaultValue": "GEN-UNIQUE",
"metadata": {
"description": "Name of your cluster - Between 3 and 23 characters. Letters and numbers only"
}
},
"clusterLocation": {
"type": "string",
"defaultValue": "westus",
"metadata": {
"description": "Location of the Cluster"
}
},
"applicationStartPort": {
"type": "int",
"defaultValue": 20000
},
"applicationEndPort": {
"type": "int",
"defaultValue": 30000
},
"ephemeralStartPort": {
"type": "int",
"defaultValue": 49152
},
"ephemeralEndPort": {
"type": "int",
"defaultValue": 65534
},
"fabricTcpGatewayPort": {
"type": "int",
"defaultValue": 19000
},
"fabricHttpGatewayPort": {
"type": "int",
"defaultValue": 19080
},
"clusterProtectionLevel": {
"type": "string",
"allowedValues": [
"None",
"Sign",
"EncryptAndSign"
],
"defaultValue": "EncryptAndSign",
"metadata": {
"description": "Protection level.Three values are allowed - EncryptAndSign, Sign, None. It is best to keep the default of EncryptAndSign, unless you have a need not to"
}
},
"certificateThumbprint": {
"type": "string",
"defaultValue": "GEN-CUSTOM-DOMAIN-SSLCERT-THUMBPRINT",
"metadata": {
"description": "Certificate Thumbprint"
}
},
"certificateStoreValue": {
"defaultValue": "My",
"allowedValues": [
"My"
],
"type": "string",
"metadata": {
"description": "The store name where the cert will be deployed in the virtual machine"
}
},
"supportLogStorageAccountName": {
"type": "string",
"defaultValue": "[toLower( concat('sflogs', uniqueString(resourceGroup().id),'2'))]",
"metadata": {
"description": "Name for the storage account that contains support logs from the cluster"
}
},
"blobEndpoint":{
"type": "string"
},
"queueEndpoint":{
"type": "string"
},
"tableEndpoint":{
"type": "string"
},
"InstanceCount": {
"type": "int",
"defaultValue": 5,
"metadata": {
"description": "Instance count for node type"
}
},
"vmNodeTypeName": {
"type": "string"
},
"nodeTypes":{
"type": "array"
},
"lbIPName": {
"type": "string"
},
"fqdn":{
"type": "string"
},
"reliabilityLevel":{
"type": "string"
},
"upgradeMode":{
"type": "string"
}
},
"variables":{
"storageApiVersion": "2016-01-01",
"publicIPApiVersion": "2015-06-15"
},
"resources": [
{
"apiVersion": "2018-02-01",
"type": "Microsoft.ServiceFabric/clusters",
"name": "[parameters('clusterName')]",
"location": "[parameters('clusterLocation')]",
"dependsOn": [],
"properties": {
"addonFeatures": [
"DnsService"
],
"certificate": {
"thumbprint": "[parameters('certificateThumbprint')]",
"x509StoreName": "[parameters('certificateStoreValue')]"
},
"clientCertificateCommonNames": [],
"clientCertificateThumbprints": [],
"clusterState": "Default",
"diagnosticsStorageAccountConfig": {
"storageAccountName": "[parameters('supportLogStorageAccountName')]",
"protectedAccountKeyName": "StorageAccountKey1",
"blobEndpoint": "[parameters('blobEndpoint')]",
"queueEndpoint": "[parameters('queueEndpoint')]",
"tableEndpoint": "[parameters('tableEndpoint')]"
},
"fabricSettings": [
{
"parameters": [
{
"name": "ClusterProtectionLevel",
"value": "[parameters('clusterProtectionLevel')]"
}
],
"name": "Security"
}
],
"managementEndpoint": "[concat('https://',parameters('fqdn'),':',parameters('fabricHttpGatewayPort'))]",
"nodeTypes": "[parameters('nodeTypes')]",
"reliabilityLevel": "[parameters('reliabilityLevel')]",
"upgradeMode": "[parameters('upgradeMode')]"
}
}
]
}
For this deployment error, you can look through the problems and solutions in this blog.
It might be caused by a certificate thumbprint issue or a Key Vault issue.
If that doesn't help, try changing the VM sizes or the region of the nodes, or just rebuild like this.
For more on deploying a Service Fabric cluster with a Key Vault certificate, you could also refer to this article.
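One more thing worth checking for the "Waiting for nodes" state, sketched here under assumptions: each VM in the scale set joins the cluster through the Service Fabric node extension, whose clusterEndpoint, nodeTypeRef, and certificate settings have to match the cluster resource. The fragment below follows the general shape used in the public quickstart templates and reuses parameter names from the question. It omits other settings such as the data path and durability level, and it is not confirmed as the fix for this particular deployment.

```json
{
  "name": "[concat('ServiceFabricNodeVmExt_', parameters('vmNodeTypeName'))]",
  "properties": {
    "type": "ServiceFabricNode",
    "publisher": "Microsoft.Azure.ServiceFabric",
    "typeHandlerVersion": "1.0",
    "autoUpgradeMinorVersion": true,
    "settings": {
      "clusterEndpoint": "[reference(resourceId('Microsoft.ServiceFabric/clusters', parameters('clusterName')), '2018-02-01').clusterEndpoint]",
      "nodeTypeRef": "[parameters('vmNodeTypeName')]",
      "certificate": {
        "thumbprint": "[parameters('certificateThumbprint')]",
        "x509StoreName": "[parameters('certificateStoreValue')]"
      }
    }
  }
}
```

If the scale set was created before the cluster, its extension may point at a different cluster endpoint or node type name, in which case the nodes would never report in to the new cluster.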

Error with JsonADDomainExtension and copy loop

I have developed an ARM template that builds multiple VMs in parallel. Each machine has an OS disk and a vNIC. I can deploy the template with PowerShell, or upload it and deploy it using the "Deploy Custom Template" feature of the Azure Portal.
However, I cannot quite get the syntax right to add the JsonADDomainExtension with a copy loop so that all machines are joined to my domain in the same template.
At the moment all my resources are in one resource group. I have pre-deployed a vNET with a subnet, and a storage account.
Below is the ARM template that works for deploying multiple machines at once.
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters":
{
"adminUsername":
{
"type": "string",
"metadata": {"description": "Administrator username for the Virtual Machine."}
},
"adminPassword":
{
"type": "securestring",
"metadata": {"description": "Password for the Virtual Machine.Should contain any 3 of: 1 Lower Case, 1 Upper Case, 1 Number and 1 Special character."}
},
"numberOfInstances":
{
"type": "int",
"defaultValue": 3,
"minValue": 2,
"maxValue": 5,
"metadata": {"description": "Number of VMs to deploy, limit 5 since this sample is using a single storage account."}
},
"StartOfCount":
{
"type": "int",
"defaultValue": 1,
"metadata": {"description": "Number that you want to start the machine name numbering from, EG to create 5 VMs with the name starting at 045 (W2016SRV-045-vm) you would enter 45 here, or to start at 1 just enter 1 here, this will give you a machine name like 'W2016SRV-001-vm'."}
},
"vmNamePrefix":
{
"type": "string",
"defaultValue": "W2016SRV-",
"maxLength": 9,
"metadata": {"description": "The VM name prefix maximum 9 characters. This allows for three digits in the name and trailing '-vm'. EG: 'W2016SRV-045-vm'."}
},
"vmSize":
{
"type": "string",
"defaultValue": "Standard_B2ms",
"metadata":
{
"description": "The size(T-shirt) for the VM. (Standard_D8s_v3)",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Compute/virtualMachines/vmSize"}
}
},
"vmLocation":
{
"type": "string",
"defaultValue": "AustraliaSoutheast",
"metadata":
{
"description": "Location or datacenter where the VM will be placed.",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Azure/region"}
}
},
"domainToJoin":
{
"type": "string",
"metadata": {"description": "The FQDN of the AD domain"}
},
"domainUsername":
{
"type": "string",
"metadata": {"description": "Username of the account on the domain"}
},
"domainPassword":
{
"type": "securestring",
"metadata": {"description": "Password of the account on the domain"}
},
"ouPath":
{
"type": "string",
"metadata": {"description": "Specifies an organizational unit (OU) for the domain account. Enter the full distinguished name of the OU in quotation marks. Example: 'OU=testOU; DC=domain; DC=Domain; DC=com"}
},
"sizeOfDiskInGB":
{
"type": "int",
"defaultValue": 200,
"metadata": {"description": "The disk size for the OS drive in the VM, in GBs. Default Value is 200"}
},
"existingBootDiagStorageResourceGroup":
{
"type": "string",
"metadata": {"description": "Storage account resource group name where boot diagnostics will be stored"}
},
"existingBootDiagStorageName":
{
"type": "string",
"metadata":
{
"description": "Storage account name where boot diagnostics will be stored. It should be at the same location as the VM.",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Storage/storageAccounts"}
}
},
"existingvNetResourceGroup":
{
"type": "string",
"metadata":
{
"description": "Resource Group of the Existing Virtual Network.",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Resources/resourceGroups"}
}
},
"existingvNetName":
{
"type": "string",
"metadata":
{
"description": "Existing Virtual Network to connect to Network Interface to.It should be at the same location as the VM.",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Network/virtualNetworks"}
}
},
"subnetName":
{
"type": "string",
"metadata":
{
"description": "The Subnet for the VM.",
"SNC::Parameter::Metadata": {"referenceType": "Microsoft.Network/subNets"}
}
}
},
"variables":
{
"storageAccountType": "Standard_LRS",
"vnetID": "[resourceId(parameters('existingvNetResourceGroup'), 'Microsoft.Network/virtualNetworks', parameters('existingvNetName'))]",
"subnetRef": "[concat(variables('vnetID'),'/subnets/', parameters('subnetName'))]",
"StartOfCountString": "[String(Parameters('StartOfCount'))]"
},
"resources":
[
{
"apiVersion": "[providers('Microsoft.Network','networkInterfaces').apiVersions[0]]",
"type": "Microsoft.Network/networkInterfaces",
"name": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm-vNic')]",
"location": "[parameters('vmLocation')]",
"copy": {
"name": "nicLoop",
"count": "[parameters('numberOfInstances')]"
},
"properties":
{
"ipConfigurations":
[
{
"name": "Prod",
"properties":
{
"privateIPAllocationMethod": "Dynamic",
"subnet": {"id": "[variables('subnetRef')]"}
}
}
]
}
},
{
"apiVersion": "[providers('Microsoft.Compute','virtualMachines').apiVersions[0]]",
"type": "Microsoft.Compute/virtualMachines",
"name": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm')]",
"location": "[parameters('vmLocation')]",
"copy": {
"name": "vmLoop",
"count": "[parameters('numberOfInstances')]"
},
"dependsOn": ["nicLoop"],
"tags":
{
"Project": "***"
},
"properties":
{
"hardwareProfile": {"vmSize": "[parameters('vmSize')]"},
"osProfile":
{
"computerName": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm')]",
"adminUsername": "[parameters('adminUsername')]",
"adminPassword": "[parameters('adminPassword')]",
"windowsConfiguration": {"enableAutomaticUpdates": false}
},
"storageProfile":
{
"imageReference": {
"id": "/subscriptions/***/resourceGroups/***/providers/Microsoft.Compute/images/***"
},
"osDisk":
{
"name": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm-osDisk')]",
"createOption": "FromImage",
"diskSizeGB": "[parameters('sizeOfDiskInGB')]",
"caching": "ReadWrite",
"managedDisk": {"storageAccountType": "[variables('storageAccountType')]"}
}
},
"networkProfile": {"networkInterfaces": [{"id": "[resourceId('Microsoft.Network/networkInterfaces',concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm-vNic'))]"}]},
"diagnosticsProfile":
{
"bootDiagnostics":
{
"enabled": true,
"storageUri": "[reference(resourceId(parameters('existingBootDiagStorageResourceGroup'), 'Microsoft.Storage/storageAccounts', parameters('existingBootDiagStorageName')), '2015-06-15').primaryEndpoints['blob']]"
}
}
}
}
]
}
However, if I add this to the resources section at the bottom, it doesn't work.
{
"apiVersion": "2016-03-30",
"type": "Microsoft.Compute/virtualMachines/extensions",
"name": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(variables('StartOfCountString')),3, '0'), '/JoinDomain')]",
"location": "[resourceGroup().location]",
"copy": {
"name": "DomainJoinLoop",
"count": "[parameters('numberOfInstances')]"
},
"dependsOn": "[concat('Microsoft.Compute/virtualMachines/', concat(parameters('vmNamePrefix'), padLeft(copyIndex(variables('StartOfCountString')),3, '0')))]",
"properties": {
"publisher": "Microsoft.Compute",
"type": "JsonADDomainExtension",
"typeHandlerVersion": "1.3",
"autoUpgradeMinorVersion": true,
"settings": {
"Name": "[parameters('domainToJoin')]",
"User": "[concat(parameters('domainToJoin'), '\\', parameters('domainUsername'))]",
"OUPath": "[parameters('ouPath')]",
"Restart": "true",
"Options": "3"
},
"protectedsettings": {
"Password": "[parameters('domainPassword')]"
}
}
}
In the copyIndex portions of the code I've tried both variables and parameters to set the offset for the copyIndex. I've also tried hard coding it. I keep getting an error like this.
{"telemetryId":"649e0a7f-2c22-41f9-bede-c13618f5053a",
"bladeInstanceId":"Blade_b45c7efadff74340820345aa2e9d76c1_26_0",
"galleryItemId":"Microsoft.Template","createBlade": "DeployToAzure",
"code":"InvalidRequestContent", "message":"The request content was
invalid and could not be deserialized: 'Error converting value
\"[concat('Microsoft.Compute/virtualMachines/',
concat(parameters('vmNamePrefix'),
padLeft(copyIndex(variables('StartOfCountString')),3, '0')))]\" to
type 'System.String[]'. Path
'properties.template.resources[1].resources[0].dependsOn', line 259,
position 171.'."}
The reason that I want to set an offset for the copy index is so that I can build say ten machines like W2016SRV-001, 002, ... 010, and then later on build more machines starting at 011, 012, 013, etc.
There are other ways around it but I really want to be able to do this in an ARM Template and I'm not yet aware after all my extensive googling why it shouldn't be possible.
...............................................................................
OK, so after editing the template as suggested by @4c74356b41, I'm now getting a completely different error when I try to deploy the template through the portal.
{"telemetryId":"649e0a7f-2c22-41f9-bede-c13618f5053a",
"bladeInstanceId":"Blade_b45c7efadff74340820345aa2e9d76c1_26_0",
"galleryItemId":"Microsoft.Template","createBlade":"DeployToAzure",
"code":"InvalidTemplate",
"message":"Deployment template validation failed: 'The template resource
'[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'),
'-vm/JoinDomain')]'
at line '250' column '13' is not valid. Copying nested resources is not supported.
Please see https://aka.ms/arm-copy/#looping-on-a-nested-resource for usage details.'."}
So following the link in the error message and reading some more ... what I needed to do was drop the Domain Join Extension out as a child of the VM and have it as a top level resource. Woot woot it worked.
So now the end of my ARM Template looks like this.
*** VM bits are here ***
}
}
}, <-- end of the VM bits.
{
"apiVersion": "2016-03-30",
"type": "Microsoft.Compute/virtualMachines/extensions",
"name": "[concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm/JoinDomain')]",
"location": "[resourceGroup().location]",
"copy": {
"name": "DomainJoinLoop",
"count": "[parameters('numberOfInstances')]"
},
"dependsOn": [ "vmLoop" ],
"properties":
{
"publisher": "Microsoft.Compute",
"type": "JsonADDomainExtension",
"typeHandlerVersion": "1.3",
"autoUpgradeMinorVersion": true,
"settings":
{
"Name": "[parameters('domainToJoin')]",
"User": "[concat(parameters('domainToJoin'), '\\', parameters('domainUsername'))]",
"OUPath": "[parameters('ouPath')]",
"Restart": "true",
"Options": "3"
},
"protectedsettings": {"Password": "[parameters('domainPassword')]"}
}
}
],
"outputs":
{}
}
Multiple VMs deployed in parallel and all joined to the domain without issue.
So my lessons learned are these, and I hope they help someone else in the future.
You cannot use a copy loop on a nested child resource; define the extension as a top-level resource instead.
Make sure that you name your domain join extension properly.
Compare your vm name and extension name:
concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')),3, '0'), '-vm')
concat(parameters('vmNamePrefix'), padLeft(copyIndex(variables('StartOfCountString')),3, '0'), '/JoinDomain')
Besides using an unneeded variable, you are missing -vm. The extension name has to be in the format parent_name/extension_name; otherwise ARM wouldn't know which parent resource the extension belongs to (it must always belong to some VM).
Your dependsOn is also wrong; you can simplify it to this:
"dependsOn": [ "vmLoop" ]
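If you would rather have each extension wait only for its own VM instead of the whole vmLoop, a per-iteration dependency is one alternative. This is a sketch reusing the parameter names from the question; note that dependsOn must be a JSON array, which is what the System.String[] conversion error was complaining about.

```json
{
  "dependsOn": [
    "[resourceId('Microsoft.Compute/virtualMachines', concat(parameters('vmNamePrefix'), padLeft(copyIndex(parameters('StartOfCount')), 3, '0'), '-vm'))]"
  ]
}
```

The expression inside resourceId has to produce exactly the same name as the VM resource, including the -vm suffix, or the dependency will not resolve.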

Azure Kubernetes Service ARM template is not idempotent

I have created an ARM template to deploy an Azure Kubernetes Service instance, which I am trying to plug into a CI/CD pipeline in VSTS. On the first deployment, everything works as expected and the K8s cluster is created successfully. However, upon redeployment, the template fails the validation stage with the following error:
{
"message": "The template deployment 'Microsoft.Template' is not valid according to the validation procedure."
"details": [
{
"code":"PropertyChangeNotAllowed",
"message":"Provisioning of resource(s) for container service <cluster name> in resource group <resource group name> failed. Message:"
{
"code": "PropertyChangeNotAllowed",
"message": "Changing property 'linuxProfile.ssh.publicKeys.keyData' is not allowed.",
"target": "linuxProfile.ssh.publicKeys.keyData"
}
}
]
}
The template is therefore clearly not idempotent which completely dishonours the intended nature of ARM template deployments.
Has anyone managed to find a workaround for this?
The solution to this is to specify the SSH RSA Public Key as a template parameter and use it when configuring the Linux profile. I have posted my ARM template below:
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"clusterName": {
"type": "string",
"metadata": {
"description": "The name of the Kubernetes cluster."
}
},
"location": {
"type": "string",
"metadata": {
"description": "The data center in which to deploy the Kubernetes cluster."
}
},
"dnsPrefix": {
"type": "string",
"metadata": {
"description": "DNS prefix to use with hosted Kubernetes API server FQDN."
}
},
"osDiskSizeGB": {
"defaultValue": 32,
"minValue": 0,
"maxValue": 1023,
"type": "int",
"metadata": {
"description": "Disk size (in GB) to provision for each of the agent pool nodes. This value ranges from 0 to 1023. Specifying 0 will apply the default disk size for that agentVMSize."
}
},
"agentCount": {
"defaultValue": 1,
"minValue": 1,
"maxValue": 50,
"type": "int",
"metadata": {
"description": "The number of agent nodes for the cluster."
}
},
"agentVMSize": {
"defaultValue": "Standard_D1_v2",
"type": "string",
"metadata": {
"description": "The size of the Virtual Machine."
}
},
"servicePrincipalClientId": {
"type": "securestring",
"metadata": {
"description": "The Service Principal Client ID."
}
},
"servicePrincipalClientSecret": {
"type": "securestring",
"metadata": {
"description": "The Service Principal Client Secret."
}
},
"osType": {
"defaultValue": "Linux",
"allowedValues": [
"Linux"
],
"type": "string",
"metadata": {
"description": "The type of operating system."
}
},
"kubernetesVersion": {
"defaultValue": "1.10.6",
"type": "string",
"metadata": {
"description": "The version of Kubernetes."
}
},
"enableOmsAgent": {
"defaultValue": true,
"type": "bool",
"metadata": {
"description": "boolean flag to turn on and off of omsagent addon"
}
},
"enableHttpApplicationRouting": {
"defaultValue": true,
"type": "bool",
"metadata": {
"description": "boolean flag to turn on and off of http application routing"
}
},
"networkPlugin": {
"defaultValue": "kubenet",
"allowedValues": [
"azure",
"kubenet"
],
"type": "string",
"metadata": {
"description": "Network plugin used for building Kubernetes network."
}
},
"enableRBAC": {
"defaultValue": true,
"type": "bool",
"metadata": {
"description": "Flag to turn on/off RBAC"
}
},
"logAnalyticsWorkspaceName": {
"type": "string",
"metadata": {
"description": "Name of the log analytics workspace which will be used for container analytics"
}
},
"logAnalyticsWorkspaceLocation": {
"type": "string",
"metadata": {
"description": "The data center in which the log analytics workspace is deployed"
}
},
"logAnalyticsResourceGroup": {
"type": "string",
"metadata": {
"description": "The resource group in which the log analytics workspace is deployed"
}
},
"vmAdminUsername": {
"type": "string",
"metadata": {
"description": "User name for the Linux Virtual Machines."
}
},
"sshRsaPublicKey": {
"type": "securestring",
"metadata": {
"description": "Configure all linux machines with the SSH RSA public key string. Your key should include three parts, for example: 'ssh-rsa AAAAB...snip...UcyupgH azureuser@linuxvm'"
}
}
},
"variables": {
"logAnalyticsWorkspaceId": "[resourceId(parameters('logAnalyticsResourceGroup'), 'Microsoft.OperationalInsights/workspaces', parameters('logAnalyticsWorkspaceName'))]",
"containerInsightsName": "[concat(parameters('clusterName'),'-containerinsights')]"
},
"resources": [
{
"type": "Microsoft.ContainerService/managedClusters",
"name": "[parameters('clusterName')]",
"apiVersion": "2018-03-31",
"location": "[parameters('location')]",
"properties": {
"kubernetesVersion": "[parameters('kubernetesVersion')]",
"enableRBAC": "[parameters('enableRBAC')]",
"dnsPrefix": "[parameters('dnsPrefix')]",
"addonProfiles": {
"httpApplicationRouting": {
"enabled": "[parameters('enableHttpApplicationRouting')]"
},
"omsagent": {
"enabled": "[parameters('enableOmsAgent')]",
"config": {
"logAnalyticsWorkspaceResourceID": "[variables('logAnalyticsWorkspaceId')]"
}
}
},
"agentPoolProfiles": [
{
"name": "agentpool",
"osDiskSizeGB": "[parameters('osDiskSizeGB')]",
"count": "[parameters('agentCount')]",
"vmSize": "[parameters('agentVMSize')]",
"osType": "[parameters('osType')]",
"storageProfile": "ManagedDisks"
}
],
"linuxProfile": {
"adminUsername": "[parameters('vmAdminUsername')]",
"ssh": {
"publicKeys": [
{
"keyData": "[parameters('sshRsaPublicKey')]"
}
]
}
},
"servicePrincipalProfile": {
"clientId": "[parameters('servicePrincipalClientId')]",
"secret": "[parameters('servicePrincipalClientSecret')]"
},
"networkProfile": {
"networkPlugin": "[parameters('networkPlugin')]"
}
},
"dependsOn": [
"[concat('Microsoft.Resources/deployments/', 'SolutionDeployment')]"
]
},
{
"type": "Microsoft.Resources/deployments",
"name": "SolutionDeployment",
"apiVersion": "2017-05-10",
"resourceGroup": "[parameters('logAnalyticsResourceGroup')]",
"properties": {
"mode": "Incremental",
"template": {
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"resources": [
{
"apiVersion": "2015-11-01-preview",
"type": "Microsoft.OperationsManagement/solutions",
"location": "[parameters('logAnalyticsWorkspaceLocation')]",
"name": "[variables('containerInsightsName')]",
"properties": {
"workspaceResourceId": "[variables('logAnalyticsWorkspaceId')]"
},
"plan": {
"name": "[variables('containerInsightsName')]",
"product": "OMSGallery/ContainerInsights",
"promotionCode": "",
"publisher": "Microsoft"
}
}
]
}
}
}
],
"outputs": {
"controlPlaneFQDN": {
"type": "string",
"value": "[reference(concat('Microsoft.ContainerService/managedClusters/', parameters('clusterName'))).fqdn]"
},
"sshMaster0": {
"type": "string",
"value": "[concat('ssh ', parameters('vmAdminUsername'), '@', reference(concat('Microsoft.ContainerService/managedClusters/', parameters('clusterName'))).fqdn, ' -A -p 22')]"
}
}
}
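Since the error is raised whenever keyData differs from what the cluster was created with, the same key has to be supplied on every run. A minimal sketch of a parameters file that pins it is below. The file name azuredeploy.parameters.json and the azureuser value are assumptions, and the key is the placeholder from the parameter description rather than a real key. The other required parameters (clusterName, location, dnsPrefix, the service principal and Log Analytics values) are left out here and would still need to be supplied, for example as overrides in the pipeline.

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "vmAdminUsername": {
      "value": "azureuser"
    },
    "sshRsaPublicKey": {
      "value": "ssh-rsa AAAAB...snip...UcyupgH azureuser@linuxvm"
    }
  }
}
```

Keeping this file (or the equivalent pipeline variable) unchanged between releases means the keyData sent on redeployment matches what the cluster was created with.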

Cannot change agent VM count

I have an ACS Kubernetes cluster that was created with an agent count of 1. I went to the portal to increase the agent count to 2 and received a generic error saying the provisioning of resource(s) for container service failed.
Looking at the activity logs, there is a bit more information.
Write ContainerServices - PreconditionFailed - Provisioning of resource(s) for container service 'xxxxxxx' in resource group 'xxxxxxxx' failed.
Validate - InvalidTemplate - Deployment template validation failed: 'The resource 'Microsoft.Network/networkSecurityGroups/k8s-master-3E4D5818-nsg' is not defined in the template. Please see https://aka.ms/arm-template for usage details.'.
Trying to change it via the Azure CLI 2.0 also returns the same error.
Update: The cluster was stood up using an ARM template with a single container service resource based on the sample in the quickstart templates repo.
{
"$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
"contentVersion": "1.0.0.0",
"parameters": {
"dnsNamePrefix": {
"type": "string",
"metadata": {
"description": "Sets the Domain name prefix for the cluster. The concatenation of the domain name and the regionalized DNS zone make up the fully qualified domain name associated with the public IP address."
}
},
"agentCount": {
"type": "int",
"defaultValue": 1,
"metadata": {
"description": "The number of agents for the cluster. This value can be from 1 to 100 (note, for Kubernetes clusters you will also get 1 or 2 public agents in addition to these selected masters)"
},
"minValue":1,
"maxValue":100
},
"agentVMSize": {
"type": "string",
"defaultValue": "Standard_D2_v2",
"allowedValues": [
"Standard_A0", "Standard_A1", "Standard_A2", "Standard_A3", "Standard_A4", "Standard_A5",
"Standard_A6", "Standard_A7", "Standard_A8", "Standard_A9", "Standard_A10", "Standard_A11",
"Standard_D1", "Standard_D2", "Standard_D3", "Standard_D4",
"Standard_D11", "Standard_D12", "Standard_D13", "Standard_D14",
"Standard_D1_v2", "Standard_D2_v2", "Standard_D3_v2", "Standard_D4_v2", "Standard_D5_v2",
"Standard_D11_v2", "Standard_D12_v2", "Standard_D13_v2", "Standard_D14_v2",
"Standard_G1", "Standard_G2", "Standard_G3", "Standard_G4", "Standard_G5",
"Standard_DS1", "Standard_DS2", "Standard_DS3", "Standard_DS4",
"Standard_DS11", "Standard_DS12", "Standard_DS13", "Standard_DS14",
"Standard_GS1", "Standard_GS2", "Standard_GS3", "Standard_GS4", "Standard_GS5"
],
"metadata": {
"description": "The size of the Virtual Machine."
}
},
"linuxAdminUsername": {
"type": "string",
"defaultValue": "azureuser",
"metadata": {
"description": "User name for the Linux Virtual Machines."
}
},
"orchestratorType": {
"type": "string",
"defaultValue": "Kubernetes",
"allowedValues": [
"Kubernetes",
"DCOS",
"Swarm"
],
"metadata": {
"description": "The type of orchestrator used to manage the applications on the cluster."
}
},
"masterCount": {
"type": "int",
"defaultValue": 1,
"allowedValues": [
1
],
"metadata": {
"description": "The number of Kubernetes masters for the cluster."
}
},
"sshRSAPublicKey": {
"type": "string",
"metadata": {
"description": "Configure all linux machines with the SSH RSA public key string. Your key should include three parts, for example 'ssh-rsa AAAAB...snip...UcyupgH azureuser@linuxvm'"
}
},
"servicePrincipalClientId": {
"metadata": {
"description": "Client ID (used by cloudprovider)"
},
"type": "securestring",
"defaultValue": "n/a"
},
"servicePrincipalClientSecret": {
"metadata": {
"description": "The Service Principal Client Secret."
},
"type": "securestring",
"defaultValue": "n/a"
}
},
"variables": {
"adminUsername":"[parameters('linuxAdminUsername')]",
"agentCount":"[parameters('agentCount')]",
"agentsEndpointDNSNamePrefix":"[concat(parameters('dnsNamePrefix'),'agents')]",
"agentVMSize":"[parameters('agentVMSize')]",
"masterCount":"[parameters('masterCount')]",
"mastersEndpointDNSNamePrefix":"[concat(parameters('dnsNamePrefix'),'mgmt')]",
"orchestratorType":"[parameters('orchestratorType')]",
"sshRSAPublicKey":"[parameters('sshRSAPublicKey')]",
"servicePrincipalClientId": "[parameters('servicePrincipalClientId')]",
"servicePrincipalClientSecret": "[parameters('servicePrincipalClientSecret')]",
"useServicePrincipalDictionary": {
"DCOS": 0,
"Swarm": 0,
"Kubernetes": 1
},
"useServicePrincipal": "[variables('useServicePrincipalDictionary')[variables('orchestratorType')]]",
"servicePrincipalFields": [
null,
{
"ClientId": "[parameters('servicePrincipalClientId')]",
"Secret": "[parameters('servicePrincipalClientSecret')]"
}
]
},
"resources": [
{
"apiVersion": "2016-09-30",
"type": "Microsoft.ContainerService/containerServices",
"location": "[resourceGroup().location]",
"name":"[resourceGroup().name]",
"properties": {
"orchestratorProfile": {
"orchestratorType": "[variables('orchestratorType')]"
},
"masterProfile": {
"count": "[variables('masterCount')]",
"dnsPrefix": "[variables('mastersEndpointDNSNamePrefix')]"
},
"agentPoolProfiles": [
{
"name": "agentpools",
"count": "[variables('agentCount')]",
"vmSize": "[variables('agentVMSize')]",
"dnsPrefix": "[variables('agentsEndpointDNSNamePrefix')]"
}
],
"linuxProfile": {
"adminUsername": "[variables('adminUsername')]",
"ssh": {
"publicKeys": [
{
"keyData": "[variables('sshRSAPublicKey')]"
}
]
}
},
"servicePrincipalProfile": "[variables('servicePrincipalFields')[variables('useServicePrincipal')]]"
}
}
],
"outputs": {
"masterFQDN": {
"type": "string",
"value": "[reference(concat('Microsoft.ContainerService/containerServices/', resourceGroup().name)).masterProfile.fqdn]"
},
"sshMaster0": {
"type": "string",
"value": "[concat('ssh ', variables('adminUsername'), '@', reference(concat('Microsoft.ContainerService/containerServices/', resourceGroup().name)).masterProfile.fqdn, ' -A -p 22')]"
},
"agentFQDN": {
"type": "string",
"value": "[reference(concat('Microsoft.ContainerService/containerServices/', resourceGroup().name)).agentPoolProfiles[0].fqdn]"
}
}
}
This is a known service issue for old clusters. A fix is currently rolling out and is being tracked in this GitHub issue: https://github.com/Azure/ACS/issues/16
Jack (a dev on the ACS team)
I tested this template in my lab, but I could not reproduce your error.
Please try using Azure Resource Explorer to edit the agent pool count.
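For reference, the part of the container service resource that would be edited is the agentPoolProfiles entry. The fragment below is only a sketch built from the property names in the template above, with count raised from 1 to 2. The resource as shown in Resource Explorer will carry more properties than this, and the dnsPrefix value here is a placeholder.

```json
{
  "properties": {
    "agentPoolProfiles": [
      {
        "name": "agentpools",
        "count": 2,
        "vmSize": "Standard_D2_v2",
        "dnsPrefix": "<dnsNamePrefix>agents"
      }
    ]
  }
}
```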

Resources