How to send AKS application logs to a Log Analytics workspace?

I have created an AKS cluster using the following Terraform code
resource "azurerm_virtual_network" "test" {
name = var.virtual_network_name
location = azurerm_resource_group.rg.location
resource_group_name = azurerm_resource_group.rg.name
address_space = [var.virtual_network_address_prefix]
subnet {
name = var.aks_subnet_name
address_prefix = var.aks_subnet_address_prefix
}
tags = var.tags
}
data "azurerm_subnet" "kubesubnet" {
name = var.aks_subnet_name
virtual_network_name = azurerm_virtual_network.test.name
resource_group_name = azurerm_resource_group.rg.name
depends_on = [azurerm_virtual_network.test]
}
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_name
location = azurerm_resource_group.rg.location
dns_prefix = var.aks_dns_prefix
resource_group_name = azurerm_resource_group.rg.name
http_application_routing_enabled = false
linux_profile {
admin_username = var.vm_user_name
ssh_key {
key_data = file(var.public_ssh_key_path)
}
}
default_node_pool {
name = "agentpool"
node_count = var.aks_agent_count
vm_size = var.aks_agent_vm_size
os_disk_size_gb = var.aks_agent_os_disk_size
vnet_subnet_id = data.azurerm_subnet.kubesubnet.id
}
service_principal {
client_id = local.client_id
client_secret = local.client_secret
}
network_profile {
network_plugin = "azure"
dns_service_ip = var.aks_dns_service_ip
docker_bridge_cidr = var.aks_docker_bridge_cidr
service_cidr = var.aks_service_cidr
}
# Enable Azure AD role-based access control for the cluster
azure_active_directory_role_based_access_control {
managed = var.azure_active_directory_role_based_access_control_managed
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = var.azure_rbac_enabled
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [data.azurerm_subnet.kubesubnet,module.log_analytics_workspace]
tags = var.tags
}
and followed the steps below to install Istio as per the Istio documentation
#Prerequisites
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm repo update
#create namespace
kubectl create namespace istio-system
# helm install istio-base and istiod
helm install istio-base istio/base -n istio-system
helm install istiod istio/istiod -n istio-system --wait
# Check the installation status
helm status istiod -n istio-system
#create namespace and enable istio-injection for envoy proxy containers
kubectl create namespace istio-ingress
kubectl label namespace istio-ingress istio-injection=enabled
## helm install istio-ingress for traffic management
helm install istio-ingress istio/gateway -n istio-ingress --wait
## Mark the default namespace as istio-injection=enabled
kubectl label namespace default istio-injection=enabled
## Install the App and Gateway
kubectl apply -f bookinfo.yaml
kubectl apply -f bookinfo-gateway.yaml
and I could access the application as shown below.
This application prints its logs to the console.
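For reference, the same console logs can be tailed with kubectl (a rough sketch; it assumes the Bookinfo productpage labels and the istio-ingress release name from the commands above):
# External IP of the Istio ingress gateway (service name = Helm release name)
kubectl get svc istio-ingress -n istio-ingress -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
# Tail the application logs from the app container, skipping the istio-proxy sidecar
kubectl logs -n default -l app=productpage -c productpage --tail=50 -f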
I have created the Log Analytics workspace as mentioned below
# Create Log Analytics Workspace
module "log_analytics_workspace" {
source = "./modules/log_analytics_workspace"
count = var.enable_log_analytics_workspace == true ? 1 : 0
app_or_service_name = "log"
subscription_type = var.subscription_type
environment = var.environment
resource_group_name = azurerm_resource_group.rg.name
location = var.location
instance_number = var.instance_number
sku = var.log_analytics_workspace_sku
retention_in_days = var.log_analytics_workspace_retention_in_days
tags = var.tags
}
and set the log_analytics_workspace_id in the oms_agent section of azurerm_kubernetes_cluster to collect the logs and ship them to Log Analytics.
Is that all that is needed to ship the application logs to the Log Analytics workspace? Somehow I don't see the logs flowing. Am I missing something?
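For anyone checking the same thing, this is roughly how the pipeline can be verified end to end (a sketch, not a confirmed fix; it assumes the Container insights agent pods are named ama-logs or omsagent depending on agent version, that the ContainerLogV2 / ContainerLog tables are the destination, and the workspace GUID is a placeholder for the workspace's customer ID):
# 1. Confirm the monitoring agent pods are running in the cluster
kubectl get pods -n kube-system | grep -E 'ama-logs|omsagent'
# 2. Query the workspace for container logs (GUID below is a placeholder)
az monitor log-analytics query \
  --workspace "<log-analytics-workspace-guid>" \
  --analytics-query "ContainerLogV2 | where PodNamespace == 'default' | take 10"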

Related

AKS Service deployment done and can't reach external IPs

I've created a Kubernetes cluster in Azure using the following Terraform
# Locals block for hardcoded names
locals {
backend_address_pool_name = "appgateway-beap"
frontend_port_name = "appgateway-feport"
frontend_ip_configuration_name = "appgateway-feip"
http_setting_name = "appgateway-be-htst"
listener_name = "appgateway-httplstn"
request_routing_rule_name = "appgateway-rqrt"
app_gateway_subnet_name = "appgateway-subnet"
}
data "azurerm_subnet" "aks-subnet" {
name = "aks-subnet"
virtual_network_name = "np-dat-spoke-vnet"
resource_group_name = "ipz12-dat-np-connect-rg"
}
data "azurerm_subnet" "appgateway-subnet" {
name = "appgateway-subnet"
virtual_network_name = "np-dat-spoke-vnet"
resource_group_name = "ipz12-dat-np-connect-rg"
}
# Create Resource Group for Kubernetes Cluster
module "resource_group_kubernetes_cluster" {
source = "./modules/resource_group"
count = var.enable_kubernetes == true ? 1 : 0
#name_override = "rg-aks-spoke-dev-westus3-001"
app_or_service_name = "aks" # var.app_or_service_name
subscription_type = var.subscription_type # "spoke"
environment = var.environment # "dev"
location = var.location # "westus3"
instance_number = var.instance_number # "001"
tags = var.tags
}
resource "azurerm_user_assigned_identity" "identity_uami" {
location = var.location
name = "appgw-uami"
resource_group_name = module.resource_group_kubernetes_cluster[0].name
}
# Application Gateway Public Ip
resource "azurerm_public_ip" "test" {
name = "publicIp1"
location = var.location
resource_group_name = module.resource_group_kubernetes_cluster[0].name
allocation_method = "Static"
sku = "Standard"
}
resource "azurerm_application_gateway" "network" {
name = var.app_gateway_name
resource_group_name = module.resource_group_kubernetes_cluster[0].name
location = var.location
sku {
name = var.app_gateway_sku
tier = "Standard_v2"
capacity = 2
}
identity {
type = "UserAssigned"
identity_ids = [
azurerm_user_assigned_identity.identity_uami.id
]
}
gateway_ip_configuration {
name = "appGatewayIpConfig"
subnet_id = data.azurerm_subnet.appgateway-subnet.id
}
frontend_port {
name = local.frontend_port_name
port = 80
}
frontend_port {
name = "httpsPort"
port = 443
}
frontend_ip_configuration {
name = local.frontend_ip_configuration_name
public_ip_address_id = azurerm_public_ip.test.id
}
backend_address_pool {
name = local.backend_address_pool_name
}
backend_http_settings {
name = local.http_setting_name
cookie_based_affinity = "Disabled"
port = 80
protocol = "Http"
request_timeout = 1
}
http_listener {
name = local.listener_name
frontend_ip_configuration_name = local.frontend_ip_configuration_name
frontend_port_name = local.frontend_port_name
protocol = "Http"
}
request_routing_rule {
name = local.request_routing_rule_name
rule_type = "Basic"
http_listener_name = local.listener_name
backend_address_pool_name = local.backend_address_pool_name
backend_http_settings_name = local.http_setting_name
priority = 100
}
tags = var.tags
depends_on = [azurerm_public_ip.test]
lifecycle {
ignore_changes = [
backend_address_pool,
backend_http_settings,
request_routing_rule,
http_listener,
probe,
tags,
frontend_port
]
}
}
# Create the Azure Kubernetes Service (AKS) Cluster
resource "azurerm_kubernetes_cluster" "kubernetes_cluster" {
count = var.enable_kubernetes == true ? 1 : 0
name = "aks-prjx-${var.subscription_type}-${var.environment}-${var.location}-${var.instance_number}"
location = var.location
resource_group_name = module.resource_group_kubernetes_cluster[0].name # "rg-aks-spoke-dev-westus3-001"
dns_prefix = "dns-aks-prjx-${var.subscription_type}-${var.environment}-${var.location}-${var.instance_number}" #"dns-prjxcluster"
private_cluster_enabled = false
local_account_disabled = true
default_node_pool {
name = "npprjx${var.subscription_type}" #"prjxsyspool" # NOTE: "name must start with a lowercase letter, have max length of 12, and only have characters a-z0-9."
vm_size = "Standard_B8ms"
vnet_subnet_id = data.azurerm_subnet.aks-subnet.id
# zones = ["1", "2", "3"]
enable_auto_scaling = true
max_count = 3
min_count = 1
# node_count = 3
os_disk_size_gb = 50
type = "VirtualMachineScaleSets"
enable_node_public_ip = false
enable_host_encryption = false
node_labels = {
"node_pool_type" = "npprjx${var.subscription_type}"
"node_pool_os" = "linux"
"environment" = "${var.environment}"
"app" = "prjx_${var.subscription_type}_app"
}
tags = var.tags
}
ingress_application_gateway {
gateway_id = azurerm_application_gateway.network.id
}
# Enable Azure AD role-based access control for the cluster
azure_active_directory_role_based_access_control {
managed = true
admin_group_object_ids = var.active_directory_role_based_access_control_admin_group_object_ids
azure_rbac_enabled = true #false
}
network_profile {
network_plugin = "azure"
network_policy = "azure"
outbound_type = "userDefinedRouting"
}
identity {
type = "SystemAssigned"
}
oms_agent {
log_analytics_workspace_id = module.log_analytics_workspace[0].id
}
timeouts {
create = "20m"
delete = "20m"
}
depends_on = [
azurerm_application_gateway.network
]
}
and provided the necessary permissions
# Get the AKS Agent Pool SystemAssigned Identity
data "azurerm_user_assigned_identity" "aks-identity" {
name = "${azurerm_kubernetes_cluster.kubernetes_cluster[0].name}-agentpool"
resource_group_name = "MC_${module.resource_group_kubernetes_cluster[0].name}_aks-prjx-spoke-dev-eastus-001_eastus"
}
# Get the AKS SystemAssigned Identity
data "azuread_service_principal" "aks-sp" {
display_name = azurerm_kubernetes_cluster.kubernetes_cluster[0].name
}
# Provide ACR Pull permission to AKS SystemAssigned Identity
resource "azurerm_role_assignment" "acrpull_role" {
scope = module.container_registry[0].id
role_definition_name = "AcrPull"
principal_id = data.azurerm_user_assigned_identity.aks-identity.principal_id
skip_service_principal_aad_check = true
depends_on = [
data.azurerm_user_assigned_identity.aks-identity
]
}
resource "azurerm_role_assignment" "aks_id_network_contributor_subnet" {
scope = data.azurerm_subnet.aks-subnet.id
role_definition_name = "Network Contributor"
principal_id = data.azurerm_user_assigned_identity.aks-identity.principal_id
depends_on = [data.azurerm_user_assigned_identity.aks-identity]
}
resource "azurerm_role_assignment" "akssp_network_contributor_subnet" {
scope = data.azurerm_subnet.aks-subnet.id
role_definition_name = "Network Contributor"
principal_id = data.azuread_service_principal.aks-sp.object_id
depends_on = [data.azuread_service_principal.aks-sp]
}
resource "azurerm_role_assignment" "aks_id_contributor_agw" {
scope = data.azurerm_subnet.appgateway-subnet.id
role_definition_name = "Network Contributor"
principal_id = data.azurerm_user_assigned_identity.aks-identity.principal_id
depends_on = [data.azurerm_user_assigned_identity.aks-identity]
}
resource "azurerm_role_assignment" "akssp_contributor_agw" {
scope = data.azurerm_subnet.appgateway-subnet.id
role_definition_name = "Network Contributor"
principal_id = data.azuread_service_principal.aks-sp.object_id
depends_on = [data.azuread_service_principal.aks-sp]
}
resource "azurerm_role_assignment" "aks_ingressid_contributor_on_agw" {
scope = azurerm_application_gateway.network.id
role_definition_name = "Contributor"
principal_id = azurerm_kubernetes_cluster.kubernetes_cluster[0].ingress_application_gateway[0].ingress_application_gateway_identity[0].object_id
depends_on = [azurerm_application_gateway.network,azurerm_kubernetes_cluster.kubernetes_cluster]
skip_service_principal_aad_check = true
}
resource "azurerm_role_assignment" "aks_ingressid_contributor_on_uami" {
scope = azurerm_user_assigned_identity.identity_uami.id
role_definition_name = "Contributor"
principal_id = azurerm_kubernetes_cluster.kubernetes_cluster[0].ingress_application_gateway[0].ingress_application_gateway_identity[0].object_id
depends_on = [azurerm_application_gateway.network,azurerm_kubernetes_cluster.kubernetes_cluster]
skip_service_principal_aad_check = true
}
resource "azurerm_role_assignment" "uami_contributor_on_agw" {
scope = azurerm_application_gateway.network.id
role_definition_name = "Contributor"
principal_id = azurerm_user_assigned_identity.identity_uami.principal_id
depends_on = [azurerm_application_gateway.network,azurerm_user_assigned_identity.identity_uami]
skip_service_principal_aad_check = true
}
and deployed the application mentioned below
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: aks-helloworld-two
  template:
    metadata:
      labels:
        app: aks-helloworld-two
    spec:
      containers:
      - name: aks-helloworld-two
        image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
        ports:
        - containerPort: 80
        env:
        - name: TITLE
          value: "AKS Ingress Demo"
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: aks-helloworld-two
An external IP got assigned;
however, I am not able to access the external IP.
Note: I have not deployed a separate ingress controller as mentioned in the Microsoft article, because I am not sure it is required.
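One thing that can be checked is whether the AGIC add-on pods are running, since the ingress_application_gateway block was enabled on the cluster (a quick check, assuming the add-on's default ingress-appgw labels in kube-system):
# The ingress Application Gateway add-on runs in kube-system; no separate controller install is needed
kubectl get pods -n kube-system -l app=ingress-appgw
kubectl logs -n kube-system -l app=ingress-appgw --tail=20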
I tried to reproduce the same setup in my environment: an AKS cluster with an Application Gateway.
Follow the Stack link to create a Kubernetes Service cluster with the ingress Application Gateway.
If you are unable to access your application through the external load balancer IP after deploying to Azure Kubernetes Service (AKS), verify the following settings in the AKS cluster.
1. Check the status of the load balancer service with the command below:
kubectl get service <your service name>
Make sure that the EXTERNAL-IP field is not stuck in the Pending state.
2. Verify the network security group associated with the load balancer and make sure it allows traffic on the desired port.
Follow these steps to check the NSG security rules of the AKS cluster:
Go to Azure Portal > Kubernetes services > select your Kubernetes service > Properties > open the resource group listed under Infrastructure resource group > Overview > select your NSG.
I disabled the inbound HTTP rule in the network security group for testing and got the same error.
Application status once port 80 is disabled in the NSG:
3. Check the routing rules on your virtual network and make sure that traffic is being forwarded from the load balancer.
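The same checks can be done from the CLI (resource names below are placeholders in the cluster's node resource group; the effective-route check matters here because the cluster uses outbound_type = "userDefinedRouting"):
# List the inbound rules of the NSG in the node resource group
az network nsg rule list --resource-group <node-resource-group> --nsg-name <nsg-name> -o table
# Inspect the effective routes applied to a node NIC
az network nic show-effective-route-table --resource-group <node-resource-group> --name <node-nic-name> -o table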

Terraform Azure VM doesn't recreate the instance even after changing custom_data

I am trying to provision an Azure virtual machine and I need to install kubectl on it, so I am using a bash script and passing it to the VM's custom_data section.
Everything works fine - it provisions the VM and installs kubectl.
The problem is that if I modify the bash script and run terraform apply, it shows no changes; it doesn't detect that the bash script changed.
Here's my code snippet. Can someone help me understand this?
locals {
admin_username = "myjumphost"
admin_password = "mypassword!610542"
}
resource "azurerm_virtual_machine" "vm" {
name = var.vm_name
location = var.location
resource_group_name = var.rg_name
network_interface_ids = var.nic_id
vm_size = var.vm_size
delete_os_disk_on_termination = true
delete_data_disks_on_termination = true
storage_image_reference {
publisher = var.storage_image_reference.publisher
offer = var.storage_image_reference.offer
sku = var.storage_image_reference.sku
version = var.storage_image_reference.version
}
storage_os_disk {
name = var.storage_os_disk.name
caching = var.storage_os_disk.caching
create_option = var.storage_os_disk.create_option
managed_disk_type = var.storage_os_disk.managed_disk_type
}
os_profile {
computer_name = var.vm_name
admin_username = local.admin_username
admin_password = local.admin_password
custom_data = file("${path.module}/${var.custom_data_script}")
}
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
key_data = file("${path.module}/${var.ssh_public_key}")
path = "/home/${local.admin_username}/.ssh/authorized_keys"
}
}
tags = merge(var.common_tags)
}
and my script install.sh:
#!/bin/bash
# install Kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client --output=yaml > /tmp/kubectl_version.yaml
# Install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Try the refresh command; it is backwards compatible. Usually it will not be necessary, because plan executes the same refresh. Most of the confusion is caused by bash or PowerShell scripts that run as a cloud-init script on the backend and implement the functionality; we can cross-check that in the activity logs.
terraform apply -refresh-only -auto-approve
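A refresh only re-reads the real infrastructure into state; if the goal is to actually rebuild the VM so the changed script runs again, a targeted replacement is another option (a suggestion on my side, assuming Terraform v0.15.2 or later and the resource address from the question):
# Force recreation of the VM so cloud-init runs the updated custom_data
terraform apply -replace="azurerm_virtual_machine.vm"
# On older Terraform versions the equivalent is:
# terraform taint azurerm_virtual_machine.vm && terraform apply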
I have tried to replicate the same with the code base below.
Main tf file as follows:
resource "azurerm_resource_group" "example" {
name = "v-swarna-mindtree"
location = "Germany West Central"
}
data "azuread_client_config" "current" {}
resource "azurerm_virtual_network" "puvnet" {
name = "Public_VNET"
resource_group_name = azurerm_resource_group.example.name
location = "Germany West Central"
address_space = ["10.19.0.0/16"]
dns_servers = ["10.19.0.4", "10.19.0.5"]
}
resource "azurerm_subnet" "osubnet" {
name = "Outer_Subnet"
resource_group_name = azurerm_resource_group.example.name
address_prefixes = ["10.19.1.0/24"]
virtual_network_name = azurerm_virtual_network.puvnet.name
}
resource "azurerm_network_interface" "main" {
name = "testdemo"
location = "Germany West Central"
resource_group_name = azurerm_resource_group.example.name
ip_configuration {
name = "testconfiguration1"
subnet_id = azurerm_subnet.osubnet.id
private_ip_address_allocation = "Dynamic"
}
}
resource "azurerm_virtual_machine" "main" {
name = "vmjumphost"
location = "Germany West Central"
resource_group_name = azurerm_resource_group.example.name
network_interface_ids = [azurerm_network_interface.main.id]
//vm_size = "Standard_A1_v2"
vm_size ="Standard_DS2_v2"
storage_image_reference {
offer = "0001-com-ubuntu-server-focal"
publisher = "Canonical"
sku = "20_04-lts-gen2"
version = "latest"
}
storage_os_disk {
name = "myosdisk2"
caching = "ReadWrite"
create_option = "FromImage"
managed_disk_type = "Standard_LRS"
}
os_profile {
computer_name = "vm-swarnademo"
admin_username = "testadmin"
admin_password = "Password1234!"
// custom_data = file("${path.module}/${var.custom_data_script}")
custom_data = file("install.sh")
}
os_profile_linux_config {
disable_password_authentication = false
# ssh_keys {
# key_data = file("${path.module}/${var.ssh_public_key}")
# path = "/home/${local.admin_username}/.ssh/authorized_keys"
# }
}
tags = {
environment = "staging"
}
}
install.sh file as follows:
#!/bin/bash
# install Kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
kubectl version --client --output=yaml > /tmp/kubectl_version.yaml
#testing by adding command -start
#sudo apt-get -y update
#testing by adding command -End
#Install Azure CLI
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
Then I tried updating the script with the commands below:
#testing by adding command -start
#sudo apt-get -y update
#testing by adding command -End
Step 1:
Running plan and apply created all the resources in the portal.
Step 2:
Updated the script by enabling the commented command:
#testing by adding command -start
sudo apt-get -y update
#testing by adding command -End
Upon running plan and apply again, it refreshed the state of the virtual machine and picked up the changes.
Verification from Activity Log:
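The script execution can also be verified from inside the VM, since custom_data is handed to cloud-init on Ubuntu images (standard log locations, not specific to this setup):
# SSH to the VM, then check whether cloud-init ran the script
sudo cat /var/log/cloud-init-output.log
ls -l /usr/local/bin/kubectl /tmp/kubectl_version.yaml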

Helm package deployment using Terraform

I have set up my Azure Kubernetes cluster using Terraform and it is working well.
I am trying to deploy packages using Helm, but the deployment fails with the error below.
Error: chart "stable/nginx-ingress" not found in https://kubernetes-charts.storage.googleapis.com repository
Note: I tried other packages as well and cannot deploy them through the Terraform resource; the Terraform code is below. Deploying a local Helm package with the helm command works, so I think the issue is with the Terraform Helm resources. "nginx" is just a sample package - I am not able to deploy any package using Terraform.
resource "azurerm_kubernetes_cluster" "k8s" {
name = var.aks_cluster_name
location = var.location
resource_group_name = var.resource_group_name
dns_prefix = var.aks_dns_prefix
kubernetes_version = "1.19.0"
# private_cluster_enabled = true
linux_profile {
admin_username = var.aks_admin_username
ssh_key {
key_data = var.aks_ssh_public_key
}
}
default_node_pool {
name = var.aks_node_pool_name
enable_auto_scaling = true
node_count = var.aks_agent_count
min_count = var.aks_min_agent_count
max_count = var.aks_max_agent_count
vm_size = var.aks_node_pool_vm_size
}
service_principal {
client_id = var.client_id
client_secret = var.client_secret
}
# tags = data.azurerm_resource_group.rg.tags
}
provider "helm" {
version = "1.3.2"
kubernetes {
host = azurerm_kubernetes_cluster.k8s.kube_config[0].host
client_key = base64decode(azurerm_kubernetes_cluster.k8s.kube_config[0].client_key)
client_certificate = base64decode(azurerm_kubernetes_cluster.k8s.kube_config[0].client_certificate)
cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.k8s.kube_config[0].cluster_ca_certificate)
load_config_file = false
}
}
resource "helm_release" "nginx-ingress" {
name = "nginx-ingress-internal"
repository = "https://kubernetes-charts.storage.googleapis.com"
chart = "stable/nginx-ingress"
set {
name = "rbac.create"
value = "true"
}
}
You should skip stable in the chart name: it is a repository name, but you have no Helm repositories defined. Your resource should look like this:
resource "helm_release" "nginx-ingress" {
name = "nginx-ingress-internal"
repository = "https://kubernetes-charts.storage.googleapis.com"
chart = "nginx-ingress"
...
}
which is equivalent to the helm command:
helm install nginx-ingress-internal nginx-ingress --repo https://kubernetes-charts.storage.googleapis.com
Alternatively, you can define repositories with repository_config_path.
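As far as I understand, that setting points at the same repositories file Helm manages on the CLI, so the repository can be registered like this (the repo name stable is only an assumption to match the original chart reference):
# Register the repository so charts can be referenced as stable/<chart>
helm repo add stable https://kubernetes-charts.storage.googleapis.com
helm repo update
# Show where Helm keeps its repositories.yaml (the file repository_config_path expects)
helm env | grep HELM_REPOSITORY_CONFIG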

Terraform - AKS Private Cloud | Infinite wait on helm release

I am trying to create a Private Cloud on AKS with Terraform.
The public route seemed to work fine, and I am adding the security pieces step by step.
After adding the networking resources (azurerm_virtual_network, azurerm_subnet), it seems to hang my Helm deployment.
There are no logs; it is just an infinite wait.
helm_release.ingress: Still creating... [11m0s elapsed] (this is a simple NGINX Ingress Controller)
resource "azurerm_virtual_network" "foo_network" {
name = "${var.prefix}-network"
location = azurerm_resource_group.foo_group.location
resource_group_name = azurerm_resource_group.foo_group.name
address_space = ["10.1.0.0/16"]
}
resource "azurerm_subnet" "internal" {
name = "internal"
virtual_network_name = azurerm_virtual_network.foo_network.name
resource_group_name = azurerm_resource_group.foo_group.name
address_prefixes = ["10.1.0.0/22"]
}
Any pointers on how I should debug this? The lack of logs is making it difficult to understand.
Complete Script
provider "azurerm" {
features {}
}
resource "azurerm_resource_group" "foo" {
name = "${var.prefix}-k8s-resources"
location = var.location
}
resource "azurerm_kubernetes_cluster" "foo" {
name = "${var.prefix}-k8s"
location = azurerm_resource_group.foo.location
resource_group_name = azurerm_resource_group.foo.name
dns_prefix = "${var.prefix}-k8s"
default_node_pool {
name = "system"
node_count = 1
vm_size = "Standard_D4s_v3"
}
identity {
type = "SystemAssigned"
}
addon_profile {
aci_connector_linux {
enabled = false
}
azure_policy {
enabled = false
}
http_application_routing {
enabled = false
}
kube_dashboard {
enabled = true
}
oms_agent {
enabled = false
}
}
}
provider "kubernetes" {
version = "~> 1.11.3"
load_config_file = false
host = azurerm_kubernetes_cluster.foo.kube_config.0.host
username = azurerm_kubernetes_cluster.foo.kube_config.0.username
password = azurerm_kubernetes_cluster.foo.kube_config.0.password
cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.foo.kube_config.0.cluster_ca_certificate)
}
provider "helm" {
# Use provider with Helm 3.x support
version = "~> 1.2.2"
}
resource "null_resource" "configure_kubectl" {
provisioner "local-exec" {
command = "az aks get-credentials --resource-group ${azurerm_resource_group.foo.name} --name ${azurerm_kubernetes_cluster.foo.name} --overwrite-existing"
environment = {
KUBECONFIG = ""
}
}
depends_on = [azurerm_kubernetes_cluster.foo]
}
resource "helm_release" "ingress" {
name = "ingress-foo"
repository = "https://kubernetes.github.io/ingress-nginx"
chart = "ingress-nginx"
timeout = 3000
depends_on = [null_resource.configure_kubectl]
}
The best way to debug this is to be able to kubectl into the AKS cluster. (AKS should have documentation on how to set up kubectl.)
Then, play around with kubectl get pods -A and see if anything jumps out as being wrong. Specifically, look for nginx-ingress pods that are not in a Running status.
If there are such pods, debug further with kubectl describe pod <pod_name> or kubectl logs -f <pod_name>, depending on whether the issue happens after the container has successfully started up or not.
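Roughly, the debugging loop looks like this (names in angle brackets are placeholders):
# List everything and look for pods stuck in Pending / CrashLoopBackOff / ImagePullBackOff
kubectl get pods -A
# Dig into a suspect pod
kubectl describe pod <pod_name> -n <namespace>
kubectl logs -f <pod_name> -n <namespace>
# Recent cluster events often show scheduling or LoadBalancer provisioning problems
kubectl get events -A --sort-by=.lastTimestamp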

SSH connection to Azure VM with Terraform

I have successfully created a VM as part of a resource group on Azure using Terraform. The next step is to SSH into the new machine and run a few commands. For that, I have created a provisioner as part of the VM resource and set up an SSH connection:
resource "azurerm_virtual_machine" "helloterraformvm" {
name = "terraformvm"
location = "West US"
resource_group_name = "${azurerm_resource_group.helloterraform.name}"
network_interface_ids = ["${azurerm_network_interface.helloterraformnic.id}"]
vm_size = "Standard_A0"
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "14.04.2-LTS"
version = "latest"
}
os_profile {
computer_name = "hostname"
user = "some_user"
password = "some_password"
}
os_profile_linux_config {
disable_password_authentication = false
}
provisioner "remote-exec" {
inline = [
"sudo apt-get install docker.io -y"
]
connection {
type = "ssh"
user = "some_user"
password = "some_password"
}
}
}
If I run "terraform apply", it seems to get into an infinite loop trying to ssh unsuccessfully, repeating this log over and over:
azurerm_virtual_machine.helloterraformvm (remote-exec): Connecting to remote host via SSH...
azurerm_virtual_machine.helloterraformvm (remote-exec): Host:
azurerm_virtual_machine.helloterraformvm (remote-exec): User: testadmin
azurerm_virtual_machine.helloterraformvm (remote-exec): Password: true
azurerm_virtual_machine.helloterraformvm (remote-exec): Private key: false
azurerm_virtual_machine.helloterraformvm (remote-exec): SSH Agent: true
I'm sure I'm doing something wrong, but I don't know what it is :(
EDIT:
I have tried setting up this machine without the provisioner, and I can SSH into it with no problems using the given username/password. However, I need to look up the host name in the Azure portal because I don't know how to retrieve it from Terraform. It is suspicious that the "Host:" line in the log is empty, so I wonder if that has anything to do with it?
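In the meantime, the address can be read straight from Azure with the CLI (resource names below are placeholders):
# Work-around: read the public IP / FQDN from Azure while Terraform can't supply it
az network public-ip show \
  --resource-group <resource-group> \
  --name <public-ip-name> \
  --query "{address:ipAddress, fqdn:dnsSettings.fqdn}" -o table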
UPDATE:
I've tried with different things like indicating the host name in the connection with
host = "${azurerm_public_ip.helloterraformip.id}"
and
host = "${azurerm_public_ip.helloterraformips.ip_address}"
as indicated in the docs, but with no success.
I've also tried using ssh-keys instead of password, but same result - infinite loop of connection tries, with no clear error message as of why it's not connecting.
I have managed to make this work. I changed several things:
Gave the host name to the connection block.
Configured the SSH keys properly - they need to be unencrypted.
Took the connection element out of the provisioner element.
Here's the full working Terraform file, with data like the SSH keys replaced:
# Configure Azure provider
provider "azurerm" {
subscription_id = "${var.azure_subscription_id}"
client_id = "${var.azure_client_id}"
client_secret = "${var.azure_client_secret}"
tenant_id = "${var.azure_tenant_id}"
}
# create a resource group if it doesn't exist
resource "azurerm_resource_group" "rg" {
name = "sometestrg"
location = "ukwest"
}
# create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "tfvnet"
address_space = ["10.0.0.0/16"]
location = "ukwest"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
# create subnet
resource "azurerm_subnet" "subnet" {
name = "tfsub"
resource_group_name = "${azurerm_resource_group.rg.name}"
virtual_network_name = "${azurerm_virtual_network.vnet.name}"
address_prefix = "10.0.2.0/24"
#network_security_group_id = "${azurerm_network_security_group.nsg.id}"
}
# create public IPs
resource "azurerm_public_ip" "ip" {
name = "tfip"
location = "ukwest"
resource_group_name = "${azurerm_resource_group.rg.name}"
public_ip_address_allocation = "dynamic"
domain_name_label = "sometestdn"
tags {
environment = "staging"
}
}
# create network interface
resource "azurerm_network_interface" "ni" {
name = "tfni"
location = "ukwest"
resource_group_name = "${azurerm_resource_group.rg.name}"
ip_configuration {
name = "ipconfiguration"
subnet_id = "${azurerm_subnet.subnet.id}"
private_ip_address_allocation = "static"
private_ip_address = "10.0.2.5"
public_ip_address_id = "${azurerm_public_ip.ip.id}"
}
}
# create storage account
resource "azurerm_storage_account" "storage" {
name = "someteststorage"
resource_group_name = "${azurerm_resource_group.rg.name}"
location = "ukwest"
account_type = "Standard_LRS"
tags {
environment = "staging"
}
}
# create storage container
resource "azurerm_storage_container" "storagecont" {
name = "vhd"
resource_group_name = "${azurerm_resource_group.rg.name}"
storage_account_name = "${azurerm_storage_account.storage.name}"
container_access_type = "private"
depends_on = ["azurerm_storage_account.storage"]
}
# create virtual machine
resource "azurerm_virtual_machine" "vm" {
name = "sometestvm"
location = "ukwest"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_interface_ids = ["${azurerm_network_interface.ni.id}"]
vm_size = "Standard_A0"
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "16.04-LTS"
version = "latest"
}
storage_os_disk {
name = "myosdisk"
vhd_uri = "${azurerm_storage_account.storage.primary_blob_endpoint}${azurerm_storage_container.storagecont.name}/myosdisk.vhd"
caching = "ReadWrite"
create_option = "FromImage"
}
os_profile {
computer_name = "testhost"
admin_username = "testuser"
admin_password = "Password123"
}
os_profile_linux_config {
disable_password_authentication = false
ssh_keys = [{
path = "/home/testuser/.ssh/authorized_keys"
key_data = "ssh-rsa xxx email#something.com"
}]
}
connection {
host = "sometestdn.ukwest.cloudapp.azure.com"
user = "testuser"
type = "ssh"
private_key = "${file("~/.ssh/id_rsa_unencrypted")}"
timeout = "1m"
agent = true
}
provisioner "remote-exec" {
inline = [
"sudo apt-get update",
"sudo apt-get install docker.io -y",
"git clone https://github.com/somepublicrepo.git",
"cd Docker-sample",
"sudo docker build -t mywebapp .",
"sudo docker run -d -p 5000:5000 mywebapp"
]
}
tags {
environment = "staging"
}
}
According to your description, Azure Custom Script Extension is an option for you.
The Custom Script Extension downloads and executes scripts on Azure
virtual machines. This extension is useful for post deployment
configuration, software installation, or any other configuration /
management task.
Remove the provisioner "remote-exec" block and use the following instead:
resource "azurerm_virtual_machine_extension" "helloterraformvm" {
name = "hostname"
location = "West US"
resource_group_name = "${azurerm_resource_group.helloterraformvm.name}"
virtual_machine_name = "${azurerm_virtual_machine.helloterraformvm.name}"
publisher = "Microsoft.OSTCExtensions"
type = "CustomScriptForLinux"
type_handler_version = "1.2"
settings = <<SETTINGS
{
"commandToExecute": "apt-get install docker.io -y"
}
SETTINGS
}
Note: the command is executed by the root user, so don't use sudo.
For more information, please refer to this link: azurerm_virtual_machine_extension.
For a list of possible extensions, you can use the Azure CLI command az vm extension image list -o table
Update: the above example only supports a single command. If you need multiple commands, for example to install Docker on your VM, you need:
apt-get update
apt-get install docker.io -y
Save it as a file named script.sh and upload it to an Azure Storage account or GitHub (the file should be public). Modify the Terraform file as below:
settings = <<SETTINGS
{
"fileUris": ["https://gist.githubusercontent.com/Walter-Shui/dedb53f71da126a179544c91d267cdce/raw/bb3e4d90e3291530570eca6f4ff7981fdcab695c/script.sh"],
"commandToExecute": "sh script.sh"
}
SETTINGS
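Once applied, the extension's provisioning state can be checked from the CLI (resource names are placeholders; the extension name hostname matches the resource above):
# Confirm the extension deployed and inspect its status
az vm extension list --resource-group <resource-group> --vm-name <vm-name> -o table
az vm extension show --resource-group <resource-group> --vm-name <vm-name> --name hostname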
