How to automate uploading AKS logs to an Azure storage account

I have a task to automate uploading AKS logs (control plane and workload) to an Azure storage account so that they can be viewed later, possibly with an alert notification to email or a Teams channel in case of any failure. It would have been an easy task if a Log Analytics workspace were in use, but to save cost we have kept it disabled.
I have tried the below CronJob, which should upload the pod logs to the storage account on a regular basis, but it is throwing the below errors[1]:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: log-uploader
spec:
  schedule: "0 0 * * *" # Run every day at midnight
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: log-uploader
              image: mcr.microsoft.com/azure-cli:latest
              command:
                - bash
                - "-c"
                - |
                  az aks install-cli
                  # Set environment variables for Azure Storage Account and Container
                  export AZURE_STORAGE_ACCOUNT=test-101
                  export AZURE_STORAGE_CONTAINER=logs-101
                  # Iterate over all pods in the cluster and upload their logs to Azure Blob Storage
                  for pod in $(kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.name} {.metadata.namespace}{"\n"}{end}'); do
                    namespace=$(echo $pod | awk '{print $2}')
                    pod_name=$(echo $pod | awk '{print $1}')
                    # Use the Kubernetes logs API to retrieve the logs for the pod
                    logs=$(kubectl logs -n $namespace $pod_name)
                    # Use the Azure CLI to upload the logs to Azure Blob Storage
                    echo $logs | az storage blob upload --file - --account-name $AZURE_STORAGE_ACCOUNT --container-name $AZURE_STORAGE_CONTAINER --name "$namespace/$pod_name_`date`.log"
                  done
          restartPolicy: OnFailure
Errors[1]
error: expected 'logs [-f] [-p] (POD | TYPE/NAME) [-c CONTAINER]'.
POD or TYPE/NAME is a required argument for the logs command
See 'kubectl logs -h' for help and examples
The same commands are running fine outside the container.
Any thoughts/suggestions would be highly appreciated.
Regards,
Piyush
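As a side note on the error itself: for pod in $(...) word-splits the jsonpath output on every space and newline, so the pod name and namespace arrive as two separate loop iterations; the namespace variable ends up empty and -n consumes the pod name as the namespace, which produces exactly the error above. (Also, $pod_name_ expands an undefined variable named pod_name_, and the default date output contains spaces, which makes poor blob names.) A while-read loop avoids the splitting; a minimal sketch, reusing the question's placeholder account and container names:
# Read namespace and name pairwise instead of word-splitting the whole output
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace} {.metadata.name}{"\n"}{end}' |
while read -r namespace pod_name; do
  # Write the logs to a temp file; az storage blob upload takes a file path
  kubectl logs -n "$namespace" "$pod_name" > /tmp/"$pod_name".log
  az storage blob upload \
    --file /tmp/"$pod_name".log \
    --account-name "$AZURE_STORAGE_ACCOUNT" \
    --container-name "$AZURE_STORAGE_CONTAINER" \
    --name "$namespace/${pod_name}_$(date +%Y-%m-%d).log"
done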

A better approach would be to deploy a fluentd daemonset in your cluster and use the Azure storage output plugin to upload logs to a storage account.
fluentd was built for exactly this kind of log shipping and will probably serve you better here.
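For example, fluentd can be installed via its official Helm chart and then pointed at the storage account through an Azure storage output plugin such as fluent-plugin-azurestorage; a sketch, assuming the official fluent Helm charts repository (the plugin configuration itself goes into the chart's values):
# Add the official fluent Helm repository and install fluentd
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
# The Azure storage output (e.g. fluent-plugin-azurestorage) is configured
# through the chart's values/fluentd config; the storage account name and
# key would typically come from a Kubernetes secret
helm install fluentd fluent/fluentd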

So, I have found a better approach to automate the export of AKS logs to the Azure storage account. I used a tool called Vector (by Datadog). It is much easier to implement and more lightweight than fluentd.
Vector not only exports the data in near real-time, but also lets you apply a lot of transformations to the data before it is actually shipped to the destination.
I have created an end-to-end video tutorial implementing this:
Link to the video
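For reference, a minimal Vector configuration for this pipeline could look like the sketch below. The kubernetes_logs source, remap transform, and azure_blob sink are documented Vector components, but the container name and connection string are placeholders, and the exact field names should be verified against your Vector version:
# vector.yaml - collect pod logs and ship them to Azure Blob Storage
sources:
  k8s_logs:
    type: kubernetes_logs

transforms:
  cleaned:
    type: remap
    inputs: [k8s_logs]
    # Drop noisy metadata before shipping (illustrative transformation)
    source: |
      del(.kubernetes.pod_labels)

sinks:
  blob_storage:
    type: azure_blob
    inputs: [cleaned]
    connection_string: "${AZURE_STORAGE_CONNECTION_STRING}" # placeholder
    container_name: logs-101
    blob_prefix: "aks/"
    encoding:
      codec: json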

Azure CLI - Automation account replace-content is breaking on new line

I've got a very frustrating situation using the Azure CLI while attempting to replace the content of an Automation Account runbook.
I am attempting to update it via an Azure DevOps pipeline AzureCLI@2 task. This is the script line I am calling:
az automation runbook replace-content --debug --automation-account-name "${{ parameters.automationAccountName }}" --resource-group "${{ parameters.resourceGroup }}" --name "Schedule Summary" --content "`@temp.txt"
The issue I am having is that the Automation Account runbook is updated, but the text is truncated. The contents of temp.txt are:
Param(
[string]$resourceGroup ='',
[string]$UAMI ='',
[string]$serverInstance ='',
But the script that ends up in the runbook is simply
Param(
It's clearly breaking on CRLF, but I can't figure out how to fix it. If I remove all CRLF then it appears as one line and the script doesn't run.
How can I tell where the problem is? Is it the Azure CLI, PowerShell, or the DevOps task?
I tried this in my environment by adding the DevOps CLI extension in Azure bash, and it worked for me successfully with the same parameters as yours.
I created a PS file in the Azure cloud shell itself, saved it with the .ps1 extension, set the runbook type to PowerShell, and updated the script as follows:
az automation runbook replace-content --content "@runbook.ps1" --automation-account-name "xxxxautomation" --name "xxxxrunbook" --resource-group "xxxxRG"
Viewing runbook.ps1 and the runbook afterwards confirmed the content replacement was done successfully.
If the issue still persists: in Azure DevOps, call a webhook with parameters and then start a runbook that imports the Azure DevOps runbooks.
But when you are dealing with Azure DevOps, I suggest you create or update runbooks via the REST API instead of the PowerShell modules, which is more efficient.
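If you go the API route, the runbook's draft content can be replaced with a raw REST call via az rest. A sketch; the endpoint and api-version are from the Automation "Runbook Draft - Replace Content" REST operation as I recall it, so verify them against the current reference before relying on this:
# Replace the draft content of a runbook via the Automation REST API (sketch)
az rest --method put \
  --url "https://management.azure.com/subscriptions/<subId>/resourceGroups/<rg>/providers/Microsoft.Automation/automationAccounts/<account>/runbooks/<runbook>/draft/content?api-version=2019-06-01" \
  --headers "Content-Type=text/powershell" \
  --body @temp.txt
# Afterwards the draft still needs to be published (az automation runbook publish)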
I ran the script on a Windows-hosted agent and reproduced your issue. It's clearly breaking on CRLF, because the Windows agent mishandles the line endings. You should run the script on a Linux agent instead.
Switch to a Linux agent in the pipeline:
pool:
  vmImage: ubuntu-22.04
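A fuller illustration of the step on a Linux agent might look like the following. The service connection name is a placeholder; the inputs are the standard AzureCLI@2 ones:
pool:
  vmImage: ubuntu-22.04
steps:
  - task: AzureCLI@2
    inputs:
      azureSubscription: my-service-connection # placeholder service connection
      scriptType: bash
      scriptLocation: inlineScript
      inlineScript: |
        # In bash the @file reference needs no backtick escaping
        az automation runbook replace-content \
          --automation-account-name "${{ parameters.automationAccountName }}" \
          --resource-group "${{ parameters.resourceGroup }}" \
          --name "Schedule Summary" \
          --content @temp.txt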

How to get the full yaml file from a running Azure AKS cluster

I tried to pull the yaml from my running AKS cluster. Which kubectl options can I run in order to pull the full yaml file from a running Azure AKS cluster? I need it for only one AKS name.
By "full yaml file" I mean the same as if I go to the Azure portal, click into my AKS cluster, and on the left pane click "Export template", but that is in json; I need the same in yaml.
In the following answer I have referenced Azure CLI commands. You can find installation instructions here.
If you want the managedCluster object of an AKS cluster in yaml format please run:
az aks show -g $ResourceGroupName -n $AKSClusterName -o yaml
If you want specific Kubernetes resource(s) in a yaml format
First run
az aks get-credentials -g $ResourceGroupName -n $AKSClusterName
to get the access credentials for the AKS cluster and merge them into the kubeconfig file.
Now you can run:
kubectl get $resource-type $resource-name -n $namespace -o yaml
Please replace $resource-type with the desired Kubernetes resource type (e.g. pod, node, deployment, service, replicaset, ingress, etc.) and $resource-name with the corresponding resource name. If you want to get a list of all resources of $resource-type, you can omit $resource-name. If you want to list resources of $resource-type across all namespaces, replace -n $namespace with --all-namespaces.
For example, if you want to get the list of all pods in the namespace development in yaml format then, you should run:
kubectl get pods -n development -o yaml
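And if the goal is one file that approximates the whole cluster state, you can dump everything kubectl groups under the "all" category (note that "all" covers only the common workload types, not CRDs or RBAC objects):
# Dump common workload resources from all namespaces into a single yaml file
kubectl get all --all-namespaces -o yaml > aks-cluster.yaml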
References:
https://learn.microsoft.com/en-us/cli/azure/aks?view=azure-cli-latest
https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands

az storage container list doesn't work, referencing deleted storage

I am following this tutorial, running the az CLI (v2.11) locally on macOS:
https://learn.microsoft.com/en-us/learn/modules/provision-infrastructure-azure-pipelines/6-run-terraform-remote-storage
After following a few steps, including this one:
az storage account create --name tfsa$UNIQUE_ID --resource-group tf-storage-rg --sku Standard_LRS
and then running this command:
az storage container list --query "[].{name:name}" --output tsv
I receive the following:
HTTPSConnectionPool(host='mystorageaccount20822.blob.core.windows.net', port=443): Max retries exceeded with url: /?comp=list&maxresults=5000 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10d2566a0>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))
The above command works in cloud shell, but fails in my local shell (running v2.20, up to date).
In cloud shell I do get this warning, though:
There are no credentials provided in your command and environment, we will query for the account key inside your storage account. Please provide --connection-string, --account-key or --sas-token as credentials, or use --auth-mode login if you have required RBAC roles in your command. For more information about RBAC roles in storage, visit https://learn.microsoft.com/en-us/azure/storage/common/storage-auth-aad-rbac-cli.
I had previously created mystorageaccount20822 a couple of weeks ago but deleted it... Is my az CLI still bound to this previous account? Is there a way to tell the az CLI (on Mac) to sync up with the current resources I have running? In the Azure portal, mystorageaccount20822 does NOT exist.
Does the Azure CLI cache some values somewhere? Is there some hidden config file that has the old 'mystorageaccount20822' set, which the CLI is trying to reference each time instead of the new account named tfsa$UNIQUE_ID?
After running the command with debug:
az storage container list --debug --account-name tfsa$UNIQUE_ID --query [].name --output tsv
I was able to see where the account name was coming from.
It turns out I had set the environment variable AZURE_STORAGE_CONNECTION_STRING during a tutorial a few days ago, and it was overriding the account passed on the command line with the old example's value. After unsetting that environment variable, the command worked.
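To check for the same problem in your own shell, you can look for stale storage-related environment variables and unset them (a sketch; --auth-mode login assumes you hold the RBAC role mentioned in the warning above):
# List any storage-related environment variables the CLI might pick up
env | grep -i AZURE_STORAGE
# Remove the stale connection string so the CLI stops using the old account
unset AZURE_STORAGE_CONNECTION_STRING
# Retry against the intended account, authenticating with your Azure AD login
az storage container list --account-name "tfsa$UNIQUE_ID" --auth-mode login --query "[].name" --output tsv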

The Resource 'Microsoft.Sql/servers/server/databases/ABC' under resource group 'xyz' was not found

While copying a production database locally with the az CLI, it was copied successfully. But when I integrate it into Azure DevOps I get the error: The Resource 'Microsoft.Sql/servers/mi-tools/databases/ABC' under resource group 'xyz' was not found.
Here is the code that I need to execute in my Pipeline.
az sql db copy --subscription $(SubscriptionName) --dest-server $(ServerName) --name $(ProductionDatabaseName) --dest-name $(CopyDatabaseName) --resource-group $(ResourceGroupName) --server $(ServerName) -f Gen5 -c 4 --compute-model Serverless
Deleting a database through the Azure DevOps pipeline takes some time. My next line copied the database, so it executed immediately, while the old database was not yet completely deleted.
You have to do it in two steps in Azure DevOps.
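One way to implement the two steps is to poll until the old copy is really gone before starting the new copy. A sketch using the question's pipeline variables; az sql db show exits non-zero once the database no longer exists:
# Step 1: delete the old copy and wait until it is fully gone
az sql db delete --name "$(CopyDatabaseName)" --server "$(ServerName)" \
  --resource-group "$(ResourceGroupName)" --yes
while az sql db show --name "$(CopyDatabaseName)" --server "$(ServerName)" \
  --resource-group "$(ResourceGroupName)" > /dev/null 2>&1; do
  echo "Waiting for delete to complete..."
  sleep 15
done
# Step 2: now the copy can run without colliding with the old database
az sql db copy --subscription "$(SubscriptionName)" --name "$(ProductionDatabaseName)" \
  --server "$(ServerName)" --resource-group "$(ResourceGroupName)" \
  --dest-server "$(ServerName)" --dest-name "$(CopyDatabaseName)" \
  -f Gen5 -c 4 --compute-model Serverless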

Create Azure Service Principal for Service App using the CLI

I'm trying to create an Azure DevOps service endpoint to connect to Azure Resource Manager and deploy my app into an App Service.
When I go to Azure DevOps > Project Properties and create a Service Endpoint using the UI (automated dialog) it works fine and my app can be deployed to the App Service from a yaml pipeline. BUT, when I try to replicate it through the Azure CLI it doesn't work (the build fails to deploy, complaining about the Service Principal).
This is my code:
az_account=$(az account show)
az_subscription_id=$(echo $az_account |jq -r '.id')
az_subscription_name=$(echo $az_account |jq -r '.name')
az_tenant_id=$(echo $az_account |jq -r '.tenantId')
az_service_principal=$(az ad sp create-for-rbac -n "my-app-service-principal")
az_service_principal_password=$(echo $az_service_principal|jq -r '.password')
az_service_principal_id=$(az ad sp list --all | jq -c '.[] | select( .appDisplayName | contains("my-app-service-principal"))'| jq -r '.objectId')
export AZURE_DEVOPS_EXT_AZURE_RM_SERVICE_PRINCIPAL_KEY=$az_service_principal_password
az devops service-endpoint azurerm create --azure-rm-service-principal-id $az_service_principal_id --azure-rm-subscription-id $az_subscription_id --azure-rm-subscription-name $az_subscription_name --azure-rm-tenant-id $az_tenant_id --name my-app-service-endpoint
How should I create this Service Endpoint programmatically with the Azure CLI?
Your script simply creates the Service Principal, but it does not grant any permissions to the SP.
I would add some lines like these to create a Resource Group and scope permissions to it:
az_service_principal_appid=$(echo $az_service_principal | jq -r '.appId')
az group create --name myrg --location westeurope
az role assignment create --role Contributor --assignee $az_service_principal_appid --resource-group myrg
Clearly you need to think about how to arrange your resources and SPs: you may need many of both depending on your architecture.
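As an aside, az ad sp create-for-rbac also lets you collapse the creation and the role assignment into one command by scoping the SP at creation time. A sketch, reusing the resource group and SP names above and the az_subscription_id variable from the question's own script:
# Create the resource group first so it can be used as the scope
az group create --name myrg --location westeurope
# Create the SP with the Contributor role already scoped to that resource group
az_service_principal=$(az ad sp create-for-rbac \
  --name "my-app-service-principal" \
  --role Contributor \
  --scopes "/subscriptions/$az_subscription_id/resourceGroups/myrg")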
