FluXCD Helm deployment from Azure ACR - no chart name found error - azure

I am attempting to deploy a Helm chart to AKS using FluxCD. The chart has been pushed to Azure ACR using the Helm cli - "helm push ...". The chart is declared in the ACR as helm/release-services:0.1.0
I am receiving the following error after a Flux reconcile:
'chart pull error: failed to get chart version for remote reference:
no chart name found'
with helm-controller logs as follows
{"level":"info","ts":"2022-02-07T12:40:18.121Z","logger":"controller.helmrelease","msg":"HelmChart 'flux-system/release-services-test-release-services' is not ready","reconciler group":"helm.toolkit.fluxcd.io","reconciler kind":"HelmRelease","name":"release-services","namespace":"release-services-test"}
{"level":"info","ts":"2022-02-07T12:40:18.135Z","logger":"controller.helmrelease","msg":"reconcilation finished in 15.458307ms, next run in 5m0s","reconciler group":"helm.toolkit.fluxcd.io","reconciler kind":"HelmRelease","name":"release-services","namespace":"release-services-test"}
Below is the HelmChart resource in AKS:
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmChart
metadata:
creationTimestamp: "2022-02-07T07:30:16Z"
finalizers:
- finalizers.fluxcd.io
generation: 1
name: release-services-test-release-services
namespace: flux-system
resourceVersion: "105266699"
selfLink: /apis/source.toolkit.fluxcd.io/v1beta1/namespaces/flux-system/helmcharts/release-services-test-release-services
uid: e4820a70-8885-44a1-8dfd-0e2bf7256915
spec:
chart: release-services
interval: 5m0s
reconcileStrategy: ChartVersion
sourceRef:
kind: HelmRepository
name: psbombb-helm-acr-dev
version: '>=0.1.0'
status:
conditions:
- lastTransitionTime: "2022-02-07T11:02:49Z"
message: 'chart pull error: failed to get chart version for remote reference:
no chart name found'
reason: ChartPullFailed
status: "False"
type: Ready
observedGeneration: 1
and the HelmRelease is as follows
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
creationTimestamp: "2022-02-07T04:34:14Z"
finalizers:
- finalizers.fluxcd.io
generation: 9
labels:
kustomize.toolkit.fluxcd.io/name: apps
kustomize.toolkit.fluxcd.io/namespace: flux-system
name: release-services
namespace: release-services-test
resourceVersion: "105341484"
selfLink: /apis/helm.toolkit.fluxcd.io/v2beta1/namespaces/release-services-test/helmreleases/release-services
uid: 6a6e5f5c-951d-4655-9c15-fa9fe7421a04
spec:
chart:
spec:
chart: release-services
reconcileStrategy: ChartVersion
sourceRef:
kind: HelmRepository
name: psbombb-helm-acr-dev
namespace: flux-system
version: '>=0.1.0'
install:
remediation:
retries: 3
interval: 5m
releaseName: release-services
timeout: 12m
values:
image:
name: release-services
pullPolicy: IfNotPresent
registry: <repository>.azurecr.io
repository: <repository>.azurecr.io/helm/release-services
tag: 0.1.0
postgres:
secret:
create: false
existingName: release-services-secrets
status:
conditions:
- lastTransitionTime: "2022-02-07T08:27:13Z"
message: HelmChart 'flux-system/release-services-test-release-services' is not
ready
reason: ArtifactFailed
status: "False"
type: Ready
failures: 50
helmChart: flux-system/release-services-test-release-services
observedGeneration: 9
Is there anything I am missing that anyone can spot for me please?
Thank you kindly

I think your issue is that the Azure Container Registry stores Helm Charts as OCI Artifacts.
The Flux source controller will pull the index.yaml from a HTTP Helm Chart repo to look for tags and this is not working with an OCI registry.
Here is the GitHub issue for this were you can see that the Flux guys will work on this as of now the OCI Feature is stable with Helm 3.8.0.

Related

Event Hub triggered Azure Function running on AKS with KEDA does not scale out

I have deployed an Event Hub triggered Azure Function written in Java on AKS. The function should scale out using KEDA.
The function is correctly triggerd and working but it's not scaling out when the load increases. I have added sleep calls to the function implementation to make sure it's not burning through the events too fast and should be forced to scale out but this did not show any change as well.
kubectl get hpa shows the following output
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
keda-hpa-eventlogger Deployment/eventlogger 64/64 (avg) 1 20 1 3m41s
This seems to be a first indicator that something is not correct as i assume the first number in the targets column is the number of unprocessed events in event hub. This stays the same no matter how many events i pump into the hub.
The Function was deployed using the following Kubernetes Deployment Manifest
data:
AzureWebJobsStorage: <removed>
FUNCTIONS_WORKER_RUNTIME: amF2YQ==
EventHubConnectionString: <removed>
apiVersion: v1
kind: Secret
metadata:
name: eventlogger
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: eventlogger
labels:
app: eventlogger
spec:
selector:
matchLabels:
app: eventlogger
template:
metadata:
labels:
app: eventlogger
spec:
containers:
- name: eventlogger
image: <removed>
env:
- name: AzureFunctionsJobHost__functions__0
value: eventloggerHandler
envFrom:
- secretRef:
name: eventlogger
readinessProbe:
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 240
httpGet:
path: /
port: 80
scheme: HTTP
startupProbe:
failureThreshold: 3
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 240
httpGet:
path: /
port: 80
scheme: HTTP
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: eventlogger
labels:
app: eventlogger
spec:
scaleTargetRef:
name: eventlogger
pollingInterval: 5
cooldownPeriod: 5
minReplicaCount: 0
maxReplicaCount: 20
triggers:
- type: azure-eventhub
metadata:
storageConnectionFromEnv: AzureWebJobsStorage
connectionFromEnv: EventHubConnectionString
---
The Connection String of Event Hub contains the "EntityPath=" Section as described in the KEDA Event Hub Scaler Documentation and has Manage-Permissions on the Event Hub Namespace.
The output of kubectl describe ScaledObject is
Name: eventlogger
Namespace: default
Labels: app=eventlogger
scaledobject.keda.sh/name=eventlogger
Annotations: <none>
API Version: keda.sh/v1alpha1
Kind: ScaledObject
Metadata:
Creation Timestamp: 2022-04-17T10:30:36Z
Finalizers:
finalizer.keda.sh
Generation: 1
Managed Fields:
API Version: keda.sh/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:annotations:
.:
f:kubectl.kubernetes.io/last-applied-configuration:
f:labels:
.:
f:app:
f:spec:
.:
f:cooldownPeriod:
f:maxReplicaCount:
f:minReplicaCount:
f:pollingInterval:
f:scaleTargetRef:
.:
f:name:
f:triggers:
Manager: kubectl-client-side-apply
Operation: Update
Time: 2022-04-17T10:30:36Z
API Version: keda.sh/v1alpha1
Fields Type: FieldsV1
fieldsV1:
f:metadata:
f:finalizers:
.:
v:"finalizer.keda.sh":
f:labels:
f:scaledobject.keda.sh/name:
f:status:
.:
f:conditions:
f:externalMetricNames:
f:lastActiveTime:
f:originalReplicaCount:
f:scaleTargetGVKR:
.:
f:group:
f:kind:
f:resource:
f:version:
f:scaleTargetKind:
Manager: keda
Operation: Update
Time: 2022-04-17T10:30:37Z
Resource Version: 1775052
UID: 3b6a68c1-c3b9-4cdf-b5d5-41a9721ac661
Spec:
Cooldown Period: 5
Max Replica Count: 20
Min Replica Count: 0
Polling Interval: 5
Scale Target Ref:
Name: eventlogger
Triggers:
Metadata:
Connection From Env: EventHubConnectionString
Storage Connection From Env: AzureWebJobsStorage
Type: azure-eventhub
Status:
Conditions:
Message: ScaledObject is defined correctly and is ready for scaling
Reason: ScaledObjectReady
Status: False
Type: Ready
Message: Scaling is performed because triggers are active
Reason: ScalerActive
Status: True
Type: Active
Status: Unknown
Type: Fallback
External Metric Names:
s0-azure-eventhub-$Default
Last Active Time: 2022-04-17T10:30:47Z
Original Replica Count: 1
Scale Target GVKR:
Group: apps
Kind: Deployment
Resource: deployments
Version: v1
Scale Target Kind: apps/v1.Deployment
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal KEDAScalersStarted 10s keda-operator Started scalers watch
Normal ScaledObjectReady 10s keda-operator ScaledObject is ready for scaling
So i'm a bit stucked as i don't see any errors but it's still not behaving as expected.
Versions:
Kubernetes version: 1.21.9
KEDA Version: 2.6.1 installed using kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.6.1/keda-2.6.1.yaml
Azure Functions using Java 11 and extensionBundle in host.json is configured using version [2.8.4, 3.0.0)
Was able to find a solution to the problem.
Event Hub triggered Azure Functions deployed on AKS show the same scaling characteristics as Azure Functions on App Service show:
You only get one consumer per partition to allow for ordering per partition.
This characteristic overrules maxReplicaCount in the Kubernetes Deployment Manifest.
So to solve my own issue: By increasing the partitions for Event Hub i get a pod per partition and KEDA scales the workload as expected.

KEDAScalerFailed : no azure identity found for request clientID

I tried various different methods but not able to access the Azure Storage Queues via PodIdentity. The resource group, client ID already exists.
The steps:-
kubectl create namespace keda
helm install keda kedacore/keda --set podIdentity.activeDirectory.identity= --namespace keda
kubectl create namespace myapp
The first few sections of myapp.yaml :
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
name: <idvalue>
namespace: myapp
spec:
clientID: "<clientId>"
resourceID: "<resourceId>"
type: 0
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentityBinding
metadata:
name: <idvalue>-binding
namespace: myapp
spec:
azureIdentity: <idvalue>
selector: <idvalue> #keeping same as identity
---
The rest of the file is the deployment section, so not pasting here.
Then ran the Helm to deploy the myapp.yaml via myappInt.values.yaml file ->
helm install -f C:\MyApp\myappInt.values.yaml (this file contains the clustername, role etc.)
myappInt.values.yaml file:-
image:
registry: <registryname>
deployment:
environment: INT
clusterName: <clustername>
clusterRole: <clusterrole>
region: <region>
processingRegion: <processingregion>
azureIdentityClientId: "<clientId>"
azureIdentityResourceId: "<resourceId>"
Then the scaler ->
kubectl apply -f c:\MyApp\kedascaling.yaml --namespace myapp
The kedascaling.yaml:-
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
name: keda-pod-identity-auth
spec:
podIdentity:
provider: azure
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: myapp-scaledobject
namespace: myapp
spec:
scaleTargetRef:
name: myapp # Corresponds with Deployment Name
minReplicaCount: 2
maxReplicaCount: 3
triggers:
- type: azure-queue
metadata:
queueName: myappqueue # Required
accountName: myappstorage # Required when pod identity is used
queueLength: "1" # Required
authenticationRef:
name: keda-pod-identity-auth # AuthenticationRef would need pod identity
Finally it gives the error below:-
kind: Event
apiVersion: v1
metadata:
name: myapp-scaledobject.16def024b939fdf2
namespace: myappnamespace
uid: someuid
resourceVersion: '186302648'
creationTimestamp: '2022-03-23T06:55:54Z'
managedFields:
- manager: keda
operation: Update
apiVersion: v1
time: '2022-03-23T06:55:54Z'
fieldsType: FieldsV1
fieldsV1:
f:count: {}
f:firstTimestamp: {}
f:involvedObject:
f:apiVersion: {}
f:kind: {}
f:name: {}
f:namespace: {}
f:resourceVersion: {}
f:uid: {}
f:lastTimestamp: {}
f:message: {}
f:reason: {}
f:source:
f:component: {}
f:type: {}
involvedObject:
kind: ScaledObject
namespace: myapp
name: myapp-scaledobject
uid: <some id>
apiVersion: keda.sh/v1alpha1
resourceVersion: '<some version>'
**reason: KEDAScalerFailed
message: |
no azure identity found for request clientID**
source:
component: keda-operator
firstTimestamp: '2022-03-23T06:55:54Z'
lastTimestamp: '2022-03-23T07:30:54Z'
count: 71
type: Warning
eventTime: null
reportingComponent: ''
reportingInstance: ''
Any idea what I am doing wrong here? Any help would be greatly appreciated. Asked at Keda repo but no response.
I had a similar error recently... I needed to make sure that the AAD Pod Identity was in the same namespace as the KEDA operator service.
Whatever identity you assigned to KEDA when creating KEDA with HELM, ensure that it's within the same namespace (which in your instance is "keda").
For example after running:
helm install keda kedacore/keda --set podIdentity.activeDirectory.identity=my-keda-identity --namespace keda
if my-keda-identity is not in namespace "keda" then the KEDA operator will not be able to bind AAD because it can't find it. If you need to update the AAD reference you can simply run:
helm upgrade keda kedacore/keda --set podIdentity.activeDirectory.identity=my-second-app-reference --namespace keda
Next, recreate the KEDA operator pod (I like to do this to test things out in a clean manor) and then run the following command to see if binding worked:
kubectl logs -n keda <keda-operator-pod-name> -c keda-operator
You should see the error go away (as long as the identity has access to retrieve queue messages from Azure Storage via RBAC)

Prometheus Slack alerts unknown field "slack_configs" in com.coreos.monitoring.v1alpha1.AlertmanagerConfig.spec.receivers

When configuring Slack Alerts in AlertmanagerConfig, I am getting following error (when releasing helm chart on Kubernetes cluster)
Error: UPGRADE FAILED: error validating "": error validating data:
ValidationError(AlertmanagerConfig.spec.receivers[0]): unknown field
"slack_configs" in
com.coreos.monitoring.v1alpha1.AlertmanagerConfig.spec.receivers
My alertmanagerconfig.yaml file looks as follows:
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
name: {{ template "theresa.fullname" . }}-alertmanager-config
labels:
alertmanagerConfig: email-notifications
spec:
route:
receiver: 'slack-email'
receivers:
- name: 'slack-email'
slack_configs:
- channel: '#cmr-orange-alerts'
api_url: ..
send_resolved: true
icon_url: ..
title: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}"
text: ..
You are trying to create a k8s resource of kind: AlertmanagerConfig but use the syntax of Prometheus config file, not a resource's syntax.
Check the syntax here:
https://docs.openshift.com/container-platform/4.7/rest_api/monitoring_apis/alertmanagerconfig-monitoring-coreos-com-v1alpha1.html

Nextcloud with Replicas on Azure Kubernetes - Failing to Mount Azure Files ReadWriteMany Volume

I'm trying to deploy Nextcloud w/HPA (replicas - horizontal scaling) on Azure Kubernetes with the official Nextcloud Helm chart and a ReadWriteMany volume created following these official instructions, but the volume never mounts, get this (or some version thereof) error:
kind: Event
apiVersion: v1
metadata:
name: nextcloud-6bc9b947bf-z6rlh.16bf7711bc2827a5
namespace: nextcloud
uid: c3c5619b-19da-4070-afbb-24bce111ddbe
resourceVersion: '55858'
creationTimestamp: '2021-12-10T18:08:27Z'
managedFields:
- manager: kubelet
operation: Update
apiVersion: v1
time: '2021-12-10T18:08:27Z'
fieldsType: FieldsV1
fieldsV1:
f:count: {}
f:firstTimestamp: {}
f:involvedObject: {}
f:lastTimestamp: {}
f:message: {}
f:reason: {}
f:source:
f:component: {}
f:host: {}
f:type: {}
involvedObject:
kind: Pod
namespace: nextcloud
name: nextcloud-6bc9b947bf-z6rlh
uid: 6106d13f-7033-4a4e-a6e9-a8e3947c52a4
apiVersion: v1
resourceVersion: '55764'
reason: FailedMount
message: >
MountVolume.MountDevice failed for volume "nextcloud-rwx" : rpc error: code =
Internal desc = volume(#azure-secret#aksshare#) mount
"//nextcloudcluster.file.core.windows.net/aksshare" on
"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/nextcloud-rwx/globalmount"
failed with mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t cifs -o
dir_mode=0777,file_mode=0777,gid=33,mfsymlinks,actimeo=30,<masked>
//nextcloudcluster.file.core.windows.net/aksshare
/var/lib/kubelet/plugins/kubernetes.io/csi/pv/nextcloud-rwx/globalmount
Output: mount error(13): Permission denied
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log
messages (dmesg)
source:
component: kubelet
host: aks-agentpool-16596208-vmss000002
firstTimestamp: '2021-12-10T18:08:27Z'
lastTimestamp: '2021-12-10T18:08:35Z'
count: 5
type: Warning
eventTime: null
reportingComponent: ''
reportingInstance: ''
Here is my PersistentVolume yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
name: nextcloud-rwx
namespace: nextcloud
spec:
capacity:
storage: 32Gi
accessModes:
- ReadWriteMany
azureFile:
secretName: azure-secret
shareName: aksshare
readOnly: false
mountOptions:
- dir_mode=0777
- file_mode=0777
- gid=33
- mfsymlinks
PersistentVolumeClaim yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nextcloud-rwx
namespace: nextcloud
spec:
accessModes:
- ReadWriteMany
storageClassName: ""
resources:
requests:
storage: 32Gi
I've also tried changing uid and gid to 0, 1000, etc, and get an even more egregious permission denied message because it doesn't "match the fsgroup(33)" (hence why I tried with gid=33).
Any ideas would be greatly appreciated! Thank you for your time.

Kubernetes CronJob Failure to start a job "Timeout" and "Job already exists"

I am trying to run a cronjob in kubernetes, but keep to having these two errors:
type: 'Warning' reason: 'FailedCreate' Error creating job: jobs.batch "dev-cron-1516702680" already exists
and
type: 'Warning' reason: 'FailedCreate' Error creating job: Timeout: request did not complete within allowed duration
Below are my cronjob yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
creationTimestamp: 2018-01-23T09:45:10Z
name: dev-cron
namespace: dev
resourceVersion: "16768201"
selfLink: /apis/batch/v1beta1/namespaces/dev/cronjobs/dev-cron
uid: 1a32eb94-0022-11e8-9256-065eb556d6a2
spec:
concurrencyPolicy: Allow
failedJobsHistoryLimit: 1
jobTemplate:
metadata:
creationTimestamp: null
spec:
template:
metadata:
creationTimestamp: null
spec:
containers:
- args:
- for country in th;
- do
- 'curl -X POST -d "{'footprint':'xxxx-xxxx'}"-H "Content-Type: application/json" https://dev.xxx.com/xxx/xxx'
- done
image: appropriate/curl:latest
imagePullPolicy: Always
name: cron
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Never
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
schedule: '* * * * *'
startingDeadlineSeconds: 10
successfulJobsHistoryLimit: 3
suspend: false
status: {}
I am not sure why this is keep happening. I am running Kubernetes version 1.9.1, in AWS cluster. Any idea why?
It turned out to happen because there is an auto injector by Istio initializer. Once I disabled Istio initializer injection for cronjobs, it works fine.

Resources