Taking a thread dump / heap dump of Azure Kubernetes pods

We are running our Kafka Streams application, written in Java, on Azure Kubernetes Service. We are new to Kubernetes. To debug an issue, we want to take a thread dump of the running pod.
Below are the steps we are following to take the dump.
We build our application with the Dockerfile below:
FROM mcr.microsoft.com/java/jdk:11-zulu-alpine
RUN apk update && apk add --no-cache gcompat
RUN addgroup -S user1 && adduser -S user1 -G user1
USER user1
WORKDIR .
COPY target/my-application-1.0.0.0.jar .
We deploy the image with the deployment YAML file below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-application-v1.0.0.0
spec:
  replicas: 1
  selector:
    matchLabels:
      name: my-application-pod
      app: my-application-app
  template:
    metadata:
      name: my-application-pod
      labels:
        name: my-application-pod
        app: my-application-app
    spec:
      nodeSelector:
        agentpool: agentpool1
      containers:
      - name: my-application-0
        image: myregistry.azurecr.io/my-application:v1.0.0.0
        imagePullPolicy: Always
        command: ["java","-jar","my-application-1.0.0.0.jar","input1","$(connection_string)"]
        env:
        - name: connection_string
          valueFrom:
            configMapKeyRef:
              name: my-application-configmap
              key: connectionString
        resources:
          limits:
            cpu: "4"
          requests:
            cpu: "0.5"
To get a shell to a running container, you can run the command below:
kubectl exec -it <POD_NAME> -- sh
To get the thread dump, we run the command below:
jstack PID > threadDump.tdump
but we get a permission denied error.
Can someone suggest how to solve this, or the steps to take thread/heap dumps?
Thanks in advance

Since you likely need the thread dump locally, you can bypass creating the file in the pod and just stream it directly to a file on your local computer:
kubectl exec -i POD_NAME -- jstack 1 > threadDump.tdump
If your thread dumps are large you may want to consider piping to pv first to get a nice progress bar.
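If you also need a heap dump, a minimal sketch is to write it to a temporary file inside the pod and copy it out (assuming the JVM is PID 1 and jmap is available on the image; jcmd works similarly if present):
kubectl exec -it POD_NAME -- jmap -dump:live,format=b,file=/tmp/heap.hprof 1
kubectl cp POD_NAME:/tmp/heap.hprof ./heap.hprof
The file can then be opened locally in a tool such as Eclipse MAT or VisualVM.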

Related

How to change the file-system watcher limit in Kubernetes (fs.inotify.max_user_watches)

I'm using pm2 to watch the directory holding the source code for my app-server's NodeJS program, running within a Kubernetes cluster.
However, I am getting this error:
ENOSPC: System limit for number of file watchers reached
I searched on that error, and found this answer: https://stackoverflow.com/a/55763478
# insert the new value into the system config
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
However, I tried running that in a pod on the target k8s node, and it says the command sudo was not found. If I remove the sudo, I get this error:
sysctl: setting key "fs.inotify.max_user_watches": Read-only file system
How can I modify the file-system watcher limit from the 8192 found on my Kubernetes node, to a higher value such as 524288?
I found a solution: use a privileged DaemonSet that runs on each node in the cluster and has the ability to modify the fs.inotify.max_user_watches kernel parameter.
Add the following to a node-setup-daemon-set.yaml file and apply it to your Kubernetes cluster:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-setup
  namespace: kube-system
  labels:
    k8s-app: node-setup
spec:
  selector:
    matchLabels:
      name: node-setup
  template:
    metadata:
      labels:
        name: node-setup
    spec:
      containers:
      - name: node-setup
        image: ubuntu
        command: ["/bin/sh","-c"]
        args: ["/script/node-setup.sh; while true; do echo Sleeping && sleep 3600; done"]
        env:
        - name: PARTITION_NUMBER
          valueFrom:
            configMapKeyRef:
              name: node-setup-config
              key: partition_number
        volumeMounts:
        - name: node-setup-script
          mountPath: /script
        - name: dev
          mountPath: /dev
        - name: etc-lvm
          mountPath: /etc/lvm
        securityContext:
          allowPrivilegeEscalation: true
          privileged: true
      volumes:
      - name: node-setup-script
        configMap:
          name: node-setup-script
          defaultMode: 0755
      - name: dev
        hostPath:
          path: /dev
      - name: etc-lvm
        hostPath:
          path: /etc/lvm
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-setup-config
  namespace: kube-system
data:
  partition_number: "3"
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-setup-script
  namespace: kube-system
data:
  node-setup.sh: |
    #!/bin/bash
    set -e
    # change the file-watcher max-count on each node to 524288
    # insert the new value into the system config
    sysctl -w fs.inotify.max_user_watches=524288
    # check that the new value was applied
    cat /proc/sys/fs/inotify/max_user_watches
Note: The file above could probably be simplified quite a bit. (I was basing it on this guide, and left in a lot of stuff that's probably not necessary for simply running the sysctl command.) If others succeed in trimming it further, while confirming that it still works, feel free to make/suggest those edits to my answer.
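To roll this out and confirm it took effect, a minimal usage sketch (the manifest is assumed to be saved as node-setup-daemon-set.yaml; the pod name in the last command is a placeholder):
kubectl apply -f node-setup-daemon-set.yaml
kubectl -n kube-system get pods -l name=node-setup
# inotify limits are kernel-global, so reading them from any DaemonSet pod reflects the node value
kubectl -n kube-system exec node-setup-xxxxx -- cat /proc/sys/fs/inotify/max_user_watches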
You do not want to run your container as a privileged container if you can help it.
The solution here is to set the following kernel parameters on the node and then restart your container(s). The containers will pick up these values because containers do not run separate kernels on Linux hosts; they share the kernel of the host they run on.
fs.inotify.max_user_watches=10485760
fs.aio-max-nr=10485760
fs.file-max=10485760
kernel.pid_max=10485760
kernel.threads-max=10485760
Paste the values above into /etc/sysctl.conf on the node.
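For example, on a Debian/Ubuntu node this could look like the following sketch (run on the node itself, e.g. over SSH; only the first parameter is shown):
# append the parameter to the node's sysctl configuration
echo "fs.inotify.max_user_watches=10485760" | sudo tee -a /etc/sysctl.conf
# apply the configuration without a reboot and verify
sudo sysctl -p
sysctl fs.inotify.max_user_watches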

Running Testcafe in Docker Containers in Kubernetes - 1337 Port is Already in Use - Error

I have multiple Testcafe scripts (script1.js, script2.js) that work fine. I have Dockerized this code and it works fine when I run the Docker image. Next, I want to invoke this Docker image as a CronJob in Kubernetes. Given below is my manifest.yaml file:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: application-automation-framework
  namespace: development
  labels:
    team: development
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    metadata:
      labels:
        team: development
    spec:
      ttlSecondsAfterFinished: 120
      backoffLimit: 3
      template:
        metadata:
          labels:
            team: development
        spec:
          containers:
          - name: script1-job
            image: testcafe-minikube
            imagePullPolicy: Never
            args: ["chromium:headless", "script1.js"]
          - name: script2-job
            image: testcafe-minikube
            imagePullPolicy: Never
            args: ["chromium:headless", "script2.js"]
          restartPolicy: OnFailure
As seen above, this manifest runs two containers. When I apply it to Kubernetes, the first container (script1-job) runs well, but the second container (script2-job) gives me the following error:
ERROR The specified 1337 port is already in use by another program.
If I run this with one container, it works perfectly. I also tried changing the args of the containers to the following.
args: ["chromium:headless", "script1.js", "--ports 12345,12346"]
args: ["chromium:headless", "script2.js", "--ports 1234,1235"]
Still, I get the same error saying port 1337 is already in use. (I wonder whether the --ports argument is working at all in Docker.)
This is my Dockerfile for reference.
FROM testcafe/testcafe
COPY . ./
USER root
RUN npm install
Could someone please help me with this? I want to run multiple containers as a CronJob in Kubernetes, so that each job invocation runs multiple Testcafe scripts.
Adding the containerPort configuration to your Kubernetes resource should do the trick.
For example:
spec:
  containers:
  - name: script1-job
    image: testcafe-minikube
    imagePullPolicy: Never
    args: ["chromium:headless", "script1.js", "--ports 12345,12346"]
    ports:
    - containerPort: 12346
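One thing worth checking as well: each element of a Kubernetes args array is passed to the container as a single argv token, so "--ports 12345,12346" arrives as one token (space included) and TestCafe's option parser will likely not recognize it; the flag and its value usually need to be separate elements ("--ports", "12345,12346"). The equivalent local invocation would look like this:
# the flag and its value are separate shell words; the two ports are comma-separated
testcafe chromium:headless script1.js --ports 12345,12346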

Getting: bad option; for several filesystems (e.g. nfs, cifs) when trying to mount azure file share in K8 container

I created an Azure file share and I am able to connect to it by mapping it as a network drive on my Windows 10 laptop. I created a hello-world Spring Boot application with a volume mount configuration for the Azure file share and am trying to deploy it to Kubernetes in Docker Desktop. But my pod doesn't start:
hello-world-9d7479c4d-26mv2 0/1 ContainerCreating 0 15s
Here is the error I see in the events when I describe the pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 9h Successfully assigned default/hello-world-9d7479c4d-26mv2 to docker-desktop
Warning FailedMount 9h (x7 over 9h) kubelet, docker-desktop MountVolume.SetUp failed for volume "fileshare-pv" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t cifs -o file_mode=0777,dir_mode=0777,vers=3.0,<masked> //mystorage.file.core.windows.net/myshare /var/lib/kubelet/pods/425012d1-13ee-4c40-bf40-d2f7ccfe5954/volumes/kubernetes.io~azure-file/fileshare-pv
Output: mount: /var/lib/kubelet/pods/425012d1-13ee-4c40-bf40-d2f7ccfe5954/volumes/kubernetes.io~azure-file/fileshare-pv: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
Then I updated my Dockerfile to install cifs-utils:
FROM ubuntu:16.04
# Install Java
RUN apt-get update && \
apt-get install -y openjdk-8-jdk && \
apt-get install -y ant && \
apt-get install -y cifs-utils && \
apt-get clean;
ENV PORT 8080
EXPOSE 8080
COPY target/*.jar /opt/app.jar
WORKDIR /opt
CMD ["java", "-jar", "app.jar"]
Still the error doesn't go away. I googled a lot for a solution but had no luck. Is there any limitation on using an Azure file share with a Kubernetes container in Docker Desktop on a Windows machine?
Here are my Kubernetes configurations:
secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: storage-secret
  namespace: default
type: Opaque
data:
  azurestorageaccountname: BASE64-encoded-account-name
  azurestorageaccountkey: BASE64-encoded-account-key
pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: fileshare-pv
  labels:
    usage: fileshare-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  azureFile:
    secretName: storage-secret
    shareName: myshare
    readOnly: false
pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: fileshare-pvc
  namespace: default
  # Set this annotation to NOT let Kubernetes automatically create
  # a persistent volume for this volume claim.
  annotations:
    volume.beta.kubernetes.io/storage-class: ""
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  selector:
    # To make sure we match the claim with the exact volume, match the label
    matchLabels:
      usage: fileshare-pv
deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
  namespace: default
  labels:
    app: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-world
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world-pod
        image: 'hello-world-k8:1.0'
        volumeMounts:
        - name: azure
          mountPath: /azureshare
        ports:
        - containerPort: 8080
      volumes:
      - name: azure
        persistentVolumeClaim:
          claimName: fileshare-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: hello-world-service
  namespace: default
spec:
  selector:
    app: hello-world
  ports:
  - name: http
    protocol: TCP
    port: 8080
    targetPort: 8080
  type: LoadBalancer
You likely need to install a package that knows how to mount that file system. For NFS this may be nfs-common with Debian/Ubuntu.
sudo apt update && sudo apt install nfs-common -y
This happened on my Ubuntu Server 22.04 LTS machine. Use sudo apt install nfs-common (or the equivalent nfs-utils package on RPM-based distributions) to resolve it.
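In this particular question the failing mount is CIFS rather than NFS, and the mount is performed by the kubelet on the node, not inside the application container, which is why adding cifs-utils to the application's Dockerfile did not help. A sketch for a Debian/Ubuntu node (assuming you can get a shell on the node):
# install the CIFS mount helper on the Kubernetes node, not in the application image
sudo apt-get update && sudo apt-get install -y cifs-utils
# the mount.cifs helper the error message asks for should now be present
ls /sbin/mount.cifs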

Kubernetes - "Mount Volume Failed" when trying to deploy

I deployed my first container and got the info:
deployment.apps/frontarena-ads-deployment created
but then I saw that my container creation was stuck in Waiting status.
Then I checked the events using kubectl describe pod frontarena-ads-deployment-5b475667dd-gzmlp and saw a MountVolume error, and I cannot figure out why it is thrown:
Warning  FailedMount  9m24s  kubelet  MountVolume.SetUp failed for volume "ads-filesharevolume" : mount failed: exit status 32
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume --scope -- mount -t cifs -o username=frontarenastorage,password=mypassword,file_mode=0777,dir_mode=0777,vers=3.0 //frontarenastorage.file.core.windows.net/azurecontainershare /var/lib/kubelet/pods/85aa3bfa-341a-4da1-b3de-fb1979420028/volumes/kubernetes.io~azure-file/ads-filesharevolume
Output: Running scope as unit run-rf54d5b5f84854777956ae0e25810bb94.scope.
mount error(115): Operation now in progress
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs)
Before running the deployment I created a secret, using the already created Azure file share, which I reference within the YAML:
$AKS_PERS_STORAGE_ACCOUNT_NAME="frontarenastorage"
$STORAGE_KEY="mypassword"
kubectl create secret generic fa-fileshare-secret --from-literal=azurestorageaccountname=$AKS_PERS_STORAGE_ACCOUNT_NAME --from-literal=azurestorageaccountkey=$STORAGE_KEY
In that file share I have folders and files which I need to mount, and I reference azurecontainershare in the YAML.
My YAML looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontarena-ads-deployment
  labels:
    app: frontarena-ads-deployment
spec:
  replicas: 1
  template:
    metadata:
      name: frontarena-ads-aks-test
      labels:
        app: frontarena-ads-aks-test
    spec:
      containers:
      - name: frontarena-ads-aks-test
        image: faselect-docker.dev/frontarena/ads:test1
        imagePullPolicy: Always
        ports:
        - containerPort: 9000
        volumeMounts:
        - name: ads-filesharevolume
          mountPath: /opt/front/arena/host
      volumes:
      - name: ads-filesharevolume
        azureFile:
          secretName: fa-fileshare-secret
          shareName: azurecontainershare
          readOnly: false
      imagePullSecrets:
      - name: fa-repo-secret
  selector:
    matchLabels:
      app: frontarena-ads-aks-test
The issue was caused by the AKS cluster and the Azure file share being deployed in different Azure regions. If they are in the same region, you will not have this issue.
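A quick way to verify whether the two resources really are in different regions is the Azure CLI (the resource group and cluster names below are placeholders):
# region of the AKS cluster
az aks show --resource-group <aks-rg> --name <aks-cluster> --query location -o tsv
# region of the storage account backing the file share
az storage account show --name frontarenastorage --query location -o tsv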

Applying changes to pod source code in real time - npm

I have a ReactJS app running in my pod, and I have mounted the source code from the host machine into the pod. It works fine, but when I change the code on the host machine the source code in the pod changes as well, yet the running site is not affected. Here is my manifest; what am I doing wrong?
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 1
  minReadySeconds: 15
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: webapp
        tier: frontend
        phase: development
    spec:
      containers:
      - name: webapp
        image: xxxxxx
        command:
        - npm
        args:
        - run
        - dev
        env:
        - name: environment
          value: dev
        - name: AUTHOR
          value: webapp
        ports:
        - containerPort: 3000
        volumeMounts:
        - mountPath: /code
          name: code
      imagePullSecrets:
      - name: regcred
      volumes:
      - name: code
        hostPath:
          path: /hosthome/xxxx/development/react-app/src
I know for a fact that npm is not watching my changes; how can I resolve this in pods?
Basically, you need to reload your application every time you change your code, and your pods don't reload or restart when the code under the /code directory changes. You will have to re-create your pod. Since you are using a Deployment, you can either:
kubectl delete pod <pod-where-your-app-is-running>
or
export PATCH='{"spec":{"template":{"metadata":{"annotations":{"timestamp":"'$(date)'"}}}}}'
kubectl patch deployment webapp -p "$PATCH"
Your pods should restart after that.
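On kubectl 1.15 and later, a shorter equivalent of the patch-with-timestamp trick is a rollout restart, which recreates the Deployment's pods:
kubectl rollout restart deployment/webapp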
What Rico has mentioned is correct: you need to patch or rebuild with every change. But you can avoid that by running minikube without a VM driver, which only works on Linux; by doing this you can mount a host path into the pod. Here is the command to run minikube without a VM driver. Hope this helps:
sudo minikube start --bootstrapper=localkube --vm-driver=none --apiserver-ips 127.0.0.1 --apiserver-name localhost -v=1
