How to increase storage in YugabyteDB cluster deployed in k8s? - yugabytedb

[Question posted by a user on YugabyteDB Community Slack]
Can anyone suggest how to increase storage on my kube cluster? I deployed with the official Helm chart, and when I use the --set command it says forbidden:
helm upgrade --set storage.master.size=100Gi,storage.tserver.size=100Gi yb1 yugabytedb/yugabyte --namespace yb
Error: UPGRADE FAILED: cannot patch "yb-master" with kind StatefulSet: StatefulSet.apps "yb-master" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden && cannot patch "yb-tserver" with kind StatefulSet: StatefulSet.apps "yb-tserver" is invalid: spec: Forbidden: updates to statefulset spec for fields other than 'replicas', 'template', 'updateStrategy' and 'minReadySeconds' are forbidden

As the Helm error message says, volume claim templates cannot be updated once created; that is how a Kubernetes StatefulSet works. If the StorageClass allows volume expansion, you can update the relevant PVCs directly.
Today this is a manual process, because a StatefulSet does not yet allow volume expansion through its own manifest.
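Before touching anything, you can check whether your StorageClass permits expansion at all; the class name below is a placeholder for whichever class your PVCs use:
kubectl get storageclass <storage-class-name> -o jsonpath='{.allowVolumeExpansion}{"\n"}'
If this does not print true, expanding the PVCs in place will not work.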
1. Get the current StatefulSet manifest: kubectl get sts/<sts_name> -o yaml > out.yaml
2. Delete the StatefulSet without deleting its pods: kubectl delete sts/<sts_name> --cascade=orphan
3. Update the PVCs with the required volume size.
4. Recreate the StatefulSet, with the updated volume size, from the output of step 1.
The recreated StatefulSet will take over the existing pods (through its selectors) and pick up the new storage spec without recreating them.
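Put together, the procedure looks roughly like this; the StatefulSet and PVC names are placeholders, the namespace and size come from the question, and you would repeat it for both yb-master and yb-tserver:
# 1. Save the current StatefulSet manifest
kubectl get sts/<sts_name> -n yb -o yaml > out.yaml
# 2. Delete the StatefulSet but leave its pods running
kubectl delete sts/<sts_name> -n yb --cascade=orphan
# 3. Grow each PVC (repeat for every pod ordinal); requires allowVolumeExpansion on the StorageClass
kubectl patch pvc <pvc_name> -n yb -p '{"spec":{"resources":{"requests":{"storage":"100Gi"}}}}'
# 4. Recreate the StatefulSet from step 1 with the new size in its volumeClaimTemplates
#    (strip status, resourceVersion and uid from out.yaml before applying)
kubectl apply -f out.yaml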
You can track the status of the feature request in the Kubernetes repo.

Related

Ingress in AKS does not have any active Endpoint

I have an AKS cluster. I install ingress with the command
helm upgrade --install --create-namespace ingress-nginx ingress-nginx/ingress-nginx --set controller.nodeSelector."beta\.kubernetes\.io/os"=linux --set defaultBackend.nodeSelector."beta\.kubernetes\.io/os"=linux --set controller.replicaCount=2 --set controller.service.loadBalancerIP=$IngressIP --namespace nginx-ingress --atomic
On a schedule I create a cluster, run tests, and delete it. I deploy the application using Helm charts. But since yesterday it stopped working, although it had worked without interruptions for the past half year. For some reason I get errors in the nginx logs:
Service "test-apis/test-load-api" does not have any active Endpoint.
All labels are present, and I can't understand what changed a day ago in ingress or AKS that stopped working. Could you please help me? Thank you.
The error might be due to several reasons. You can try the solutions below.
In one case, the application definition used name as the selector label while the Service used app; updating the Service to use app fixed it.
Another situation where this can happen is when the ingress class of the ingress controller does not match the ingress class in the Ingress resource manifest used for your services.
In our case, this was caused by having the Ingress resource definition in a different namespace than the services.
You can refer to this Stack Overflow thread to troubleshoot your issue.
This might also be a bug in a newer version of nginx-ingress-controller. You can go through the troubleshooting steps given in the GitHub discussion; if that still doesn't help, please report a bug on GitHub.
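If you want to check the selector/label and ingress-class causes from the command line, something like the following can help; the service name and namespace are taken from the error message above, and the exact labels will of course vary:
# Does the Service have any endpoints at all?
kubectl get endpoints test-load-api -n test-apis
# Compare the Service selector...
kubectl get service test-load-api -n test-apis -o jsonpath='{.spec.selector}{"\n"}'
# ...with the labels actually present on the pods
kubectl get pods -n test-apis --show-labels
# And confirm the ingress class on each Ingress matches the controller's class
kubectl get ingress -n test-apis -o jsonpath='{range .items[*]}{.metadata.name}{" -> "}{.spec.ingressClassName}{"\n"}{end}'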

AKS PersistentVolume Affinity?

Disclaimer: This question is very specific about the used platforms and the UseCase we are trying to solve with it. Also it compares two approaches we currently use at least in a development stage and are trying to compare, but perhaps don't fully understand yet. I am asking for guidance on this very specific topic...
A) We are running a Kafka cluster as Kafka Tasks on DC/OS, where persistence of data is maintained via local Disk Storage which is provisioned on the very same host as the according kafka broker instance.
B) We are trying to run Kafka on Kubernetes (via Strimzi Operator), specifically Azure Kubernetes Service (AKS) and are struggling to get reliable Data Persistence using the StorageClasses you get in AKS. We tried three possibilities:
(Default) Azure Disk
Azure File
emptyDir
I see two major issues with Azure Disk: while we are able to set the Kafka pod affinity so that brokers do not end up in the same maintenance zone / on the same host, we have no instrument to bind the corresponding PersistentVolume anywhere near the pod. There is nothing like NodeAffinity for Azure Disks. It is also fairly common that an Azure Disk ends up on another host than its corresponding pod, which might then be limited by network bandwidth.
With Azure File we don't have issues with maintenance zones going down temporarily, but as a high-latency storage option it doesn't seem to be a good fit, and Kafka also has trouble deleting / updating files on retention.
So I ended up using ephemeral storage, which is commonly NOT recommended but doesn't come with the problems above. The volume "lives" near the pod and is available to it as long as the pod itself runs on any node. In the maintenance case, pod AND volume die together. As long as I am able to maintain a quorum, I don't see where this might cause issues.
Is there anything like podAffinity for PersistentVolumes as Azure-Disk is per definition Node bound?
What are the major downsides in using emptyDir for persistence in a Kafka Cluster on Kubernetes?
Is there anything like podAffinity for PersistentVolumes as Azure-Disk is per definition Node bound?
As far as I know, there is nothing like podAffinity for PersistentVolumes backed by Azure Disk. The Azure Disk has to be attached to a node, so if the pod changes its host node, the pod can't use the volume on that disk anymore. Only an Azure file share can follow the pod across nodes.
What are the major downsides in using emptyDir for persistence in a Kafka Cluster on Kubernetes?
You can take a look at what emptyDir is intended for: scratch space, such as for a disk-based merge sort.
Disk space is the main thing you need to watch out for when you use it on AKS: you need to calculate the required disk space, and perhaps attach multiple Azure Disks to the nodes.
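For illustration, a minimal sketch of an emptyDir volume with an explicit size cap; the pod name and image are placeholders and not part of the Strimzi setup:
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 10Gi   # exceeding this evicts the pod instead of filling up the node's disk
EOF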
Starting off - I'm not sure what you mean about an Azure Disk ending up on a node other than where the pod is assigned - that shouldn't be possible, per my understanding (for completeness, you can do this on a VM with the shared disks feature outside of AKS, but as far as I'm aware that's not supported in AKS for dynamic disks at the time of writing). If you're looking at the volume.kubernetes.io/selected-node annotation on the PVC, I don't believe that's updated after initial creation.
You can reach the configuration you're looking for by using a statefulset with anti-affinity. Consider a statefulset that creates three pods, which must be in different availability zones (a sketch of the manifest is shown after the node listing below). I'm deploying this to an AKS cluster with a nodepool (nodepool2) with two nodes per AZ:
❯ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{","}{.metadata.labels.topology\.kubernetes\.io\/zone}{"\n"}{end}'
aks-nodepool1-25997496-vmss000000,0
aks-nodepool2-25997496-vmss000000,westus2-1
aks-nodepool2-25997496-vmss000001,westus2-2
aks-nodepool2-25997496-vmss000002,westus2-3
aks-nodepool2-25997496-vmss000003,westus2-1
aks-nodepool2-25997496-vmss000004,westus2-2
aks-nodepool2-25997496-vmss000005,westus2-3
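The original manifest was linked rather than shown, so here is a minimal sketch of what such a StatefulSet could look like; the echo name, demo volume claim template, managed-premium class and 1Gi size match the listings in this answer, while the image is just a placeholder:
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: echo
  namespace: zonetest
spec:
  serviceName: echo
  replicas: 3
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: echo
            topologyKey: topology.kubernetes.io/zone   # force one pod per availability zone
      containers:
      - name: echo
        image: nginx   # placeholder image
        volumeMounts:
        - name: demo
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: demo
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: managed-premium
      resources:
        requests:
          storage: 1Gi
EOF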
Once the statefulset is deployed and spun up, you can see each pod was assigned to one of the nodepool2 nodes:
❯ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
echo-0 1/1 Running 0 3m42s 10.48.36.102 aks-nodepool2-25997496-vmss000001 <none> <none>
echo-1 1/1 Running 0 3m19s 10.48.36.135 aks-nodepool2-25997496-vmss000002 <none> <none>
echo-2 1/1 Running 0 2m55s 10.48.36.72 aks-nodepool2-25997496-vmss000000 <none> <none>
Each pod created a PVC based on the template:
❯ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
demo-echo-0 Bound pvc-bf6104e0-c05e-43d4-9ec5-fae425998f9d 1Gi RWO managed-premium 25m
demo-echo-1 Bound pvc-9d9fbd5f-617a-4582-abc3-ca34b1b178e4 1Gi RWO managed-premium 25m
demo-echo-2 Bound pvc-d914a745-688f-493b-9b82-21598d4335ca 1Gi RWO managed-premium 24m
Let's take a look at one of the PVs that was created:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
    pv.kubernetes.io/provisioned-by: kubernetes.io/azure-disk
    volumehelper.VolumeDynamicallyCreatedByKey: azure-disk-dynamic-provisioner
  creationTimestamp: "2021-04-05T14:08:12Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    failure-domain.beta.kubernetes.io/region: westus2
    failure-domain.beta.kubernetes.io/zone: westus2-3
  name: pvc-9d9fbd5f-617a-4582-abc3-ca34b1b178e4
  resourceVersion: "19275047"
  uid: 945ad69a-92cc-4d8d-96f4-bdf0b80f9965
spec:
  accessModes:
  - ReadWriteOnce
  azureDisk:
    cachingMode: ReadOnly
    diskName: kubernetes-dynamic-pvc-9d9fbd5f-617a-4582-abc3-ca34b1b178e4
    diskURI: /subscriptions/02a062c5-366a-4984-9788-d9241055dda2/resourceGroups/rg-sandbox-aks-mc-sandbox0-westus2/providers/Microsoft.Compute/disks/kubernetes-dynamic-pvc-9d9fbd5f-617a-4582-abc3-ca34b1b178e4
    fsType: ""
    kind: Managed
    readOnly: false
  capacity:
    storage: 1Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: demo-echo-1
    namespace: zonetest
    resourceVersion: "19275017"
    uid: 9d9fbd5f-617a-4582-abc3-ca34b1b178e4
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: failure-domain.beta.kubernetes.io/region
          operator: In
          values:
          - westus2
        - key: failure-domain.beta.kubernetes.io/zone
          operator: In
          values:
          - westus2-3
  persistentVolumeReclaimPolicy: Delete
  storageClassName: managed-premium
  volumeMode: Filesystem
status:
  phase: Bound
As you can see, that PV has a required nodeAffinity for nodes in failure-domain.beta.kubernetes.io/zone with value westus2-3. This ensures that the pod that uses that PV will only ever get placed on a node in westus2-3, and the disk will be attached to whichever of those nodes the pod is running on when it starts.
At this point, I deleted all the pods to get them on the other nodes:
❯ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
echo-0 1/1 Running 0 4m4s 10.48.36.168 aks-nodepool2-25997496-vmss000004 <none> <none>
echo-1 1/1 Running 0 3m30s 10.48.36.202 aks-nodepool2-25997496-vmss000005 <none> <none>
echo-2 1/1 Running 0 2m56s 10.48.36.42 aks-nodepool2-25997496-vmss000003 <none> <none>
There's no way to see it via Kubernetes, but you can see via the Azure portal that managed disk kubernetes-dynamic-pvc-bf6104e0-c05e-43d4-9ec5-fae425998f9d, which backs PV pvc-bf6104e0-c05e-43d4-9ec5-fae425998f9d, which backs PVC zonetest/demo-echo-0, is listed as Managed by: aks-nodepool2-25997496-vmss_4, so it has been detached and re-attached to the node where the pod is now running.
(Portal screenshot showing the disk attached to node 4.)
If I were to remove nodes such that I didn't have nodes in AZ 3, I wouldn't be able to start pod echo-1, since it's bound to a disk in AZ 3, which can't be attached to a node not in AZ 3.
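If you want to confirm which zone a given PV is pinned to, you can read the affinity straight off the object; the PV name below is the one from the example above:
kubectl get pv pvc-9d9fbd5f-617a-4582-abc3-ca34b1b178e4 -o jsonpath='{.spec.nodeAffinity}{"\n"}'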

`kubectl delete service` gets stuck in 'Terminating' state

I'm trying to delete a service I wrote & deployed to Azure Kubernetes Service (along with required Dask components that accompany it), and when I run kubectl delete -f my_manifest.yml, my service gets stuck in the Terminating state. The console tells me that it was deleted, but the command hangs:
> kubectl delete -f my-manifest.yaml
service "dask-scheduler" deleted
deployment.apps "dask-scheduler" deleted
deployment.apps "dask-worker" deleted
service "my-service" deleted
deployment.apps "my-deployment" deleted
I have to Ctrl+C this command. When I check my services, Dask has been successfully deleted, but my custom service hasn't. If I try to manually delete it, it similarly hangs/fails:
> kubectl get services
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP x.x.x.x <none> 443/TCP 18h
my-service LoadBalancer x.x.x.x x.x.x.x 80:30786/TCP,443:31934/TCP 18h
> kubectl delete service my-service
service "my-service" deleted
This question says to delete the pods first, but all my pods are deleted (kubectl get pods returns nothing). There's also this closed K8s issue that says --wait=false might fix foreground cascade deletion, but this doesn't work and doesn't seem to be the issue here anyway (as the pods themselves have already been deleted).
I assume that I can completely wipe out my AKS cluster and re-create, but that's an option of last resort here. I don't know whether it's relevant, but my service is using the azure-load-balancer-internal: "true" annotation for the service, and I have a webapp deployed to my VNet that uses this service.
Is there any other way to force shutdown this service?
Thanks to 4c74356b41's suggestion of looking at kubectl describe service my-service (which I hadn't considered for some reason), I saw this warning:
Code="LinkedAuthorizationFailed" Message="The client 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' with object id 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' has permission to perform action 'Microsoft.Network/loadBalancers/write' on scope '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/loadBalancers/kubernetes-internal'; however, it does not have permission to perform action 'Microsoft.Network/virtualNetworks/subnets/join/action' on the linked scope(s) '/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.Network/virtualNetworks/<vnet>/subnets/<subnet>' or the linked scope(s) are invalid.
(The client and object id GUIDs are the same value.)
This indicated that it wasn't exactly a Kubernetes issue, but rather permissions within the Azure ecosystem. I looked through the portal and didn't find that GUID in any of my users, groups, or apps, so I'm not sure what it's referring to. However, I granted the Owner role to this client id, and after a few minutes the service was deleted.
az role assignment create `
--role Owner `
--assignee xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
I had a similar issue with a Service not connecting to the pod because the pod was already deleted:
HTTPConnectionPool(host='scv-name-not-shown-because-prod.namespace-prod', port=7999): Max retries exceeded with url:
my-url-not-shown-because-prod (Caused by
NewConnectionError('<urllib3.connection.HTTPConnection object at
0x7faee4b112b0>: Failed to establish a new connection: [Errno 110] Connection timed out'))
I was able to solve this with the patch command:
kubectl patch service scv-name-not-shown-because-prod -n namespace-prod -p '{"metadata":{"finalizers":null}}'
I think the service went into some illegal state and was not able to recover.
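Before clearing finalizers like this, it can be worth confirming that a finalizer is actually what is blocking deletion; the service name and namespace below are placeholders:
kubectl get service <service-name> -n <namespace> -o jsonpath='{.metadata.finalizers}{"\n"}'
If this prints anything, the patch above removes it and lets the delete complete; if it prints nothing, the hang is likely elsewhere (for example the cloud-provider permissions issue described in the accepted answer).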

Azure AKS 'Kube-Proxy' Kubernetes Node Log file location?

My question is 'probably' specific to Azure.
How can I review the Kube-Proxy logs?
After SSH'ing into an Azure AKS Node (done) I can use the following to view the Kubelet logs:
journalctl -u kubelet -o cat
Azure docs on the Azure Kubelet logs can be found here:
https://learn.microsoft.com/en-us/azure/aks/kubelet-logs
I have reviewed the following Kubernetes resource regarding logs but Kube-Proxy logs on Azure do not appear in any of the suggested locations on the AKS node:
https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#looking-at-logs
This is part of a troubleshooting effort related to a Kubernetes nginx Ingress temporarily returning a '504 Gateway Time-out' when a service has not been accessed / has gone idle for some period of time (perhaps 5 to 10 minutes), but then becoming accessible on the next attempt(s).
On AKS, kube-proxy runs as a DaemonSet in the kube-system namespace.
You can list the kube-proxy pods + node information with:
kubectl get pods -l component=kube-proxy -n kube-system -o wide
And then you can review the logs by running:
kubectl logs kube-proxy-<suffix> -n kube-system
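If you would rather see the logs from every kube-proxy pod at once instead of one pod at a time, a label selector also works; the tail length here is arbitrary:
# --prefix labels each line with the pod it came from;
# add --max-log-requests if the cluster has more than five nodes
kubectl logs -l component=kube-proxy -n kube-system --tail=100 --prefix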
On the same note as Acanthamoeba's answer, the logs for the Kube-Proxy pod can also be accessed via the Kubernetes dashboard UI, which can be launched via:
az aks browse --resource-group <ClusterResourceGroup> --name <ClusterName>
The above should pop open a new browser window pointed at the following URL: http://127.0.0.1:8001/#!/overview?namespace=default
Switch to Kube-System Namespace
Once the browser window is open, change to the Kube-System namespace, by selecting that option from the drop down on the left side:
Kube-System namespace is all the way at the bottom of the drop down... and probably requires scrolling.
Navigate to Pods
From there click "pods" (also on the left hand side menu, below the namespaces drop down) and then click the Kube-Proxy pod:
View Kube-Proxy Logs
Click to view the logs of your Azure AKS based Kube-Proxy pod; the logs button is in the top right-hand menu, to the left of 'Delete' and 'Edit', just below 'Create':
Other Azure AKS Trouble Shooting Resources
Since you are trying to view the Kube-Proxy logs, you are probably troubleshooting some networking issues or something along those lines. Here are some other resources that I used during my troubleshooting tour of my Azure AKS cluster:
View Kubelet Logs on Azure AKS: https://learn.microsoft.com/en-us/azure/aks/kubelet-logs
nGinx Ingress Troubleshooting: https://github.com/kubernetes/ingress-nginx/blob/master/docs/troubleshooting.md
SSH into an Azure AKS Cluster VM: https://learn.microsoft.com/en-us/azure/aks/aks-ssh

How does PersistentVolume work with hostPath?

I have deployed Gitlab to my Azure Kubernetes cluster with persistent storage defined the following way:
kind: PersistentVolume
apiVersion: v1
metadata:
  name: gitlab-data
  namespace: gitlab
spec:
  capacity:
    storage: 8Gi
  accessModes:
  - ReadWriteMany
  hostPath:
    path: "/tmp/gitlab-data"
That worked fine for some days. Suddenly all my data stored in Gitlab is gone and I don't know why. I was assuming that the hostPath-defined PersistentVolume is really persistent, because it is saved on a node and somehow replicated to all existing nodes. But my data is now lost and I cannot figure out why. I looked up the uptime of each node and there was no restart. I logged in to the nodes and checked the path, and as far as I can see the data is gone.
So how do PersistentVolume Mounts work in Kubernetes? Are the data saved really persistent on the nodes? How do multiple nodes share the data, if a deployment is split to multiple nodes? Is hostPath reliable persistent storage?
hostPath doesn't share or replicate data between nodes, and once your pod starts on another node, the data will be lost. You should consider using some external shared storage instead.
Here's the related quote from the official docs:
HostPath (single node testing only – local storage is not supported in any way and WILL NOT WORK in a multi-node cluster)
from kubernetes.io/docs/user-guide/persistent-volumes/
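As a sketch of that "external shared storage" route on AKS, a dynamically provisioned PVC using the built-in azurefile StorageClass gives a ReadWriteMany volume that survives pods moving between nodes; the claim name reuses the one from the question and is otherwise an assumption:
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gitlab-data
  namespace: gitlab
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: azurefile   # built-in AKS class backed by Azure Files
  resources:
    requests:
      storage: 8Gi
EOF
The Gitlab deployment would then reference this claim via persistentVolumeClaim instead of hostPath.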
