AKS using Kubernetes: not able to connect to cluster nodes after logging in to the cluster through azure-cli on Ubuntu

I am running into issues when trying to get information about the nodes created with AKS (Azure Kubernetes Service), after creating the cluster and fetching the credentials.
I am using the azure-cli on an Ubuntu Linux machine.
I followed this URL for cluster creation: https://learn.microsoft.com/en-us/azure/aks/kubernetes-walkthrough
I get the following error when using the command kubectl get nodes
after connecting to the cluster with
az aks get-credentials --resource-group <resource_group_name> --name <cluster_name>
Error:
kubectl get nodes
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get nodes)
I get the same error when I use:
kubectl get pods -n kube-system -o=wide
When I connect back as another user with the following commands, i.e.,
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
I am able to retrieve the nodes, i.e.,
kubectl get nodes
NAME          STATUS    ROLES     AGE       VERSION
<host-name>   Ready     master    20m       v1.10.0
~$ kubectl get pods -n kube-system -o=wide
NAME                                     READY     STATUS    RESTARTS   AGE
etcd-actaz-prod-nb1                      1/1       Running   0
kube-apiserver-actaz-prod-nb1            1/1       Running   0
kube-controller-manager-actaz-prod-nb1   1/1       Running   0
kube-dns-86f4d74b45-4qshc                3/3       Running   0
kube-flannel-ds-bld76                    1/1       Running   0
kube-proxy-5s65r                         1/1       Running   0
kube-scheduler-actaz-prod-nb1            1/1       Running   0
But this actually overwrites the newly created cluster information in the file $HOME/.kube/config.
Am I missing something when connecting to the AKS cluster with the get-credentials command that is leading to the error
*Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get nodes)*

After you run
az aks get-credentials -n cluster-name -g resource-group
it should have merged the cluster credentials into your local configuration:
/home/user-name/.kube/config
Check your config with
kubectl config view
and verify that it is pointing to the right cluster.
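For example, a quick way to check and, if needed, switch the active context (the context name <cluster_name> is a placeholder, not a value from the question):
# Show which context kubectl is currently using
kubectl config current-context
# List every context merged into ~/.kube/config
kubectl config get-contexts
# Switch to the AKS cluster's context if another one is active
kubectl config use-context <cluster_name>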

Assuming you chose the default configuration while deploying AKS, you need to create an SSH key pair to log in to an AKS node.
Push the public key created above to the AKS node using "az vm user update" (see the command's help for the switches you need to pass; it is quite simple).
To create an SSH connection to an AKS node, you run a helper pod in your AKS cluster. This helper pod provides you with SSH access into the cluster and then additional SSH node access.
To create and use this helper pod, complete the following steps:
- Run a Debian (or any other, like centos7, etc.) container image and attach a terminal session to it. This container can be used to create an SSH session with any node in the AKS cluster:
kubectl run -it --rm aks-ssh --image=debian
The base Debian image doesn't include SSH components, so install the OpenSSH client:
apt-get update && apt-get install openssh-client -y
Copy the private key (the one you created at the beginning) to the pod using kubectl cp; the kubectl toolkit must be present on the machine from which you created the SSH key pair.
kubectl cp :/
Now you will see the private key file in the container; change its permissions to 600 and you will be able to SSH to your AKS node (a consolidated sketch follows).
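Putting the steps together, a minimal sketch of the whole flow; the key path ~/.ssh/id_rsa, the user azureuser and the node IP 10.240.0.4 are placeholder assumptions, not values from the question:
# 1. Start a throwaway Debian pod and attach a terminal to it
kubectl run -it --rm aks-ssh --image=debian
# 2. Inside the pod: install the SSH client
apt-get update && apt-get install openssh-client -y
# 3. From your workstation (second terminal): copy the private key into the pod
kubectl cp ~/.ssh/id_rsa aks-ssh:/id_rsa
# 4. Inside the pod: tighten permissions and SSH to a node
chmod 600 /id_rsa
ssh -i /id_rsa azureuser@10.240.0.4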
Hope this helps.

Related

Kubernetes many restarts but pod keeps running

I'm seeing a lot of restarts on all the pods of every service that I have deployed on Kubernetes.
But when I watch the logs in real time:
kubectl -n my-namespace logs -c my-pod -f my-pod-some-hash --tail=50
I see nothing: there are no restarts and no sign of failure, and readiness probes keep working. So what do all those restarts mean? Where or how can I get more information about them?
Edit:
By viewing the details of the pod that shows 158 restarts in the picture above, I can see this, but I don't know what it means or whether it's related to the restarts:
Reproduction via one sample pod with CLI commands:
If any pod restarts, use "--previous" to check the logs of its previous run.
Step 1:
Connect to the cluster using the command below:
az aks get-credentials --resource-group <resourcegroupname> --name <Clustername>
Step 2:
Check the pods and their restart counts:
kubectl get pods
Step 3:
Check the logs of the restarted pod's previous run:
kubectl logs <PodName> --previous
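Beyond the previous logs, it also helps to look at why the container restarted via the pod's last state and recent cluster events; a small sketch with <PodName> as a placeholder:
# Last state of each container (reason and exit code) plus restart count
kubectl describe pod <PodName>
# Recent events, which often record OOMKilled or liveness-probe failures
kubectl get events --sort-by=.metadata.creationTimestamp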

Could not get apiVersions from Kubernetes: Unable to retrieve the complete list of server APIs

While trying to deploy an application, I got the error below:
Error: UPGRADE FAILED: could not get apiVersions from Kubernetes: unable to retrieve the complete list of server APIs: metrics.k8s.io/v1beta1: the server is currently unable to handle the request
The output of kubectl api-resources lists some resources, along with the same error at the end.
Environment: Azure Cloud, AKS Service
Solution:
The steps I followed are:
kubectl get apiservices: if the metrics-server service is down with the error CrashLoopBackOff, follow step 2; otherwise just try to restart the metrics-server service using kubectl delete apiservice/"service_name". For me it was v1beta1.metrics.k8s.io.
kubectl get pods -n kube-system: I found that pods like metrics-server and kubernetes-dashboard were down because the main CoreDNS pod was down.
For me it was:
NAME                          READY     STATUS             RESTARTS   AGE
pod/coredns-85577b65b-zj2x2   0/1       CrashLoopBackOff   7          13m
Use kubectl describe pod/"pod_name" to check the error in the CoreDNS pod. If it is down because of /etc/coredns/Corefile:10 - Error during parsing: Unknown directive proxy, then we need to use forward instead of proxy in the YAML where the CoreDNS config is, because CoreDNS version 1.5.x, used by the image, no longer supports the proxy keyword (see the sketch below).
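As an illustration of that fix, the Corefile can be edited through the CoreDNS ConfigMap; the configmap name coredns, the pod label k8s-app=kube-dns and the 8.8.8.8 upstream are common defaults and should be treated as assumptions:
# Open the CoreDNS config for editing and replace the deprecated directive,
# e.g. "proxy . 8.8.8.8" becomes "forward . 8.8.8.8"
kubectl -n kube-system edit configmap coredns
# Recreate the CoreDNS pods so they pick up the new Corefile
kubectl -n kube-system delete pod -l k8s-app=kube-dns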
This error commonly happens when your metrics-server pod is not reachable by the master node. Possible reasons are:
The metrics-server pod is not running. This is the first thing you should check. Then look at the logs of the metrics-server pod to check whether it has permission issues when trying to get metrics.
Try to confirm communication between the master and worker nodes.
Try running kubectl top nodes and kubectl top pods -A to see if metrics-server is running ok.
From these points you can proceed further.
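For example, a quick pass over those checks (the k8s-app=metrics-server label is the usual one in the upstream manifests, so treat it as an assumption):
# Is the metrics API registered and marked Available?
kubectl get apiservice v1beta1.metrics.k8s.io
# Is the metrics-server pod running, and what do its logs say?
kubectl -n kube-system get pods -l k8s-app=metrics-server
kubectl -n kube-system logs -l k8s-app=metrics-server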

exec user process caused "exec format error" during setup

I'm trying to install haproxy-ingress under Kubernetes 1.18 (hosted on a Raspberry Pi).
The master node has been correctly labeled with role=ingress-controller.
The kubectl create also works fine:
# kubectl create -f https://haproxy-ingress.github.io/resources/haproxy-ingress.yaml
namespace/ingress-controller created
serviceaccount/ingress-controller created
clusterrole.rbac.authorization.k8s.io/ingress-controller created
role.rbac.authorization.k8s.io/ingress-controller created
clusterrolebinding.rbac.authorization.k8s.io/ingress-controller created
rolebinding.rbac.authorization.k8s.io/ingress-controller created
configmap/haproxy-ingress created
daemonset.apps/haproxy-ingress created
But then the pod is in a crash loop:
# kubectl get pods -n ingress-controller -o wide
NAME                    READY   STATUS             RESTARTS   AGE   IP              NODE              NOMINATED NODE   READINESS GATES
haproxy-ingress-dpcvc   0/1     CrashLoopBackOff   1          30s   192.168.1.101   purple.cloudlet   <none>           <none>
And the logs show this error:
# kubectl logs haproxy-ingress-dpcvc -n ingress-controller
standard_init_linux.go:211: exec user process caused "exec format error"
Has anyone experienced something similar? Can this be related to the ARM (32-bit) architecture of the Raspbian image that I'm using?
Raspberry Pis run ARM architectures, which unfortunately are not supported by haproxy-ingress.
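To confirm the mismatch yourself, you can compare the architecture each node reports with the platforms the image was built for; a sketch, where <haproxy-ingress-image> stands for whatever image the DaemonSet references:
# Architecture reported by every node (e.g. arm, arm64, amd64)
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.architecture}{"\n"}{end}'
# Platforms the image manifest was published for
docker manifest inspect <haproxy-ingress-image> | grep architecture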

How to use Cloud Shell to SSH into an AKS cluster and test the connection from inside AKS

Our company blocks the SSH port. How can we use Cloud Shell to SSH into an AKS cluster, so that we can curl an external URL from there to test the connection? Thanks.
This wouldn't really make a lot of sense, but you'd need to open up your SSH ports to the Azure region your Cloud Shell is in (determined by your storage, I suppose).
But a better way would be to just do:
kubectl exec -it -n <pod_namespace> <podname> -- /bin/bash (or /bin/sh)
This opens a bash session in a pod on your AKS cluster, and from there you can test your curl requests.
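If there is no suitable pod to exec into, a throwaway pod works just as well; a sketch, where the curlimages/curl image and the target URL are only illustrative assumptions:
# Run a temporary pod, curl the external URL from inside the cluster, then clean it up
kubectl run -it --rm curl-test --image=curlimages/curl --restart=Never -- \
  curl -v https://www.example.com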
For your requirements, you can use a pod in the AKS cluster as a jump box, and then SSH to the AKS cluster nodes from inside that pod.
Steps here:
Get the nodes' IPs:
kubectl get nodes -o wide
Create a pod in the AKS cluster and create a bash session with the pod:
kubectl run --generator=run-pod/v1 -it --rm aks-ssh --image=debian
Install the SSH client inside the pod:
apt-get update && apt-get install openssh-client -y
Copy the SSH key that you used when you created the AKS cluster to the pod:
kubectl cp ~/.ssh/id_rsa $(kubectl get pod -l run=aks-ssh -o jsonpath='{.items[0].metadata.name}'):/id_rsa
Or use the password; if you have forgotten it, you can find the AKS nodes and reset the password.
Choose one node and SSH to it:
ssh -i id_rsa azureuser@node_Ip
For more details, see Create the SSH connection to the AKS cluster nodes.

Docker Registry Stays Pending After Deployment

I have installed OpenShift Enterprise as per the online guide (quick installation) but I'm stuck at deploying the registry.
https://docs.openshift.com/enterprise/3.0/admin_guide/install/docker_registry.html#deploy-registry
I create the registry
oadm registry --config=/etc/openshift/master/admin.kubeconfig \
--credentials=/etc/openshift/master/openshift-registry.kubeconfig \
--images='registry.access.redhat.com/openshift3/ose-${component}:${version}'
I check that it was configured
[justin@172 ~]$ oc get se docker-registry
NAME              LABELS                    SELECTOR                  IP(S)            PORT(S)
docker-registry   docker-registry=default   docker-registry=default   172.30.144.220   5000/TCP
But it never runs; it stays pending:
[justin@172 ~]$ oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-deploy   0/1       Pending   0          2h
I try to get some more info:
[justin@172 ~]$ oc logs docker-registry-1-deploy
[justin@172 ~]$
But the logs command returns nothing.
I had attempted an install with one node sharing the machine with the master.
My nodes looked like this:
[root@master ~]# oc get nodes
NAME                  LABELS                                        STATUS
master.mydomain.com   kubernetes.io/hostname=master.mydomain.com   Ready,SchedulingDisabled
Note: SchedulingDisabled
I ran this command:
oc describe pod docker-registry-1-deploy
It gave the reason the pod was not being deployed: there were no nodes to schedule a deployment on. Just to get things going quickly, I performed the install again and added a node on another VM.
Then
[root@master ~]# oc get nodes
NAME                  LABELS                                        STATUS
master.mydomain.com   kubernetes.io/hostname=master.mydomain.com   Ready,SchedulingDisabled
node1.mydomain.com    kubernetes.io/hostname=node1.mydomain.com    Ready
and I managed to successfully deploy the registry.
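If adding a second VM is not an option, another route is to check the scheduler's reason in the pod's events and mark the master schedulable; the oadm manage-node form below is the OpenShift 3.x syntax and should be treated as an assumption against your exact version:
# See why the scheduler left the pod Pending (events at the bottom)
oc describe pod docker-registry-1-deploy
# Allow pods to be scheduled on the master node (OpenShift 3.x)
oadm manage-node master.mydomain.com --schedulable=true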
